Saturday, November 5, 2011

Affect and concepts for multimedia retrieval (Halloween III)

This Halloween I just kept on noticing what I am calling "affect pumpkins". These are jack-o-lantern faces labeled with emotion words. Jack-o-lanterns, and decorations that depict them (such as the ones in this image), are typical of Halloween celebrations.

I don't remember having my jack-o-lanterns labeled with adjectives when I was a child, so I am rather curious about this phenomenon and have been observing it a bit. Apparently, the activity of giving jack-o-lanterns emotion words is quite fun and is, all in all, a harmonious process, characterized by a lack of disagreement or other interpersonal strife. If you have a happy jack-o-lantern, there appears to be a high degree of consensus about the applicability of the label 'happy'.

I contrast this smooth and fun pumpkin labeling procedure with the disagreement in the multimedia community, which has apparently developed into full-fledged distaste for what are referred to as "subjective user tags": tags that express feelings or personal perspectives. Such tags have been referred to as "imprecise and meaningless" in Liu et al. 2009, published at WWW (page 351), and my impression is that many, many researchers agree with this point of view. In the authors' defense, had they used what I feel is the more appropriate formulation of "imprecise and meaningless with respect to a certain subset of multimedia retrieval tasks", the community would still probably be on a rampage against personal and affective tags.

Sometimes it seems everyone has simply made this spontaneous decision to take up arms against the insight of Rosalind Picard, who in 1995 wrote, "Although affective annotations, like content annotations, will not be universal, they will still help reduce time searching for the 'right scene.' Both types of annotation are potentially powerful; we should be exploring them in digital audio and visual libraries." (from "TR 321" p. 11). Do we have a huge case of sour grapes? Have we decided that we have irreversibly failed over the past 15+ years to exploit affective image labels and are therefore now deciding that we should never have considered them potentially interesting in the first place?

Oh, I hope not. Just look at this wall and think about all the walls like this, all the jack-o-lantern pictures that were created this Halloween and posted to the Internet. There are too many pictures of Halloween pumpkins out there for us to afford to overlook the chance to organize them by affect. Of course, some people might hold that this silly pumpkin should actually also be considered a happy pumpkin: We can anticipate some disagreement. However, it is important to keep two points in mind: (1) Labels that are ostensibly 'objective' and have nothing to do with affect are also subject to lack of consensus on their applicability, e.g., the ambiguity about whether a depicted object is a 'pumpkin' or a 'jack-o-lantern' discussed in my previous post. (2) Even if we do not agree on the exact affective label, we do have intuitions about the fact that we disagree and about which other interpretations are possible. For example, someone who insists on 'silly' will also admit that someone else might consider this pumpkin 'happy', but would think it much less likely that anyone would find 'sad' to be the most appropriate label.

Interestingly, in my observations, I have seen that the emotion words used to describe a jack-o-lantern seem to be chosen from one of two perspectives. Depicted in the image above are "pumpkin perspective" emotion words ('happy', 'silly', 'sad' and 'mad'), which designate the emotion that the jack-o-lantern is experiencing and that explains its expression. In the picture book page in the image from my previous post there is a mixture of this "pumpkin perspective" with a "people perspective". The book reads, "We'll make our jack-o-lanterns--it might be messy, but it's fun!" and then asks "Will yours be scary?" A jack-o-lantern is scary if it causes fear in the people looking at it, which makes 'scary' a "people perspective" word. The book then goes on to ask "Happy? Sad?", which are "pumpkin perspective" words, and finally "A sweet or silly one?". Other perspectives are also possible: the affect label could reflect what the carver of the jack-o-lantern intended to achieve by making the pumpkin.

In my own work, I tend to insist on the importance of distinguishing these different perspectives, with the idea that if the underlying model of affect is complete and sound, it will provide a more stable foundation for building a system of annotation. However, in practical use, affect labels don't need to distinguish the experiencer, and we don't need to understand the principle of empathic sympathy in order to use them: we simply know a happy pumpkin when we see one, and that of course makes us a little happy ourselves.

Dong Liu, Xian-Sheng Hua, Linjun Yang, Meng Wang, and Hong-Jiang Zhang. 2009. Tag ranking. In Proceedings of the 18th International Conference on World Wide Web (WWW '09). ACM, New York, NY, USA, 351-360.

Rosalind W. Picard. 1995. Affective computing. MIT Media Laboratory Perceptual Computing Section Technical Report 321, November 1995.