Friday, October 29, 2010

ACM Multimedia SSCS 2010 Workshop on Searching Spontaneous Conversational Speech

The Fourth Workshop on Searching Spontaneous Conversational Speech took place on 29 October 2010 at ACM Multimedia. Papers were presented about techniques for speech retrieval, speaker role recognition, spoken term detection and concept detection. Invited speakers addressed challenges for the future of spoken content retrieval, including interview data, multimedia archives and the Spoken Web. The demonstrations were a highlight of the workshop. These were first introduced in a boaster session and then presented to workshop participants in an interactive session. Here's the Wordle word cloud made from the titles and abstracts of all the papers presented!

Currently, we are getting ready for an upcoming special issue on searching speech in ACM Transactions on Information Systems.

Sunday, October 24, 2010

MediaEval 2010 Workshop Report

We were delighted that Bill Bowles attended the MediaEval 2010 workshop and that he made us our own MediaEval video trailer, in which he tells the story of MediaEval from his own point of view. The MediaEval 2010 Affect Task was devoted to analyzing Bill's travelogue video from his Travel Project and ranking it by how boring viewers reported it to be. For a filmmaker, another rational reaction would have been, "Who are these people, and what did they do to my video? I don't want to get anywhere near them!" But instead, he came, participated and told us about ourselves using the very same medium we devote so much effort to studying.



I was amazed at how quickly this video accumulated views; it soon outstripped any video I've ever posted to the Internet. However, if video is not your thing and you want the text version of what happened, here is the text of a workshop report written for a project newsletter.

MediaEval 2010 Workshop Report

The MediaEval 2010 workshop was held on Sunday, October 24, 2010 in Pisa, Italy at Santa Croce in Fossabanda. MediaEval is a benchmarking initiative for multimedia retrieval, focusing on speech, language and contextual aspects of multimedia (geographical and social context) and their combination with visual features. Its central sponsor is the PetaMedia Network of Excellence. In total, four tasks were run during MediaEval 2010. To approach the tasks, participants could make use of spoken, visual, and audio content as well as accompanying metadata. Two ‘Tagging Tasks’ (a version for professional content and one for Internet video) required participants to automatically predict the tags that humans assign to video content. An ‘Affect Task’ involved automatic prediction of viewer-reported boredom for travelogue video. Finally, a ‘Placing Task’ required participants to automatically predict the geo-coordinates of Flickr video. The Placing Task was co-organized by PetaMedia and Glocal. It was also given special mention in the talk of Gerald Friedland entitled “Multimodal Location Estimation” in the “Brave New Ideas” session at ACM Multimedia 2010.

During the MediaEval 2010 workshop, researchers presented and discussed the algorithms developed and the results achieved on the MediaEval 2010 tasks. The workshop drew 29 participants from 3 continents. More information about the 2010 results, including participants’ short working notes papers, is available at: http://www.multimediaeval.org/mediaeval2010
Currently, MediaEval 2010 participants are working towards a special session at the 2011 ACM International Conference on Multimedia Retrieval (ICMR 2011), which will be dedicated to presenting extended results on MediaEval 2010 tasks.

MediaEval 2011 will be organized again with sponsorship from PetaMedia and in collaboration with other projects from the Media Search Cluster. The task offering in 2011 will be decided on the basis of participants' interest, assessed, as last year, via a survey. At this time, we anticipate that we will run a Tagging Task and a Placing Task as well as a couple of other innovative new tasks, as dictated by popularity. If you are interested in participating in MediaEval 2011 or if your project would like to organize a task, please contact Martha Larson (m.a.larson@tudelft.nl). Additional information on MediaEval 2011 is available on the website: http://www.multimediaeval.org

Saturday, October 9, 2010

Drink recommendation

Within the last ten days I've been in Asia, Europe and North America. I've taken jetlag to a new level. Usually there is a reference point: you can say, "It's past midnight in the Netherlands at the moment, my internal clock thinks it's past my bedtime and that's why I am so tired." Now I have no clue what time my internal clock reads.

At the grocery store, I just picked out a four pack of energy drink in order to try to jump start myself and get re-aligned with the cycle of the sun at my current location. I stood for ten minutes in front of the selections, looking at the cans and then reading the labels. I wanted something not too expensive, sugar free and also with guarana. A Brazilian colleague had recommended guarana as one of the best "pick up" ingredients you can get in an energy drink.

What I could use is a good drink recommendation system. The Asian part of this odyssey took place in Tokyo, and the following video was what YouTube there listed as a popular video. It had received 44466 views in the one day since it had been uploaded.

1 day ago, 44466 views



It is a news report on a drink vending machine (a Tokyo fixture) that recommends drinks by taking your picture and doing a little bit of multimedia content analysis that gives it clues as to your age and gender.

In my current situation, age and gender wouldn't have been enough. Rather, the system would need information about my internal state -- the camera would have to have noticed the unfocused glaze of my tired eyes. In this case, internal-state information could also be inferred if the system had access to information about my geo-coordinates within the last ten days. Access to a recent history of my sleeping-waking pattern would provide an even better source of evidence.

However, another key bit of information, critical for getting to the correct drink, would be that at the moment I do not want to be tired. I can't be tired. I don't want something that will relax me -- no chamomile, not yet. I need to work.

The bottom line is clear: barring a system that has access to all that information and the ability to use it in the right way, the Brazilian colleague remains the best source of drink recommendations.

And it looks like the drink is working already, since I have reached a level of alertness sufficient to attempt a blog post.