Monday, July 26, 2010

Advances in Multimedia Retrieval Tutorial at ACM Multimedia 2010

I worked a long day today and now I'm home, have eaten dinner, and am thinking about how to relax. I'd like to watch an episode of Merlin on the Internet. Preferably something legal that I've never seen before -- I wouldn't mind paying if it were a site I trusted. That seems like a pretty complicated search, and my prediction is that it will lead to frustration. So here I am writing in my blog instead.

Today the long description of our upcoming tutorial on Frontiers in Multimedia Search went online. We want to start out by addressing the question of how multimedia search can benefit people's daily lives, at work and otherwise. I'm feeling a rather strong need for the benefits of multimedia at the moment. If I can't have my Merlin, right now I wouldn't mind browsing back through recordings of the SIGIR presentations that I heard last week -- and maybe some of the ones that I missed.

Then we plan to take a look at new approaches to multimedia retrieval, which we divide into three categories (I include a couple of my own notes on each):
  • Making the most of the user: As we move around the Internet, we dribble information behind us. We tag, we query, we click, we brush over a page without a second glance. We have the capacity to glance at a set of snippets and skim over what does not interest us to find what does. Making the most of the user is about letting the search engine turn the computational crank and do the lookups, leaving the fine-grained semantic judgments to the human brain.
  • Making the most of the collection: Sometimes the collection can speak for itself. Pseudo-relevance feedback may dilute our queries, but it is also a valuable tool for increasing recall. And then there is collaborative filtering: making use of the patterns that we as users leave behind -- but now at the collection or community level.
  • Making the most of individual items: What is important here is how to do the best you can with noisy sources of features (speech recognition, visual concept detection) to represent items. You don't necessarily need to provide a complete representation of an item -- information that helps distinguish items, or keeps them from being confused with one another, can sometimes be a big help.
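To make the pseudo-relevance feedback idea above a little more concrete, here is a minimal sketch in Python: rank a toy collection against the query, then expand the query with frequent terms drawn from the top-ranked documents. The toy collection, the crude term-count scoring, and the parameter choices are all my own illustrative assumptions, not anything specific to the tutorial.

```python
# A minimal sketch of pseudo-relevance feedback for query expansion.
# Everything here (toy docs, term-count scoring, cutoffs) is illustrative.
from collections import Counter

def score(query_terms, doc):
    """Crude relevance score: count query-term occurrences in the document."""
    words = doc.lower().split()
    return sum(words.count(t) for t in query_terms)

def expand_query(query, docs, top_docs=2, extra_terms=2):
    """Expand the query with frequent terms from the top-ranked documents."""
    query_terms = query.lower().split()
    ranked = sorted(docs, key=lambda d: score(query_terms, d), reverse=True)
    counts = Counter()
    for doc in ranked[:top_docs]:
        counts.update(w for w in doc.lower().split() if w not in query_terms)
    return query_terms + [term for term, _ in counts.most_common(extra_terms)]

docs = [
    "merlin episode streaming legal site",
    "merlin arthur episode recap",
    "cooking dinner recipes tonight",
]
print(expand_query("merlin episode", docs))
# → ['merlin', 'episode', 'streaming', 'legal']
```

The expanded query now pulls in terms like "streaming" and "legal" that never appeared in the original query -- which is exactly how feedback can increase recall, and also how it can dilute a query when the top-ranked documents are off-topic.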
All in all, the aim is to present our favorite new multimedia search work -- working to inject new techniques and perspectives from the IR and speech communities into multimedia. And, of course, to develop our own understanding of multimedia search along the way. Maybe I will then also feel better equipped to find something I could watch now that would make me as happy as Merlin. Or at least to understand exactly what is still missing...