At the end of last year, VideoCLEF became MediaEval. I thought it was a great name for a multimedia retrieval benchmark evaluation and mashed up a new logo in an enthused rush. When I needed to go back and find the original illuminated "M" that I used, it seemed to be the perfect job for content-based image search. I recalled the Best Paper from ACM Multimedia 2009 on Visual Query Suggestion and headed off to Bing image search to try my hand at some combined text and image search.
I quickly found myself wishing that I had more options. In particular, I wanted to choose more than a single image at a time that was related to my query. The VIPER group at the University of Geneva has a Cross-Modal search engine that lets you select multiple relevant images for each feedback iteration. You can also select a set of images for negative feedback, which would have been helpful.
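To make the idea of positive and negative feedback concrete, here is a minimal sketch of Rocchio-style relevance feedback, a classic textbook technique. It is an illustration only, not a description of what the VIPER engine or Bing actually does. The feature vectors, the function name `rocchio`, and the weights `alpha`, `beta` and `gamma` are all assumptions made for the example.

```python
def rocchio(query, positives, negatives, alpha=1.0, beta=0.75, gamma=0.15):
    """Move a query vector toward relevant examples and away from rejected ones.

    query:     list of floats (hypothetical image/text feature vector)
    positives: list of feature vectors the user marked as relevant
    negatives: list of feature vectors the user marked as not relevant
    The default weights are common illustrative values, not tuned settings.
    """
    def centroid(vectors):
        # An empty feedback set contributes nothing to the update.
        if not vectors:
            return [0.0] * len(query)
        return [sum(column) / len(vectors) for column in zip(*vectors)]

    pos = centroid(positives)
    neg = centroid(negatives)
    return [alpha * q + beta * p - gamma * n
            for q, p, n in zip(query, pos, neg)]


# One feedback round: two images marked relevant, one marked not relevant.
new_query = rocchio(
    [1.0, 0.0],
    positives=[[1.0, 1.0], [1.0, 3.0]],
    negatives=[[0.0, 2.0]],
    alpha=1.0, beta=0.5, gamma=0.5,
)
```

Selecting several relevant images at once simply means the positive centroid is averaged over more examples, which is why being limited to a single image per round feels so restrictive.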
But for this particular search, the Bing option to limit the search to black & white images proved helpful. After a few iterations, I came up with some nice-looking results that gave me a sense that I was really moving in the right direction.
However, my search did not return the "M" that I had originally used. I went to Google Images, formulated and reformulated. "Letter M illuminated", "Medieval manuscript M", "Illuminated medieval letter"... nothing seemed to help. Arg! Isn't this task easy? Shouldn't this just be duplicate detection?
Then I remembered that when I was looking for the original "M", I wanted to make sure that there would be no licensing issues so that MediaEval could use it freely. I had been experimenting at the time with the Creative Commons search engine, so I went back there and put in the simplest of all possible queries: "Illuminated M."
Bingo. The original M from the Chronica Polonorum on Wikimedia Commons.
How often, when we are searching, do we remember that Web search is all about recall? Multimodal relevance feedback may expand our queries, but it also limits our results. If I weren't engaging in known-item search, I would never have known about the "M" I was missing. Similarity along radically simplistic visual dimensions is useful, but inevitably something will fall between the cracks. Thankfully it seldom seems to matter, but we shouldn't let our awareness that we might be missing something slip from our consciousness.
The more interesting observation was that the key to re-finding my image was reconstructing the way that I found it in the first place. Not only knowledge about the "M", but also detailed knowledge of where and how I should be looking for it turned out to be critical.
The search process is entertaining in and of itself. I am not going to reveal how much time I was willing to devote to finding that "M" and browsing through the images that Bing came up with as similar. The visual feedback did turn up a useful by-product -- not the direct target of my search: a beautiful high-resolution "M" that should satisfy gripes about the low quality of our MediaEval logo.