Today, I discovered an interesting segment of a video clip illustrating someone connecting search and spirituality. Search in a broader sense (beyond "information retrieval") does seem to have a lot to do with our belief systems and our relationship to a sense of higher purpose in life. Coming across a tangible example of the connection between finding information and someone's inner spiritual world made me stop and reflect. I was struck by the implications for the design of user experience with search engines. What responsibilities do we have as scientists in designing our algorithms and our applications if these then get incorporated into the personal, internal process of individual human beings to find meaning in their own lives by connecting with universal truth?
At the moment I am doing the final spot check on the development set for the MediaEval 2011 Genre Tagging release. I was checking out a video with the genre label personal_or_auto-biographical, one of the 26 categories that we are using this year.
I started playing this video to get an idea of what exactly it was about and I was amazed to listen to this guy and watch him speaking. Perhaps the reaction dates me. There is just a striking immediacy to it that I was not expecting. Apparently, he's alone in his car, and talking only for himself and for the camera.
Not really knowing who this is, or what happened to him later in 2009 when he stopped publishing episodes, is a bit of a science-fiction feeling for me. Watching his video, I am caught up in the present moment of someone I don't know, over two years after that moment actually occurred. This effect is quite contrary to what he himself is describing. He talks about remaining with himself (someone he nearly by definition must know well) in the present moment.
Or is my witnessing of this nameless present-moment occurring in the past actually simply a new kind of being present?
It certainly seems like it exists on some other plane. Although, I jump immediately to considering what it would take to track the guy down. Gerald Friedland gave a talk at our lab last week about Cybercasing, using geo-tagged information available online to mount real-world attacks. It's fresh in my mind, the array of possibilities for finding someone by following the trail they leave uploading multimedia to the Internet. One video doesn't seem to hurt, but we quickly lose the intuitions for how our uploading behavior might scale -- allowing people to find us on the basis of who we are and in terms of how we are vulnerable.
On the other hand, this yearning to be present in the moment is so universal, so common to so many, that it really doesn't make this guy so special. He's special, perhaps, in that he can operate a camera and get his video online. Also, clearly he has the gift to generate a speech stream that other people then identify as reflecting their own inner processes. But his specialness ends in a certain way right there. What he is saying is in a way so intensely personal that it once again becomes universal -- it's simply what we look like on the inside -- like the pictures that they show us in grade school of the chambers of our hearts and the insides of our large intestines. This video was in that sense made to be lost in the multimedia avalanche of the Internet.
The guy mentions a name in his metadata, Eckhart Tolle, and I followed the trail and very quickly realized, by clicking into an Eckhart Tolle YouTube video, that Eckhart Tolle is who my mystery guy is talking about rather than who he is himself. That brought a smile, since this distinction is one that we've previously observed as important for speech media [1].
I listened to Eckhart Tolle for a bit, pondering the metaphor involving the universal similarity of people's large intestines. All of a sudden Eckhart Tolle is saying, "The mind even started to look at ads for flying back to England, fares, and then the impulse came..." He's sort of hesitating, so you wonder if he's also finding this a little strange, but for me it just seemed like a moment in which search for information is clearly playing a central role in what we would otherwise call our own internal states that make up part of our spirituality. It's the kind of search that we would do nowadays with a search engine.
Eckhart Tolle goes on to talk about "obedience to what came out of the present moment"... it guided his decision-making process on where to be when. He goes on to say, "...don't do it on an impulse that is a restless impulse or comes out of any kind of negative emotion". If people listen to what he is saying, and a lot do, and if they combine their search for information with interaction with search engines, I land at the following conclusion: our individual spiritual development is not disconnected from our search engines and especially not from our experience of interacting with them.
In the end, the reason I blog about this might just be that I want to use the YouTube link to that Eckhart Tolle video that will take you right to the jump-in point that I am writing about (http://youtu.be/K1_R3uKJOB4?t=4m18s). Goodness knows how much time I've spent discussing video fragment linking and trying to get research money to work on it as a searching speech problem -- I really get a kick out of being able to link into the stream.
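For the curious, here is a minimal sketch of how the jump-in point could be read back out of a link like that one. The only thing taken from the link itself is the `t=4m18s` form of the offset; the function name `start_offset_seconds` and the assumption that offsets are written either as plain seconds or as an `h`/`m`/`s` string are mine, not a description of how YouTube handles it internally.

```python
import re
from urllib.parse import urlparse, parse_qs


def start_offset_seconds(url: str) -> int:
    """Return the jump-in point carried by a `t=` query parameter, in seconds.

    Hypothetical helper: assumes the offset is either plain seconds ("258")
    or an h/m/s string such as "4m18s". A link without `t=` starts at 0.
    """
    query = parse_qs(urlparse(url).query)
    raw = query.get("t", ["0"])[0]

    # Plain number of seconds.
    if raw.isdigit():
        return int(raw)

    # Optional hours, minutes and seconds, in that order.
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?", raw)
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognized time offset: {raw!r}")

    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


if __name__ == "__main__":
    # 4 minutes and 18 seconds into the video.
    print(start_offset_seconds("http://youtu.be/K1_R3uKJOB4?t=4m18s"))  # 258
```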
We are late releasing the data for the MediaEval 2011 Genre Tagging task. The initial delay was small, but then other things just got in the way, compounding the situation. Today, I am trying to be very present in the moment, in order to ignore the stress that I feel about being so late and be very careful about getting the release right the first time around.
And today's experience reminds me of how careful we need to be in all our research. If our search engines are part of our spiritual worlds, we need to design our algorithms and applications with awareness of their potential impact on the trails that we are following in our paths of personal development and on the collective, common digestive system of humanity.
[1] Besser, J., Larson, M., Hofmann, K., Podcast Search: User Goals and Retrieval Technologies, Online Information Review: The international journal of digital information research and use, Vol. 34, No. 3, pp. 395-419, 2010.
Saturday, June 11, 2011
Saturday, June 4, 2011
LikeLines: Crowdsourced intelligent multimedia player
Today we're putting the final touches on LikeLines: Video highlights via web-scale aggregation of moments that viewers like, our entry to the Mozilla Drumbeat Unlocking Video challenge, which is closing tomorrow.
The challenge addresses the question "How can new web video tools transform news storytelling?" Our answer to this question is the paradigm of distributed directing that allows news reports to be generated automatically, but without a central reporter. The raw material is footage captured by individuals with cameras and mobile phones who witness an event. One challenge faced is how to filter this footage: in particular, how to find the most interesting points? LikeLines gives the answer to this question.
The LikeLines concept is basically a heatmap that shows how many people found certain portions of a video interesting. You use it if you don't want to watch a video all the way through. Instead, you click the heatmap to jump in to just the places that are worth watching and start watching from there. What's worth watching is decided on the basis of what other viewers found worth watching -- either they tag those segments explicitly by clicking a "like" button or else they let the player record their stopping, starting and cueing behavior. We also want the player to be able to make use of multimedia content analysis (visual analysis or speech recognition) in order to be able to "seed" interesting moments. This sort of seeding user contributions with multimedia content analysis has been used by our colleagues:
Ewine Smits and Alan Hanjalic. 2010. A System Concept for Socially Enriched Access to Soccer Video Collections. IEEE MultiMedia 17, 4 (October 2010), 26-35.
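To make the heatmap idea a bit more concrete, here is a minimal sketch of how explicit "likes" and recorded playback ranges could be aggregated into per-bin scores. It is an illustration under my own assumptions, not the LikeLines implementation: the function names, the 5-second bin size, and the weights that count an explicit like more heavily than passive watching are all hypothetical.

```python
import math


def build_heatmap(duration_s, likes, watched_ranges,
                  bin_s=5, like_weight=3.0, watch_weight=1.0):
    """Aggregate viewer signals into a list of per-bin interest scores.

    duration_s     -- length of the video in seconds
    likes          -- time points (seconds) at which viewers clicked "like"
    watched_ranges -- (start, end) pairs in seconds that viewers played back
    The bin size and weights are illustrative assumptions.
    """
    n_bins = math.ceil(duration_s / bin_s)
    heat = [0.0] * n_bins

    # Explicit signal: each like adds weight to the bin it falls in.
    for t in likes:
        if 0 <= t < duration_s:
            heat[int(t // bin_s)] += like_weight

    # Implicit signal: each played range adds weight to every bin it touches.
    for start, end in watched_ranges:
        start, end = max(0, start), min(duration_s, end)
        if end <= start:
            continue
        first = int(start // bin_s)
        last = math.ceil(end / bin_s) - 1
        for b in range(first, last + 1):
            heat[b] += watch_weight

    return heat


def hottest_jump_in(heat, bin_s=5):
    """Return the start time (seconds) of the highest-scoring bin."""
    best = max(range(len(heat)), key=heat.__getitem__)
    return best * bin_s


if __name__ == "__main__":
    heat = build_heatmap(30, likes=[12, 13], watched_ranges=[(10, 20)])
    print(heat)                   # [0.0, 0.0, 7.0, 1.0, 0.0, 0.0]
    print(hottest_jump_in(heat))  # 10
```

Clicking the heatmap then amounts to seeking the player to the start of a chosen bin; `hottest_jump_in` simply picks the single most popular one.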
Our entry is in the form of video:
The video has been finished for a while now, and now I am just adding some text to make it clear that the idea is elegant, but also quite clever in that you combine user input and multimedia content analysis, which allows you to bootstrap from raw video.
It's a little crazy trying to write, because I need to switch out of research paper mode into the mode of "hey, this will really work" and "hey look everyone, this is totally needed, totally non-trivial and totally does not exist anywhere yet". Right now (or at least until I stopped to write a blog post) I am working on a sentence communicating that we can address the cold start issue with content-analysis-based seeding. And that verification using content analysis will help to control spam. And that if all goes well the whole thing should be able to learn by itself: It will require some R&D effort, but all the pieces of technology needed already exist.
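One simple way to picture the cold start argument is to treat the content analysis scores as a prior that viewer feedback gradually overrides. The sketch below uses plain pseudo-count smoothing, which is my own stand-in for illustration and not something we have committed to; `blend_with_seed`, `prior_strength`, and the assumption that seed scores lie between 0 and 1 are all hypothetical.

```python
def blend_with_seed(user_heat, seed_scores, n_viewers, prior_strength=10.0):
    """Blend per-bin viewer feedback with content-analysis seed scores.

    user_heat      -- per bin, how many viewers signaled interest (0..n_viewers)
    seed_scores    -- per bin, a content-analysis score assumed to lie in [0, 1]
    n_viewers      -- number of viewers who have contributed so far
    prior_strength -- how many "virtual viewers" the seed is worth (assumption)

    With no viewers the result equals the seed; as viewers accumulate,
    the result moves toward the observed fraction of interested viewers.
    """
    if len(user_heat) != len(seed_scores):
        raise ValueError("user_heat and seed_scores must cover the same bins")

    total = prior_strength + n_viewers
    return [(prior_strength * seed + heat) / total
            for heat, seed in zip(user_heat, seed_scores)]


if __name__ == "__main__":
    seed = [0.2, 0.8]
    # Brand-new video: only the content analysis has an opinion.
    print(blend_with_seed([0, 0], seed, n_viewers=0))
    # After 90 viewers, all of whom liked the first bin and none the second.
    print(blend_with_seed([90, 0], seed, n_viewers=90))
```

The same comparison between seed and viewer feedback hints at the spam angle: a bin that receives many likes but that content analysis scores near zero could be flagged for a closer look, although that check is not part of this sketch.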
Also, I had a little bit of trouble getting the right tone for the biography. So we're big shots at a cool technical university in the Netherlands? I guess that's important to communicate. But how to say that we are also passionate about supporting distributed and democratic news? Do I divulge that the first draft of LikeLines was churned out on a bus from Boston to Portland, the video was recorded in a long after-hours effort, and the whole thing has been discussed in every detail in chat sessions?
And how to communicate that we are doing this because it's what we love to do? We had some light-hearted lines in the bio to convey this tone (about our cat-video habit on YouTube and about me largely eschewing social media for the traditional postcard), but those got dumped in favor of some harder-hitting facts about our experience in this area: right people, right skills, right place, right time...
In the end, I'm also in this to experience the crowdsourcing aspect of working on innovation in an open collaboration environment. What a breath of fresh air in the daily grind of publish and perish. And the giddy joy of communicating a concept that is ripe, feasible and useful.