Showing posts with label YouTube. Show all posts

Friday, February 3, 2012

Google's new privacy policy will further squelch my social life, and possibly also kill Christmas

So yesterday I saw some screen flash up again from Google concerning the new privacy policy. But I was so intent on logging into Gmail to chat my Groundhog Day greetings out to my little social world that I clicked on 'Got it' without really a second glance at all of that carefully crafted text that they had written on the page. Hmm. Looks pretty lean for a privacy policy, but I have a bad feeling. I don't think I'll like this.

And I don't. Just as surely as the groundhog saw his shadow yesterday and we received a generous blanket of snow today, I believe there's another six weeks (well, five weeks and six days) of winter coming.

And there's some seriously chilly weather coming on in my social life. Google is going to mix up my Gmail, which is a chaotic mixture of shopping and friends, with my YouTube account, which I use to post video for work. The result is sure to be some weird distorted echo of myself that's neither productive nor fun.

I mean, as kids we do the thing of taking a range of food items out of the refrigerator and putting them in the blender. It starts out a strawberry milkshake, but then you get the idea to add some capers and a spoonful of peanut butter and then some powdered sugar and a zap of lemon juice. It's fun to mix stuff up, but the result is usually pretty gross, and the main life lesson we learn from this experiment is that there truly is a reason why we generally like to eat our food in separate dishes and don't mash everything together in the middle of the plate.

Now I have to worry about chats from my friends contaminating the little YouTube garden I have been cultivating on the topic of multimedia information retrieval. Am I going to be looking at my YouTube page with colleagues at work and have to wonder if they will see recommendations for videos of things that I have been chatting about? Is my video recommendation list about to be infected by groundhogs? Other rodents?

Today someone wrote me an email mentioning a Wizards/Raptors game. I have no idea what sport this is: and YouTube's going to be mixing it up with the videos I actually want to see? Am I going to write back and say "Please, include no Wizards or Raptors in your messages to me"? Oh dear, that's really going to go down well with my friends.

For me, it's a cold chill without the mitigating effect of a truly cute and relevant rodent or the promise of a mere six weeks' duration.

This year, I will definitely be worrying about accidentally doing a Google search in the presence of one of my family members and having my Christmas shopping list lifted from my email and plunked down in the ads or the personalization of my search results. What happened to surprising people with an unexpected holiday gift? Who would have thought that Google would turn out to be the Grinch who Stole Christmas? Do I seriously have to hesitate before spontaneously turning my laptop screen so that someone else can see my browser and I can help them find something on the Internet?

The New York Times tells us that the EU is pressing Google to delay the new privacy policy until the implications can be better understood. I am not the only one, apparently, with the bad feeling. It looks lean, but it's not at all a reassuring lean.

Social media is about the joy of the spur of the moment chat -- and what I have right now is a bad taste in my mouth. My natural human urge to share spontaneous Groundhog greetings prompted me initially to click right through the page of information on the new privacy policy without reading it in detail. That was not a responsible click. That bad taste is not only the oncoming strawberry, caper, peanut butter milkshake, but also the yet further erosion of my ability to trust my social intuitions of what topics to raise when with whom. Google seems to want to help make that decision for me.

It looks like I'll be pulling the adjective 'insidious' off the shelf, dusting it off and using it more often.

Just look at this blog post: it's rodents, raptors, strawberries, videos and snow. Any algorithm that's trying to make me a content recommendation on the basis of this text is going to come up with 'quirky', nothing more. Mix all my personal stuff together, Google, and you get...well, useless mud: It is useless, strawberry flavored mud with perhaps a hint of groundhog. You don't need this stuff, Google. Think of something else and don't send your privacy policy off in completely the wrong direction.

Saturday, December 24, 2011

Peaceable Kingdom: Snowflake button for YouTube snow


A post that mixes the magic and delight of the holiday season with multimedia information retrieval? Let's try it and see what happens.

Over the past couple of weeks, holiday cards have been dropping through the mail slot in the front door -- but emails have also been entering my inbox: greetings, photos, and yes, also videos. This morning it was an email with a greeting and a link to a music video, "Peaceable Kingdom".

I watched the video for a while and pondered its relationship with Christmas: The music is melodious, soothing and the lyrics take the listener to the manger to make the connection with the adoring state of mind of those who gathered there the first Christmas Eve. Unexpected minor cadences highlight that this is no usual Christmas carol and invite consideration of the multiplicity of the Christmas experience -- how the holiday itself integrates traditions preceding Christianity and how, as each new group and generation reinvents it for their own spirit and needs, it will continue to develop into some future Christmas. From the perspective of the here and now, that future Christmas could seem full of sweetness, hope and light, but also distorted and distinctly pagan.

Of course, the strongest signal I get from the video is that of Margaret Atwood's dystopic visions. I haven't read The Year of the Flood, but what has been written and said about the book has so fascinated and disturbed me that the existence of the book itself as a text seems somehow less important to me -- the setting is already so palpable that what it tells is, in a way, no longer left to be said.

In the end, maybe my personal Christmas feeling associated with the video is that it gives me a chance to spend some time feeling close to the person who sent it to me. The strength of this feeling of connection goes beyond -- indeed exists in a completely different life dimension from -- my reflections on meta-text usurping text or on the length of time that has transpired since I last sat down and read a worthwhile book not related to work.

Where is the multimedia information retrieval tie-in? Well, first, as a result of this video it has occurred to me for the umpteenth time that we need a verb other than "watch" to describe this kind of interaction with this video. It's a music video, so I am mainly listening to it and then looking at the visual stimuli. There could potentially be rather large changes in the visuals -- different pictures, different editing -- and these changes could possibly leave my watching experience largely untouched. I would argue, if I were only "watching", these elements would necessarily have a major defining impact on my experience. They don't. Here, I am rather "watch/listening", which I suppose could give us the new concept of "wistening".

There's a second tie-in as well: There is a little snowflake in the player bar, which I discovered after "wistening" for a while. I usually find snowflake icons ambiguous: especially on climate control units in strange hotel rooms -- do I turn the setting to "snowflake" if it's cold outside or is the "snowflake" setting going to cause the system to start producing cool? I've encountered both. So I've learned just to click on the snowflake and see what happens...

I clicked.

And lo and behold it started snowing. Right into the Peaceable Kingdom -- flakes floating down slowly -- different sorts of flakes at different speeds -- and accumulating at the bottom of the frame. I felt the smile spread on my face -- and grow wider as I realized that I was witnessing one little bit of a sort of world-wide holiday miracle as people in front of screens around the planet discover that you can make it snow on YouTube. I thought about people watching this on their laptops and tablets, using the mouse to play a bit in the snow and then gathering their friends, colleagues, family around their screens in one big Christmas "You gotta check this out!"

Apparently, you can't do this to every video: and this is where it really starts getting interesting to me. How did YouTube decide which videos to add this feature to? There must have been some multimedia classification algorithm that maybe looked for keywords in the title and description and something like music in the audio channel or colors in the visual channel and combined this with the upload date -- and then enabled "snow" for this video.
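To make that guess a little more concrete, here is a minimal sketch of what such a rule might look like. Everything in it is an assumption for illustration: the keyword list, the weights, the threshold, and the `music_score` and `cool_color_ratio` fields (imagined outputs of an audio classifier and a color analysis) are all hypothetical, and nothing here reflects how YouTube actually chooses videos.

```python
import re
from datetime import date
from typing import NamedTuple

# Hypothetical keyword list; a real system would presumably learn this.
HOLIDAY_KEYWORDS = {"christmas", "snow", "winter", "holiday", "carol"}


class Video(NamedTuple):
    title: str
    description: str
    music_score: float       # assumed audio-classifier output in [0, 1]
    cool_color_ratio: float  # assumed share of white/blue pixels in [0, 1]
    upload_date: date


def keyword_score(video: Video) -> float:
    """Score 1.0 once two or more holiday keywords appear in the text."""
    text = f"{video.title} {video.description}".lower()
    words = set(re.findall(r"[a-z]+", text))
    return min(len(words & HOLIDAY_KEYWORDS) / 2.0, 1.0)


def snow_score(video: Video) -> float:
    """Combine text, audio, visual and date evidence with made-up weights."""
    score = (
        0.4 * keyword_score(video)
        + 0.3 * video.music_score
        + 0.2 * video.cool_color_ratio
    )
    # Small bonus for videos uploaded in November or December.
    if video.upload_date.month in (11, 12):
        score += 0.1
    return score


def enable_snow(video: Video, threshold: float = 0.5) -> bool:
    """Decide whether to show the snowflake button (threshold is arbitrary)."""
    return snow_score(video) >= threshold
```

A hand-weighted sum like this is only a starting point; the interesting research question is how to learn the weights, which requires knowing when the snow decision was right.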

I want to make these kinds of algorithms! How do we put everything that we know how to do in terms of multimodal video processing and machine learning and figure out for which videos it needs to be able to snow?

And it's not just snow. There are other ways in which this could go -- and should go -- it has potential to cause so much joy. I am sitting here "wistening" and thinking about friends and family and playing in the snow, but it's clear that we need to go beyond "wistening" and we need a verb for watching+listening+reflecting+playing. It's also clear that we need the technologies that support these activities. Imagine a search engine that can find videos that are appropriate for 'snow': that goes so far beyond user information needs as they are currently conceptualized for multimedia that it sort of takes your breath away.

How to enable the multimedia community to work at these new (from the perspective of this moment, utterly fantastic) frontiers?

The key to doing work in this direction is evaluating it. How do we know if we were right in presenting the snow option for a given video? YouTube is probably analyzing its interaction logs at this very moment. But I hate to think that I need to go to work for YouTube in order to ever be able to do the evaluation necessary to write a paper on this topic. Everyone loves the snow, so everyone should be able to work to make it better.
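One simple way to frame that evaluation is as set-based precision and recall. The sketch below assumes we somehow had a ground-truth set of videos where snow was "right", for instance videos on which viewers actually clicked the snowflake. That ground truth, and the video IDs in the usage example, are hypothetical; I have no access to such logs.

```python
def precision_recall(predicted, relevant):
    """Compare the videos we enabled snow for against a ground-truth set.

    predicted: IDs of videos for which the system offered snow.
    relevant:  IDs of videos where snow was judged appropriate
               (hypothetical ground truth, e.g. derived from click logs).
    """
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Usage with made-up video IDs:
p, r = precision_recall({"v1", "v2", "v3", "v4"}, {"v1", "v2", "v5"})
print(f"precision={p:.2f} recall={r:.2f}")
```

Precision tells us how often the snowflake showed up where it belonged; recall tells us how many snow-worthy videos we missed. Without a shared ground truth, neither can be computed outside YouTube, which is exactly the problem.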

Note to self qua New Year's resolution: Keep up commitment to evaluation -- we need it to push ourselves forward into the unknown in a meaningful way. Maybe it's what actually makes the difference between what we call computer science and what we call art. But I'll leave that thought to another day.

In the meantime, the overall conclusion is that holidays and multimedia information retrieval do indeed mix well in a blog post. So happy holidays (and enjoy the video):

Saturday, October 9, 2010

Drink recommendation

Within the last ten days I've been in Asia, Europe and North America. I've taken jetlag to a new level. Usually there is a reference point; you can say, "It's past midnight in the Netherlands at the moment, my internal clock thinks it's past my bedtime and that's why I am so tired." Now I have no clue what time my internal clock reads.

At the grocery store, I just picked out a four-pack of energy drinks in order to try to jump-start myself and get re-aligned with the cycle of the sun at my current location. I stood for ten minutes in front of the selections, looking at the cans and then reading the labels. I wanted something not too expensive, sugar free and also with guarana. A Brazilian colleague had recommended guarana as one of the best "pick up" ingredients you can get in an energy drink.

What I could use is a good drink recommendation system. The Asian part of this odyssey took place in Tokyo, and the following video was what YouTube there listed as a popular video. It had received 44466 views in the one day since it had been uploaded.

1 day ago, 44466 views



It is a news report on a drink vending machine (a Tokyo fixture) that recommends drinks by taking your picture and doing a little bit of multimedia content analysis that gives it clues as to your age and gender.

In my current situation, age and gender wouldn't have been enough. Rather the system would need information about my internal state -- the camera would have to have noticed the unfocused glaze of my tired eyes. In this situation, internal-state information could be inferred if the system had access to information about my geo-coordinates within the last ten days. Access to a recent history of my sleeping-waking pattern would provide an even better source of evidence.

However, another key bit of information that would be critical for getting to the correct drink is that at the moment I do not want to be tired. I can't be tired. I don't want something that will relax me -- no chamomile, not yet. I need to work.

The bottom line is clear: barring a system that has access to all that information and the ability to use it in the right way, the Brazilian colleague remains the best source of drink recommendations.

And it looks like the drink is working already, since I have already reached a level of alertness to attempt a blog post.

Wednesday, March 31, 2010

and I have transcended

Here's a screen shot from my YouTube video about the intelligent multimedia player. I've had YouTube do the automatic transcription, which you can see displayed as a caption at the bottom.

Captioning my video with the YouTube speech recognition service got me thinking about an op-ed piece that appeared in the International Herald Tribune last month: Typing with a Voice by Stewart Wachs. As the title suggests, it's about using the computer via a speech recognition interface rather than a keyboard. Unsurprisingly, his speech recognition software generates some wildly off-the-mark transcriptions of what he's saying -- and these serve to make the op-ed quite amusing. But what sticks and doesn't let me go is the final turn of the article, where he talks about a flash of insight that allowed him to start extending empathy to the software. This change of attitude made the mis-recognitions less frustrating, allowing him to work with the technology instead of pulling in the opposite direction.

He worries that his compassion for the software is misplaced, citing the pathetic fallacy, the mistaken attribution of human characteristics to an inanimate object. The compassion is useful, nonetheless, he observes.

Does this explain my own fascination with speech recognition that has extended over more than a decade now? Is it some sort of innate ability, a gift I have to empathize with the software? Perhaps. If there is indeed compassion involved, more likely it arises not from working with inanimate software, but rather from the accumulation of interactions with speech recognition researchers I have had over the years. It stems from an appreciation for their creativity and hard work and especially their patience with an external world where word error rate rules and where anything less than 99% accuracy is declared a priori useless. Humans, they patiently point out, don't achieve 99% accuracy in transcription, and speech transcripts can be useful for speech retrieval even with error rates over 50%.
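For readers who have not met the measure: word error rate is conventionally the number of word substitutions, deletions and insertions needed to turn the recognizer's output into the reference transcript, divided by the number of words in the reference. A minimal sketch of that computation follows; the whitespace tokenization and lowercasing are my own simplifying assumptions, and real scoring tools normalize text far more carefully.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Note that the rate can exceed 100% when the recognizer inserts many extra words, which is one reason a single number says so little about whether a transcript is still useful for retrieval.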

Sometimes you administer a bit of tough love, however, like on the day I decided to test drive the new YouTube automatic transcription service. The video I transcribed was recorded using the built-in mic on my Canon PowerShot, which was about 2 meters away from me, and there was also quite an echo in the kitchen where I recorded the video. If you were squinting at the small screen shot above to read the transcript generated by Google, I'll now reveal that it reads, "and I have transcended weeks." Hmm, maybe I have transcended -- reached some sort of a plane where I embrace rather than push away the unexpected outputs of automatic speech recognition. They say, after all, that the speech recognition problem is AI-complete, as difficult as artificial intelligence. Maybe there is no other reasonable choice.

In reality, what I am saying at this point in the video is "...and our friends send us links!" What does this have to do with search? I say "send", YouTube hears "scend": not much common ground, but with a syllable-level indexing system it would be enough to help locate this quote within the video. That's a lot better than what we have right now.
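Here is a rough sketch of the idea. Proper syllable-level indexing works on syllable or phoneme units produced by the recognizer; as a stand-in, this sketch indexes overlapping character trigrams of the transcript words, which is a much cruder technique but shows the same effect of matching below the word level. The timestamps are invented for illustration.

```python
from collections import defaultdict


def trigrams(word):
    """Overlapping three-character pieces: a crude stand-in for syllables."""
    w = word.lower()
    return {w[i:i + 3] for i in range(len(w) - 2)}


def build_index(transcript):
    """Map each trigram to the transcript positions whose word contains it.

    transcript: list of (start_seconds, word) pairs.
    """
    index = defaultdict(set)
    for position, (_, word) in enumerate(transcript):
        for gram in trigrams(word):
            index[gram].add(position)
    return index


def search(query, transcript, index):
    """Return (start_seconds, word, overlap) ranked by shared trigrams."""
    grams = trigrams(query)
    hits = defaultdict(int)
    for gram in grams:
        for position in index.get(gram, ()):
            hits[position] += 1
    ranked = sorted(hits.items(), key=lambda item: (-item[1], item[0]))
    return [
        (transcript[pos][0], transcript[pos][1], count / len(grams))
        for pos, count in ranked
    ]


# Hypothetical timestamps for the mis-recognized caption.
caption = [(12.0, "and"), (12.3, "I"), (12.5, "have"),
           (12.8, "transcended"), (13.6, "weeks")]
print(search("send", caption, build_index(caption)))
```

A word-level index would find nothing for the query "send" in this caption, while the sub-word index still points at the moment where "transcended" was recognized, which is where the quote actually occurs.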