Saturday, April 14, 2018

Pixel Privacy: Protecting multimedia from large-scale automatic inference

This post introduces the Pixel Privacy project, and provides related links. This week's Facebook congressional hearings have made us more aware of how easily our data can be illicitly acquired and used in ways beyond our control or our knowledge. The discussions around Facebook have focused on textual and behavioral information. However, if we think forward, we should realize that now is the time to also start worrying about the information contained in images and videos. The Pixel Privacy project aims to stay ahead of the curve by highlighting the issues and possible solutions that will make multimedia safer online, before multimedia privacy issues start to arise.

The Pixel Privacy project is motivated by the fact that today's computer vision algorithms have a super-human ability to "see" the contents of images and videos using large-scale pixel processing techniques. Many of us are aware that our smartphones are able to organize the images that we take by subject material. However, what most of us do not realize is that the same algorithms can infer sensitive information from our images and videos (such as location) that we ourselves do not see or do not notice. Even more concerning than automatic inference of sensitive information is large-scale inference. Large-scale processing of images and video could make it possible to identify users in particular victim categories (cf. cybercasing [1]).

The aim of the Pixel Privacy project is to jump-start research into technology that alerts users to the information that they might be sharing unwittingly. Such technology would also put tools in the hands of users to modify photos in a way that protects them without ruining them. A unique aspect of Pixel Privacy is that it aims to make privacy natural and even fun for users (building on work in [2]).

The Pixel Privacy project started with a 2-minute video:

The video was accompanied by a two-page proposal. In the next round, I gave a 30-second pitch followed by rapid-fire Q&A. The result was winning one of the 2017 NWO TTW Open Mind Awards (Dutch).

Related links:
  • The project was written up as a "Change Perspective" feature on the website of Radboud University, my home institution: Big multimedia data: Balancing detection with protection (unfortunately, the article was deleted after a year or so).
  • The project has also been written up by Bard van de Weijer for Volkskrant in a piece with the title "Digital privacy needs to become second nature" (in Dutch: "Digitale privacy moet onze tweede natuur worden").


[1] Gerald Friedland and Robin Sommer. 2010. Cybercasing the Joint: On the Privacy Implications of Geo-tagging. In Proceedings of the 5th USENIX Conference on Hot Topics in Security (HotSec’10). 1–8.

[2] Jaeyoung Choi, Martha Larson, Xinchao Li, Kevin Li, Gerald Friedland, and Alan Hanjalic. 2017. The Geo-Privacy Bonus of Popular Photo Enhancements. In Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval (ICMR '17). ACM, New York, NY, USA, 84-92.

[3] Ádám Erdélyi, Thomas Winkler and Bernhard Rinner. 2013. Serious Fun: Cartooning for Privacy Protection, In Proceedings of the MediaEval 2013 Multimedia Benchmark Workshop, Barcelona, Spain, October 18-19, 2013.

Monday, January 1, 2018

2018: The year we embrace the information check habit

The new year dawns in the Netherlands. The breakfast conversation was about the Newscheckers site in Leiden and about the ongoing "News or Nonsense" exhibition at the Netherlands Institute for Sound and Vision.

Signs are pointing to 2018 being the year that we embrace the information check habit: without thinking about it, doing a double check of the factuality and the framing of any piece of information that we consume in our daily lives. If the information will influence us, if we will act upon it, we will finally have learned to automatically stop, look, and listen: the same sort of skills that we internalized when we learned to cross the street as youngsters.

For me, 2018 is the year that I make peace with how costly information quality is. On factuality: I spend hours reviewing papers and checking sources. On framing: I devote a lot of time to looking for resources in which key concepts and processes are explained in ways that my students can easily understand. And too often I am prevented from working on factuality and framing by worrying about the consequences of missing something or making the wrong choices.

It is costly in terms of time and effort just to choose words. I need words to convey to the students in my information science course that the world is dependent on their skills and their professional standards: anyone whose work involves responsibility for communication must devote time and effort to information quality and must take constant care to inform, rather than manipulate.

What is the name for our era? I don't say "post-truth". An era can call itself "post-truth", but that's asking us to accept that it is fundamentally different from whatever came before---the "pre-post-truth" era. The moment we stop to reflect on how the evidence proves that we have shifted from truth to post-truth, we are engaging in truth seeking. Post-truth goes poof.

I don't say "fake news" era. I grew up with the National Enquirer readily available at the supermarket checkout counter, with its bright and interesting pictures of UFOs and celebrity divorces. That content wasn't there to contribute to building my mental model of reality, any more than Pac-Man was. "Fake news" has always been there.

My search for the right words continues. I am using the book Weaponized Lies by Daniel Levitin for the first time this year in order to teach critical thinking skills. Levitin uses words like "counterknowledge" and "misinformation". These are important terms, but they imply the existence of an intelligent adversary intentionally misleading us. It is important to defend against these forces. However, the idea that the problem is people putting effort into "weaponization" overlooks the less dramatic, and less easily identified, problem of reasoning from shaky, half-remembered information sources or using flawed logic to build arguments.

Now, at the end of the first day of 2018, I am staring at Weaponized Lies next to my keyboard, wishing there were shortcuts---that I didn't have to start from the bottom finding the words to talk about the importance of information quality, even before I start talking about information quality itself, and researching how to build safer, more equitable information environments.

There are no shortcuts. The only thing that we can hope for is that we can routinize the information check. Make it a habit.

I even stopped for a moment to dream about a rising demand for information quality creating new jobs. We need professionals who are able to help us monitor information without sliding into suppressing free speech and imposing censorship. This is the direction in which our knowledge society should grow.

I thought I remembered reading an article online that discussed 2018 as the "Information Year". Now, for the life of me, I cannot find it. It takes so long to track down and keep track of sources. My first step in making peace with the cost of information quality: I end this blog post by admitting I have no proof for my thesis that 2018 is the year we embrace the information check habit. The title is instead an expression of hope that we can move in that direction.