Sunday, October 27, 2013

Power behind MediaEval: Reflecting on 2013

Power behind MediaEval 2013
The MediaEval 2013 Workshop was held in the Barri Gòtic, the Gothic Quarter of Barcelona, on 18-19 October, just before ACM Multimedia 2013 (also held in Barcelona). The workshop was attended by 100 participants, who presented and discussed the results that they had achieved on 11 tasks: 6 main tasks and 5 "Brave New Tasks" that ran for the first time this year.

The workshop produced a working notes proceedings containing 96 short working papers, which has been published by CEUR-WS.org.

The purpose of MediaEval is to offer tasks to the multimedia community that support research progress on multimedia challenges that have a social or a human aspect. The tasks are autonomous and each is run by a different team of task organizers. My role within MediaEval is to guide the process of running tasks, which involves providing feedback to task organizers and sending out the cues that keep the tasks running smoothly and on time. Today I did a quick count that revealed that during the 2013 season, I wrote 1529 personal emails containing the keyword "MediaEval".

What makes MediaEval work, however, cannot be expressed in numbers. Rather, it is the dedication and intensive effort of a large group of people, who propose and organize tasks and carry out the logistics that make the workshop come together. My motivation to continue MediaEval year after year stems largely from an underlying sense of awe at what these people do: both the work that I am aware of and also the many things they do behind the scenes that remain largely invisible. These people are the power behind MediaEval. Here I represent them with the picture above, which shows the power plugs arranged by Xavi Anguera from Telefonica with the assistance of Bart Thomee from Yahoo! Research. The process involved a combination of precision car driving and applied electrical engineering.

On the airplane back from Barcelona yesterday, I finished processing the responses we received from the participant surveys (collected during the workshop), the input from the organizers meeting (held on the Sunday after the workshop), and the feedback that people gave me verbally during ACM Multimedia (last week). These points are summarized below.

Thus endeth MediaEval 2013, but at the same time beginneth the season of MediaEval 2014. Hope to have you aboard.

Community Feedback from the MediaEval 2013 Multimedia Benchmarking Season + Workshop

The most important feedback point this year concerned the new structure of the workshop, which was very well received. This year the workshop was faster paced and we introduced poster sessions. We were happy that people liked the short talks and that the poster sessions were considered to be useful and productive. There is a clear preference for more discussion time at the workshop, both in the presentation sessions and in the poster sessions. An idea for the future is to separate passive poster time (posters are hanging and people can look at them, but the presenter need not be present) from active poster time (the presenter is standing at the poster).

The single most frequent request was for MediaEval to provide more detailed information. This request was made with respect to a range of areas: descriptions of the tasks should always strive to be maximally explicit; descriptions of the evaluation methods should be detailed and available in a timely manner; task overview talks at the workshop should contain examples and descriptions that allow a general audience (i.e., people who did not participate in the task) to understand the task easily.

Other suggestions were to increase consistency checks and to continue promoting industry involvement. Finally, there were requests for more time to prepare presentations and for groups to be explicitly invited (and supported) to present demos alongside their posters.

The organizers meeting on Sunday was the source of additional feedback. Task organization requires a huge amount of time and dedication from the organizing teams, and it is important that this effort is distributed as evenly as possible across the year and across people. In general, tasks would benefit from additional practical guidance on organization, including task management and evaluation methodologies. Since MediaEval is a decentralized system, the source of this guidance must be people with past experience in task organization and in communication between tasks. Here, the bi-weekly teleconferences for organizers are an important tool.

In the coming year, the awards and sponsorship committee can expect an expanded role. The outreach to early-career researchers and to researchers located outside of Europe (in the form of travel grants) is seen by the organizers not merely as a "nice-to-have", but rather as a central part of MediaEval's mission. There is solid consensus about the usefulness of the MediaEval Distinctive Mentions (MDMs). MDMs are peer-to-peer certificates awarded by task organizers to each other or to the participants of their tasks. The MDMs allow the community to send public messages between members of the community, and especially to point out participant submissions that are highly innovative or have particularly high potential (although they may not have been top scorers according to the official evaluation metric). It is important to make clear that the MediaEval Distinctive Mention is not an "award", since the process by which they are chosen is intentionally kept very informal. In the coming year, we will be investigating whether MediaEval should introduce a five-year impact award, which would be more formal in nature. The peer-to-peer MDMs will be maintained, although an effort will be made to make them increasingly transparent.

In general we were satisfied with the process used to produce the proceedings. Having groups do an online check of their metadata was helpful. If future years also involve proceedings with 50+ papers, we will need to further streamline the submission schedule, with the ultimate goal of having the proceedings online at the moment that the workshop opens.

Tuesday, October 22, 2013

CrowdMM 2013: Crowdsourcing in Multimedia Emerged or Emerging?

CrowdMM 2013, the 2nd International ACM Workshop on Crowdsourcing for Multimedia, was held in conjunction with ACM Multimedia 2013 on 22 October 2013. It is the second edition of the CrowdMM series, which I have previously written about here. This year it was organized by Wei-Ta Chu (National Chung Cheng University in Taiwan), Kuan-Ta Chen (Academia Sinica in Taiwan), and myself, with critical support from Tobias Hossfeld (University of Wuerzburg) and Wei Tsang Ooi (NUS). The workshop received support from two projects funded by the European Union, Qualinet and CUbRIK.

During the workshop, we had an interesting panel discussion on the topic "Crowdsourcing in Multimedia: Emerged or Emerging?" The members of the panel were Daniel Gatica-Perez from Idiap (who keynoted the workshop with a talk entitled "When the Crowd Watches the Crowd: Understanding Impressions in Online Conversational Video"), Tobias Hossfeld (who organized this year's Crowdsourcing for Multimedia Ideas Competition), and Mohammad Soleymani from Imperial College London (together we presented a tutorial on Crowdsourcing for Multimedia Research the day before). The image above shows the whiteboard on which I attempted to accumulate the main points raised by the audience and the panel members during the discussion. The purpose of this post is to give a summary of these points.

At the end of the panel, the panel together with the audience decided that crowdsourcing for multimedia has not yet reached its full potential, and therefore should be considered "emerging" rather than already "emerged". This conclusion was interesting in light of the fact that the panel discussion revealed many areas in which crowdsourcing represents an extension of previously existing practices, or stands to benefit from established techniques or theoretical frameworks. These factors are arguments that can be marshaled in support of the "emerged" perspective. However, in the end the arguments for "emerging" had the clear upper hand.

Because the ultimate conclusion was "emerging", i.e., that the field is still experiencing development, I decided to summarize the panel discussion not as a series of statements, but rather as a list of questions. Please note that this summary is from my personal perspective and may not exactly represent what was said by the panelists and the audience during the panel. Any discrepancies, I hope, rather than being bothersome, will provide seeds for future discussion.

Summary of the CrowdMM 2013 Panel Discussion
"Crowdsourcing for Multimedia: Emerged or Emerging"

Understanding: Any larger sense of purpose that can be shared in the crowdsourcing ecosystem could be valuable for increasing motivation and thereby quality. What else can we do to fight worker alienation? Why don't task askers ask the crowdworkers who they are? And vice versa?

Best Practices: There is no magic recipe for crowdsourcing for multimedia. Couldn't the research community be doing more to share task design, code, and data? Would that help? Factors that contribute to the success of crowdsourcing are watertight task design (test, test, test, test, test, test the task design before running a large-scale experiment), detailed examples or training sessions, inclusion of verification questions, and making workers aware of the larger meaning of their work. Do tasks have to be fun? Certainly, they should run smoothly so that crowdworkers can hit a state of "blissful productivity".
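
To make the point about verification questions concrete, here is a minimal sketch in Python of one common pattern: gold items with known answers are hidden among the real labeling items, and workers whose accuracy on the gold items falls below a threshold are filtered out. All item names, field names, and thresholds below are invented for illustration, not taken from any particular task.

```python
import random

# Hypothetical labeling items plus a small pool of "gold" verification
# items whose correct answers are known in advance (all names invented).
items = [{"id": i, "image": f"img_{i}.jpg"} for i in range(20)]
gold = [{"id": f"g{i}", "image": f"gold_{i}.jpg", "answer": "cat"} for i in range(4)]

def build_task_units(items, gold, per_unit=5):
    """Split the items into small task units and hide one gold question in each."""
    random.shuffle(items)
    units = []
    for start in range(0, len(items), per_unit):
        unit = items[start:start + per_unit] + [random.choice(gold)]
        random.shuffle(unit)
        units.append(unit)
    return units

def worker_passes(responses, gold_answers, threshold=0.75):
    """Accept a worker only if their accuracy on the gold questions meets the threshold."""
    checked = [(qid, ans) for qid, ans in responses.items() if qid in gold_answers]
    if not checked:
        return False
    correct = sum(ans == gold_answers[qid] for qid, ans in checked)
    return correct / len(checked) >= threshold

units = build_task_units(items, gold)
print(f"Built {len(units)} task units, each with one hidden verification question.")
gold_answers = {g["id"]: g["answer"] for g in gold}
responses = {"g0": "cat", "g2": "dog"}  # one correct, one wrong: 50% < 75%, so rejected
print("Worker accepted:", worker_passes(responses, gold_answers))
```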

History: Many of the issues of crowdsourcing are the same ones encountered when we carry out experiments in the lab. Do we make full use of this carry-over? Crowdsourcing experiments can be validated by corresponding experiments carried out in a lab environment. Do we do this often enough?

Markets: Many of the issues of crowdsourcing are related to economics, and in particular to the way that the laws of supply and demand operate in an economic market. Have we made use of theories and practices from this area?

Diversity: Why is crowdsourcing mostly used for image labeling by the community? What about other applications such as system design and test? What about techniques that combine human and conventional computing in online systems?

Reproducibility: Shouldn't reproducibility be the ultimate goal of crowdsourcing? Are we making the problem too simple in the cases where we struggle with reproducibility? Understanding the input of the crowd as being influenced by multiple dimensions can help us to better design crowdsourcing experiments that are highly replicable.

Reliability: Have we made use of reliability theory? How about test/retest reliability used in psychology?
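
As a small illustration of what test/retest reliability could look like in a crowdsourcing setting, the sketch below correlates two rounds of crowd ratings collected for the same items (for example, the same task re-run a few weeks apart). The ratings are made-up numbers; a high correlation would suggest that the task yields stable judgments.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Mean rating per item in two separate runs of the same task (hypothetical data).
round1 = [3.2, 4.1, 2.8, 4.6, 3.9, 2.1]
round2 = [3.4, 3.9, 2.9, 4.4, 4.1, 2.3]

print(f"Test/retest correlation: {pearson(round1, round2):.2f}")
```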

Uncertainty: Are we dealing with noisier data? Or has crowdsourcing actually allowed us to move to more realistic data? Human consensus is the upper limit of what we can derive from crowdsourcing, but does human consensus not in turn depend on how well the task has been described to crowdworkers? Why don't we do more to exploit the whole framework of probability theory?
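
To illustrate the idea of human consensus as an upper limit, here is a toy sketch (with invented labels) that computes a majority-vote consensus label per item and the fraction of workers who agree with it; items where even the majority is weak are exactly the ones where the task description, rather than worker effort, may be the limiting factor.

```python
from collections import Counter

# Hypothetical labels from five crowdworkers for each item (all data invented).
labels = {
    "item1": ["cat", "cat", "cat", "dog", "cat"],
    "item2": ["dog", "cat", "dog", "dog", "dog"],
    "item3": ["cat", "dog", "bird", "cat", "dog"],
}

for item, votes in labels.items():
    consensus, count = Counter(votes).most_common(1)[0]
    agreement = count / len(votes)
    print(f"{item}: consensus={consensus!r}, agreement={agreement:.0%}")
```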

Gamification: Will it solve the issues of how to effectively and appropriately incentivize the crowd? Should the research community be the ones to push forward gamification? (Do any of us realize how many people and resources it takes to make a really successful commercial game?)

Design: Aren't we forgetting about a lot of work that has been done in interaction design?

Education: Can we combine crowdsourcing systems with systems that help people learn skills that are useful in real life? In this way, crowdworkers would receive more from the system in exchange for their crowdwork than just money.

Cats: Labeling cats is not necessarily an easy task. Is a stuffed animal a cat? How about a kitten?