Showing posts with label multimedia search engines. Show all posts

Thursday, October 20, 2011

Deep Link to Delft Technology Fellowship

Being educated in the US and being a scientist in Europe is sometimes quite tough. I need to continuously use a sort of filter that tells me that although I am hearing X, I need to pause, consider carefully, and realize that the person is really saying Y. One particularly painful example was unfortunately provided by our rector magnificus, the president of our university, in a recent interview. In promoting a new program to attract female scientists to the TU Delft, he said '...vrouwelijke wetenschappers zijn minstens zo talentvol als mannelijke wetenschappers', which translates into English as 'female scientists are at least as talented as their male counterparts'. Ouch.

This statement does not work in the US academic context, because it fails gender symmetry. Gender symmetry can be diagnosed with the following test: flip the polarity of gender terms (e.g., 'woman', 'man', 'male', 'female') in a statement, and determine whether the resulting statement retains meaning within the context.

Let's try it. Flipping polarity of gender terms in his sentence yields, '...male scientists are at least as talented as their female counterparts'. This sentence is clearly interpretable, but no longer has a meaning that fits the context.

Contrast that with an alternative sentence such as: 'There is no discrepancy in talent between male and female scientists'. This sentence has the same declarative content, but it passes the gender symmetry test because you can substitute it with 'There is no discrepancy in talent between female and male scientists' and the meaning still fits the context.

Of course, in this case, a further problem arises. This sentence has the implicature that there is some reason for which this fact needs to be asserted in the first place. The act of pronouncing this sentence communicates that the speaker does not consider the point to be completely obvious, but rather feels that it needs to be explicitly asserted. One might choose against even this alternative sentence in order to avoid sending the message that one feels that there is someone out there that still needs to be convinced on the point of talent equivalence between male and female scientists. But on the whole, this alternative could be considered the 'best practices' formulation, should one indeed find oneself in a situation where it was necessary to make a statement comparing the relative scientific talent of men and women.

What my filter tells me is that although X was said in this case, what was meant is Y. And concerning Y, I rather suspect that our rector magnificus harbors the personal opinion that women have perhaps even a teensy bit more science talent than men and that in fact he is saying, "at least as (if not more) qualified". Whether or not that is true, it's safe to say that he is of the opinion that our university would, at this point in time, benefit from hiring additional women.

One of the research topics that I am interested in as a multimedia retrieval scientist is developing algorithms for the retrieval of jump-in points (JIPs) in video. JIPs allow the viewer to click directly to a certain relevant point in a video. On YouTube, they are called deep links. JIPs make it possible to share or to comment on particular points of a video, just as I am currently doing with this post. The deep link to the relevant section of the interview under discussion is the following:

http://youtu.be/wvto6MWXE6k?t=35s
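
For readers who want to build such links themselves, here is a minimal sketch in Python. It only imitates the pattern visible in the link above (a video identifier followed by a `t` parameter giving the offset in seconds). The function name `deep_link` is my own invention for illustration, and the sketch assumes that other videos accept the same pattern; it is not an official API.

```python
from urllib.parse import urlencode


def deep_link(video_id: str, offset_seconds: int) -> str:
    """Build a jump-in point link of the form used in this post.

    Hypothetical helper: it assumes the 'youtu.be/<id>?t=<n>s' pattern
    seen in the link above also holds for other videos.
    """
    if offset_seconds < 0:
        raise ValueError("offset_seconds must be non-negative")
    # Encode the time offset as a query parameter, e.g. t=35s
    query = urlencode({"t": f"{offset_seconds}s"})
    return f"http://youtu.be/{video_id}?{query}"


# Reproduces the link to the interview discussed in this post.
print(deep_link("wvto6MWXE6k", 35))
```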

The current status of technology on the Web is that it is possible to comment on JIPs or share them, but search engines don't return them as results. Together with colleagues within the Netherlands and across Europe, I am developing and helping to promote the development of JIP retrieval in the MediaEval Rich Speech Retrieval task (see the feature on MediaEval 2011 in MMRecords for a brief description). Such technology would allow search engines to return pointers to specific time points within video that are relevant to user queries.

At the end of the day, I am more interested in the scientific questions raised by the task of JIP multimedia retrieval than I am in the gender issue. Since grade school, I have frequently been the "only girl" involved in whatever activity fascinated me. You don't know it any other way, so you don't really notice. I contribute what I can to the discourse on promoting gender balance, not so much because of myself, but because I find it wasteful if I feel that women whom I am mentoring are somehow holding themselves back.

When I first came to Delft, I contributed the following comment on improving the working climate at the Faculty of Electrical Engineering, Mathematics and Computer Science (EEMCS). This is the point of view that I still stand by so I include it here to complete my comment on the deep link.

Response on the 2009 Challenging Gender survey
The way of improving the working climate at EEMCS would be to address the gender imbalance within a larger program of promoting diversity into the Faculty of EEMCS. A faculty that includes international scientists addressing multi- and trans-disciplinary questions is automatically going to be more comfortable for women, since gender differences become just one of many differences of background and perspective that make the faculty richer and more productive.

Any effort invested in promoting inclusion of scientists/researchers that have pursued non-traditional career tracks (e.g., completing their PhD at an older or younger age, taking time off, switching disciplines mid-career) will automatically make women feel more welcome. When women feel welcome, they will also feel confident that the effort that they invest will be rewarded by a long and productive career in the EEMCS, establishing a virtuous cycle.

Everyone benefits from the promotion of diversity. For example, in this kind of climate, a researcher who has worked in the faculty for years will feel more comfortable about taking the risk of investigating a new class of algorithms or applying expertise accumulated in one domain to solving a problem in a radically different domain.

Positive side-effect: If everyone benefits, then women will not be burdened by the (perceived) need to fight the prejudice that they have been hired due to their gender and not due to their competence.

By promoting diversity, both in terms of scientific expertise and also in terms of other characteristics (cultural, religious, linguistic, socio-economic, sexual orientation as well as gender), the faculty will draw on a larger pool of talent and increase its productivity and capacity for creation and invention.

Working at TU-Delft, you see "Challenge the future" written everywhere...sometimes in unexpected places. As a woman this speaks to me in a special way: it says that the future at the TU-Delft is not set up to be a carbon copy of the past. Because of the "challenge the future" attitude, I have confidence that the demographics of my department will shift naturally as the Faculty of EEMCS continues to mature, extend and innovate scientifically.

Saturday, August 13, 2011

Subjectivity vs. Objectivity in Multimedia Indexing

In the field of multimedia, we spend so much time in discussions about semantic annotations (such as tags, or concept labels used for automatic concept detection) and whether they are objective or subjective. Usually the discourse runs along the lines of "Objective metadata is worth our effort, subjective metadata is too personal to either predict or be useful." Somehow the underlying assumption in these discussions is that we all have access to an a priori understanding of the distinction between "subjective" and "objective" and that this distinction is of some specific relevance to our field of research.

My position is that, as engineers building multimedia search engines, if we want to distinguish between subjective and objective we should do so using a model. We should avoid listening to our individual gut feelings on the issue (or wasting time talking about them). Instead, we should adopt the more modern notion of "human computational relevance" which, since the rise of crowdsourcing, has come within conceivable reach.

The underlying model is simple: Given a definition of a demographic that can be used to select a set of human subjects and a definition of a functional context in the real world inhabited by those subjects, the level of subjectivity or objectivity of an individual label is defined as the percentage of human subjects who would say "yes, that label belongs with that multimedia item". The model can be visualized as follows:

Fig. 1: The relevance of a tag to an object is defined as the proportion of human subjects (pictured as circles) within a real-world functional context and drawn from a well-defined demographic that agree on a tag. I claim that this is the only notion of the objective/subjective distinction relevant for our work in developing multimedia search engines.

Under this view of the world, the distinction between subjective and objective reduces to the inter-annotator agreement under controlled conditions. I maintain that the level of inter-annotator agreement will also reflect the usefulness that the tag will have when deployed within a multimedia search engine designed for use by the people in the demographic within the domain defined by the functional context. If we want to assimilate personalized multimedia search into this picture we can define it within a functional context for a demographic consisting only of one person.
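
To make the model concrete, here is a minimal sketch in Python of how the proportion could be computed from a set of collected yes/no judgments. Everything in it is an assumption made for illustration: the `Judgment` record, the `relevance` function, the representation of a demographic as a set of subject identifiers, and the example data are hypothetical, and the sketch ignores the hard part, which is collecting judgments that really come from the intended demographic and functional context.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Judgment:
    """One subject's answer to 'does this label belong with this item?'"""
    subject_id: str  # who answered
    context: str     # functional context in which the question was posed
    item_id: str     # the multimedia item
    label: str       # the tag or concept label being judged
    agrees: bool     # True if the subject said "yes"


def relevance(judgments: Iterable[Judgment], demographic: Set[str],
              context: str, item_id: str, label: str) -> Optional[float]:
    """Proportion of subjects in the demographic who agree with the label.

    Returns None when no matching judgments are available.
    """
    votes = [
        j.agrees for j in judgments
        if j.subject_id in demographic
        and j.context == context
        and j.item_id == item_id
        and j.label == label
    ]
    if not votes:
        return None
    return sum(votes) / len(votes)


# Invented example data: four subjects judge the label "beach" for one image.
judgments = [
    Judgment("s1", "holiday photo search", "img42", "beach", True),
    Judgment("s2", "holiday photo search", "img42", "beach", True),
    Judgment("s3", "holiday photo search", "img42", "beach", True),
    Judgment("s4", "holiday photo search", "img42", "beach", False),
]

everyone = {"s1", "s2", "s3", "s4"}
print(relevance(judgments, everyone, "holiday photo search", "img42", "beach"))  # 0.75

# Personalized search: a demographic consisting of only one person.
print(relevance(judgments, {"s4"}, "holiday photo search", "img42", "beach"))  # 0.0
```

Treating the demographic as an explicit set of subjects keeps the personalized case from needing any special handling: it is simply the same computation over a set of size one.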

This model reduces the subjective/objective difference to an estimation of the utility of a particular annotation within the system. The discussions we should be spending our time on are the ones about how to tackle the daunting task of implementing this model so as to generate reliable estimates of human computational relevance.
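
One small piece of the reliability question can be illustrated with standard statistics. The sketch below computes a Wilson score interval for the proportion of "yes" answers, which shows how uncertain an estimate remains when only a few subjects have been asked. This is a textbook technique that I am bringing in for illustration, not part of the model itself; the function name `wilson_interval` and the default of z = 1.96 (roughly a 95% interval) are my own choices, and the interval assumes independent answers, which a real worker pool may not provide.

```python
import math
from typing import Tuple


def wilson_interval(yes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion of 'yes' answers.

    yes: number of subjects who agreed with the label
    n:   total number of subjects asked
    z:   normal quantile (1.96 corresponds to roughly 95% coverage)
    """
    if n <= 0 or not 0 <= yes <= n:
        raise ValueError("need n > 0 and 0 <= yes <= n")
    p = yes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot at the edges.
    return max(0.0, center - half), min(1.0, center + half)


# The same observed proportion (75% "yes") is far less certain with 4 subjects
# than with 100 subjects: the first interval is much wider than the second.
print(wilson_interval(3, 4))
print(wilson_interval(75, 100))
```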

As mentioned above, the model is intended to be implemented on a crowdsourcing platform that will produce an estimate of the relevance of each label for each multimedia item. I am as deeply involved as I am with crowdsourcing HIT design because I am trying to find a principled manner to constrain worker pools with regard to demographic specifications and with regard to the specifications of a real-world function for multimedia objects. At the same time, we need useful estimators of the extent to which the worker pool deviates from the idealized conditions.

These are daunting tasks and will, without doubt, require well-motivated simplifications of the model. It should be clear that I don't claim that the model makes things suddenly 'easy'. However, it is clearly a more principled manner of moving forward than debate on the subjectivity vs. objectivity difference.
