Thursday, July 28, 2011

It's complicated.

I just logged into the manuscript management system of a journal that will remain unnamed and was greeted by this information icon plus message.

Is this system really user friendly or is it exactly the opposite? The page goes on to state, "So please do look at the .... hints and warnings that we’ve put in the green box at the top of each page. They will save you time and ensure that your work is processed correctly."

I identify with this sentence. I am always telling people that they are probably not going to understand everything I am saying. It sounds like standard US English, but I speak fast, with a lot of unexpected vocabulary, a large dynamic range, obscure cultural references, and just enough of a regional accent. People who easily follow fast-paced Hollywood movies think they should understand me. But really, I tell them, the skills just might not transfer and it might not be your fault.

I started warning people after years of trying to slow down and choose simple vocabulary. I still try to do this in formal, large-group situations with people I don't know very well. However, there are just too many people who insist on speaking English with me: the other languages I speak are drying up and dying, and my English will go that way as well if I don't occasionally take it for a walk and deploy it in its full erratic richness. Somehow, I do really identify with this system: "Hey, World! I'm complicated. Deal with it."

And, like this system, I ensnarl myself in some sort of paradox of self-reference. If I don't speak in a way that will allow people to understand me -- how can I expect them to grasp that I want to communicate to them that I might be difficult to grasp? Will I ever succeed in motivating anyone to bring up the extra dose of patience and attention that it takes to follow me? Won't they be suspicious of me if they know I know that I am difficult to understand and also that I am not doing anything about it? Can I ever convince anyone that the effort is worth the payoff?

Likewise: Can we accept that the system is indeed being self-explanatory when it claims of itself that it is not self-explanatory? Is it worth the effort and patience it takes to cut through that knot? In the end it's a matter of trust. I smile at the message and the information icon and consider how to move forward.

In the end, I decide to whip off a blog post, sighing when the word "ensnarl" can't be typed without a red misspelling line under it, and return to my attempt to extract the manuscript I'm supposed to be reviewing from the system.

I now sympathize with the interface. The message has worked: I've agreed that it's ok to be complicated.

Wednesday, July 20, 2011

Google doesn't love me.

As an IR researcher, I tend to obsess about why Google can't always deal with my queries. I fall into this bad habit when I have a lot of better things to do with my time and even when I know it is not getting me anywhere.

Today I needed to recall the details of the relationship between Wikipedia and Wikimedia Commons so that I could get it just right for a text I was writing. I typed "relationship between wikipedia and wikimedia commons" into the Google search box and was rewarded with, as my top hit, the link in the image pictured at the right. The rest of the list was of the same ilk.

Oh, my gosh, why is Google reacting this way? This is not how I expected my evening to play out! Actually, "relationship" was the sort of information that I wanted, and "Wikipedia" and "Wikimedia Commons" were the two entities whose relationship I wanted to understand. Wasn't I making myself clear?

Is Google interpreting my named entities as the source of the information? Is Google trying to tell me something? Such as, I should be reading Wikipedia instead of writing about Wikipedia?

Is Google trying to gently point out to me that I should be doing more image search? Or maybe giving me subtle support for my opinion that there is a very fine line between navigational and transactional queries?

Is Google relating my query to religion in order to express support for my blog post last month on Search and Spirituality, which was written when I was in rather a strange mood? Does Google want to evoke in me again that vein of reflection?

But there is an alternative to this vein of inquiry: a simple, very plausible explanation. It runs closely along the lines of the now infamous "He's just not that into you." Google doesn't do what I would anticipate, or what would make me satisfied and happy, because Google simply doesn't love me. The evidence is there: my information needs remain unmet and my search goals unreached.

Google and I obviously have a relationship problem. It goes beyond my searches on "relationships". And yet, I keep on returning to that tempting search box again and again. Do I have some sort of genetic predilection for maintaining a dysfunctional relationship with my search engine? At least until I find an alternate outlet that does happen to be "into" me and my information needs.

In the meantime, I guess I can go and find myself a copy of the Gospel of Mark: I never realized that there was so much overlap.

Saturday, July 2, 2011

Crowdsourcing Best Practices: Twenty points to consider when designing a HIT

Your very first glance at worker responses on the very first task you crowdsource tells you that there are very different kinds of workers out there in the crowdsourcing-sphere, for example, on Mechanical Turk. Some of the responses are impressive in the level of dedication and insight that they reflect; others appear to flatly fail the Turing test.

It is also quite striking that there are different kinds of requesters. Turker Nation gives us insight into the differences between one requester and the next, some better behaved than others.

What is particularly interesting is the differences among requesters who are working in the area of crowdsourcing for information retrieval and related applications. One would maybe expect there to be some homogeneity or consensus here. At the moment, however, I am reviewing some papers involving crowdsourcing, and no one seems to be asking themselves the same questions that I ask myself when I design a crowdsourcing task.

It seems worthwhile to get my list of questions out of my head and into a form where other people can have a look at it. These questions are not going to make HIT (Human Intelligence Task) design any easier, but I do strongly feel that asking them should belong to crowdsourcing best practices. And if you do take time to reflect on these aspects, your HIT will in the end be better designed and more effective.
  1. How much agreement do I expect between workers? Is my HIT "mechanical" or is it possible that even co-operative workers will differ in opinion on the correct answer? Do I reassure my workers that I am setting them up for success by signaling to them that I am aware of the subjective component of my HIT and don't have unrealistic expectations that all workers will agree completely?
  2. Is the agreement between workers going to depend on workers' background experience (familiarity with certain topics, regions of the world, modes of thought)? Have I considered setting up a qualification HIT to do recruitment? Or have I signaled to workers what kind of background they need to be successful on the HIT?
  3. Have other people run similar HITs, and have I read their papers to avoid making the same mistakes again?
  4. Did I consider using existing quality control mechanisms, such as Amazon's Mechanical Turk Masters?
  5. Is the layout of my HIT 'polite'? Consider concrete details: Is it obvious that I did my best to minimize non-essential scrolling? But all in all: Does it look like I have ever actually spent time myself as a worker on the crowdsourcing platform that I am designing tasks for?
  6. Is the design of my HIT respectful? Experienced workers know that it is necessary for requesters to build in validation mechanisms to filter spurious responses. However, these should be well designed so that they are not tedious or insulting for conscientious workers who are highly engaged in the HIT: poorly designed checks are annoying and break the flow of work.
  7. Is it obvious to workers why I am running the HIT? Do the answers appear to have a serious, practical application?
  8. Is the title of my HIT interesting, informative and attractive?
  9. Did I consider how fast I need the HIT to run through when making decisions about award levels, and also about when I will be running the HIT (on the weekend)?
  10. Did I consider what my award level says about my HIT? High award levels can attract treasure seekers. However, award levels that are too low are bad for my reputation as a requester.
  11. Can I make workers feel invested in the larger goal? Have I informed workers that I am a non-profit research institution or otherwise explained (to the extent possible) what I am trying to achieve?
  12. Do I have time to respond to individual worker mails about my HIT? If no, then I should wait until I have time to monitor the HIT before starting it.
  13. Did I consider how the volume of HIT assignments that I am offering will impact the variety of workers that I attract? (Low-volume HITs attract workers who are less interested in rote tasks.)
  14. Did I give examples that illustrate what kind of answers I expect workers to return for the HIT? Good examples let workers concerned about their reputations judge in advance whether their work is likely to be rejected.
  15. Did I inform workers of the conditions under which they could be expected to earn a bonus for the HIT?
  16. Did I make an effort to make the HIT intellectually engaging in order to make it inherently as rewarding as possible to work on?
  17. Did I run a pilot task, especially one that asks workers for their opinions on how well my task is designed?
  18. Did I take a step back and look at my HIT with an eye to how it will enhance my reputation as a requester on the platform? Will it bring back repeat customers (i.e., people who have worked on my HITs before)?
  19. Did I consider the impact of my task on the overall ecosystem of the crowdsourcing platform? If I indiscriminately accept HITs without a responsible validation mechanism, I encourage workers to give spurious responses, since they have been reinforced in the strategy of attempting to earn awards while investing a minimum of effort.
  20. Did I consider the implications of my HIT for the overall development of crowdsourcing as an economic activity? Does my HIT support my own ethical position on the role of crowdsourcing (that we as requesters should work towards fair work conditions for workers and that they should ultimately be paid US minimum hourly wage for their work)? It's a complicated issue:
The workers on Mechanical Turk refer to themselves as "turkers". This act of self-naming signals a sense of community, a common understanding of the activity that they are all engaged in.

What do we as requesters call ourselves? Do we have a sense of community, too? Do we enjoy the strength that derives from a shared sense of purpose?

The classical image of Wolfgang von Kempelen's automaton, the original Mechanical Turk, is included above since I think it sheds some light on this issue. Looking at the image, we ask ourselves who should most appropriately be designated "turker"? Well, it's not the worker, who is the human in the machine. Rather, it is the figure who is dressed as an Ottoman and is operating the machine: if workers consider themselves turkers, then we the requesters must be turkers, too.

The more that we can foster the development of a common understanding of our mission, the more that we can pool our experience to design better HITs, the more effectively we can hope to improve information retrieval by using crowdsourcing.
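Questions 1, 2, and 19 above all touch on measuring worker agreement and using it responsibly in validation. As a minimal sketch (with made-up responses and a hypothetical flagging threshold, not a recommendation for any particular platform), one simple approach is to compute each worker's rate of agreement with the per-item majority vote:

```python
from collections import Counter, defaultdict

# Hypothetical worker responses: (worker_id, item_id, label) triples from a HIT.
responses = [
    ("w1", "item1", "yes"), ("w2", "item1", "yes"), ("w3", "item1", "no"),
    ("w1", "item2", "no"),  ("w2", "item2", "no"),  ("w3", "item2", "no"),
    ("w1", "item3", "yes"), ("w2", "item3", "yes"), ("w3", "item3", "no"),
]

# Collect labels per item and take the majority label.
by_item = defaultdict(list)
for worker, item, label in responses:
    by_item[item].append(label)
majority = {item: Counter(labels).most_common(1)[0][0]
            for item, labels in by_item.items()}

# Each worker's agreement with the majority: [matches, total answers].
tally = defaultdict(lambda: [0, 0])
for worker, item, label in responses:
    tally[worker][1] += 1
    if label == majority[item]:
        tally[worker][0] += 1
rates = {w: agree / total for w, (agree, total) in tally.items()}

# Flag workers far below the rest. On subjective HITs (question 1),
# keep the threshold lenient: honest workers can legitimately disagree.
flagged = [w for w, r in rates.items() if r < 0.5]
```

With these toy responses, w1 and w2 always agree with the majority while w3 falls below the threshold. The point of questions 6 and 19 is that such a check should be a safety net against clearly spurious responses, not a hair-trigger for rejecting conscientious work.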