iamelgringo

Tuesday, September 09, 2008

Mechanical Turk: Now with 25 percent more Awesome.

I just finished my first big project using Amazon's Mechanical Turk service. I'm in love.

I had a list of over 6,000 business names, addresses and URLs of dubious quality that I needed to verify. For a brief moment, I thought about checking them myself. But after doing several dozen by hand, I realized that I was violating the principle of Don't Be Silly(TM). So I was stuck: either I could use half-baked data as is, or shelve the project entirely.

Enter Mechanical Turk, and the hordes of awesomeness to save the day. I'd heard about Mechanical Turk several years ago, and I had been dying to use it. But I had never quite found 12,000 itches that I needed to scratch.


What it took to do the project

I needed the Turkers (preferred nomenclature for those who complete tasks on Mechanical Turk) to look up the business name and URL, correct any errors they found and submit any comments. To increase accuracy, I wanted each task completed twice.
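Having each task done twice only helps if the two answers get compared afterwards. The post doesn't say how that comparison was done, so what follows is only a minimal sketch of one possible approach: group the submissions by business, accept a record when both answers match after light normalization, and set the rest aside for a manual look. The `business_id` key and the field names are hypothetical and not taken from the original project.

```python
from collections import defaultdict


def normalize(value):
    """Lowercase and collapse whitespace so trivial differences don't count."""
    return " ".join(value.lower().split())


def reconcile(rows, key="business_id", fields=("name", "address", "url")):
    """Split submissions into agreed and disputed records.

    `rows` is an iterable of dicts, one per completed task. The key and
    field names are assumptions made for this example.
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key]].append(row)

    agreed, disputed = {}, {}
    for business_id, submissions in grouped.items():
        # Each submission becomes a tuple of normalized field values.
        answers = {tuple(normalize(s[f]) for f in fields) for s in submissions}
        if len(submissions) >= 2 and len(answers) == 1:
            agreed[business_id] = submissions[0]
        else:
            disputed[business_id] = submissions
    return agreed, disputed


rows = [
    {"business_id": "1", "name": "Acme Hardware", "address": "12 Main St", "url": "http://acme.example"},
    {"business_id": "1", "name": "acme  hardware", "address": "12 Main St", "url": "http://acme.example"},
    {"business_id": "2", "name": "Bob's Diner", "address": "4 Oak Ave", "url": "http://bobs.example"},
    {"business_id": "2", "name": "Bob's Diner", "address": "40 Oak Ave", "url": "http://bobs.example"},
]
agreed, disputed = reconcile(rows)
print(sorted(agreed))    # ['1']
print(sorted(disputed))  # ['2']
```

Exact matching after normalization is deliberately strict. It sends more records to manual review, but it never silently picks one Turker's answer over the other's.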

Here are the basic steps I took to create a batch job.

  • I clicked a few buttons on the Turk's interface to set how many times each task should be done, any requirements I wanted to enforce on the Turkers, and how much I was willing to pay a Turker for the task.
  • I then created a HIT, or Human Intelligence Task. "Human Intelligence Task" sounds a bit complicated, but it's really simple and straightforward: you design an HTML template with a form in it, based on the sample ones that Mechanical Turk gives you.
  • Next, I reviewed everything, uploaded a CSV file with the name, address and URL data needed to fill in the template, and hit submit.
  • I approved each task as it was completed. (You can also wait until the batch is finished and use the "Approve All" button.)
  • 5 days later, I downloaded my CSV file full of awesome.
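The template-plus-CSV step in the list above is essentially a mail merge: each row of the CSV fills the placeholders in one copy of the form. A rough local imitation of that merge, using Python's `string.Template`, might look like the sketch below. The form markup, the column names (`name`, `address`, `url`) and the `render_hits` helper are all assumptions for illustration; the real template was adapted from Mechanical Turk's own samples.

```python
import csv
import html
import io
from string import Template

# Hypothetical form template, not the one used in the original project.
HIT_TEMPLATE = Template("""\
<p>Please check this business and correct anything that is wrong.</p>
<form>
  <label>Name <input name="name" value="${name}"></label>
  <label>Address <input name="address" value="${address}"></label>
  <label>URL <input name="url" value="${url}"></label>
  <label>Comments <textarea name="comments"></textarea></label>
</form>
""")


def render_hits(csv_text):
    """Return one filled-in HTML form per data row of the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    forms = []
    for row in reader:
        # Escape values so names like "A & B Cafe" stay valid inside HTML.
        safe = {k: html.escape(v or "") for k, v in row.items() if k}
        forms.append(HIT_TEMPLATE.substitute(safe))
    return forms


sample_csv = "name,address,url\nAcme Hardware,12 Main St,http://acme.example\n"
print(render_hits(sample_csv)[0])
```

Rendering a handful of rows like this before uploading is a cheap way to spot a misnamed column or a value that breaks the form.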

I did run a test of 200 to 300 addresses through the system first to be sure the process worked. A few tweaks later, I loaded up the 6,204-line CSV file, sat back, and within minutes, hordes of Turkers were chipping away at my 12,408 tasks.


The Awesomeness

Now, before I started, I was concerned about the quality of work that I was going to get for $0.02 a task. I was half expecting a CSV file filled with unusable crap at the end of this, and I was ready to chalk it up to a lesson learned, and $300 wasted.

Instead, I was blown away by how people handled the task. People really took pride in their work, and were interested in helping me out. Several Turkers emailed asking for clarification on how I wanted the individual fields completed. One or two emailed me wishing me luck with my project.

One user pointed me to Turker Nation, a bulletin board where Turkers can bounce questions off each other. She had created a thread devoted to my task. So, I used the chance to talk to the people working on it and give them a bit of guidance as to exactly what I wanted.

Further examples of awesomeness were in the comments people submitted along with the information:

  • Can't find the exact web address for the hospital. Apologies for using a directory listing, in spite of it being the less desirable method.
  • The business you're looking for is a famous cancer institute setup in 1898 by Dr. Roswell park. And it happens to be America's first cancer institute.
  • I was unable to find a website, but went through the county government site to get the information. I checked multiple sites, since the address I found was different. They all listed the Airport Road address.

One person couldn't find a URL for the business for me, so they emailed me to let me know that they called the business long distance and verified the address for me.

Holy crap. People did all that for two cents a task.

Tips

Several Turkers mentioned that they appreciated working for a decent requester (me). Apparently, Turkers tend to get upset when requesters do things like treat them like crap, talk down to them in their form, or don't approve their work on a task that pays a measly two cents.

So, from my experience, I'd suggest a few tips when using the Mechanical Turk service.
  • Don't be a jerk. Approve people's work fer crying out loud. You're paying two cents to a nickel for people's time and effort. Reserve rejections for users who fill your forms with spam or commit other hostile acts.
  • Appeal to people's kindness. Most people really aren't doing this for the cash, although I'm sure some are. If people feel like they're helping someone out, or doing something for a good cause, they're going to feel good about themselves as well as earn two cents a task. Turkers really want the requesters to succeed.
  • Be clear about what you need. I read a number of threads on Turker Nation where people were frustrated that they couldn't figure out what the requester wanted. Decreasing people's pain level in completing your task means that more people will be interested in working for you.
  • Be thankful. I was really, truly humbled by how many people helped me complete my task. I was also overwhelmed by how many people wished me luck on my new business venture. So, I made a point to express my gratitude in the HIT as well as on Turker Nation. People really responded to that.

Finally

The whole project of verifying a list of 6,204 business names, addresses and URLs was completed in under 5 days at a cost of $300. I'm thinking of all the different ways in which I can make use of the service.
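For anyone budgeting a similar job, the numbers in this post can be checked with a few lines of arithmetic. This is a back-of-the-envelope sketch using only figures stated above (6,204 rows, two completions each, two cents per task, about $300 in total). The post does not break down the remainder, so treating it as fees plus the test batch is an assumption.

```python
ROWS = 6204                   # lines in the CSV
COMPLETIONS_PER_ROW = 2       # each task was done twice
REWARD_CENTS = 2              # paid to the Turker per task
TOTAL_SPENT_CENTS = 300 * 100 # the roughly $300 reported

tasks = ROWS * COMPLETIONS_PER_ROW
reward_cents = tasks * REWARD_CENTS
# Whatever is left after rewards; presumably service fees and the test
# batch, but that split is an assumption, not something the post states.
other_cents = TOTAL_SPENT_CENTS - reward_cents

print(f"tasks: {tasks}")                          # tasks: 12408
print(f"rewards: ${reward_cents / 100:.2f}")      # rewards: $248.16
print(f"everything else: ${other_cents / 100:.2f}")  # everything else: $51.84
```

Working in whole cents keeps the sums exact and avoids floating-point rounding until the final display.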

Seriously, it's cool. You should check it out.