Too Long; Did Not Read
Skip to the code samples below to see how to use Amazon’s Mechanical Turk to crowdsource small tasks very cheaply. You can automate as much as possible and outsource the remaining human tasks to a large crowd.
Source code is below. Also, a ridiculous sample is shown.
True Story Follows
So, as has been the case with my last few blog posts, I’m on the grind with Exercise Library. There’s a whole back story to it and a number of things where I’ve been using code to automate some of the tasks that need to be done.
Long story short, I re-filmed several hundred exercises, each from two different angles. I need to edit each “scene” which is composed of two shots, and each shot needs to be trimmed of dead space in the beginning and end. Moreover, not every video is framed very well, so I need to crop each video.
However, this is extremely tedious. Aside from the tasks above, I still need to manually load and render each set of videos. The rate at which I was editing was only about 10 videos per hour. For the 700 total exercises I needed to get done, this was going to take me 70 hours, or basically a week straight of just editing videos, assuming I had enough free time to begin with.
I looked online at sites like Fiverr to find people that might be willing to edit videos for really really cheap, but, alas, they basically don’t exist. However, I suddenly had an idea while my mind was wandering as I walked my two dogs, Bonethug and Snuggles.
I bet there was a site that allowed me to crowdsource really small tasks (Ah, this does exist. Amazon Mechanical Turk, among others). For each video, I can divide up all of the actions I need to take on the video into separate tasks: frame the video properly, find the actual start point, find the actual end point. I could pay a few cents for each action, then write a script that will actually execute all of those instructions on all of my videos using OpenCV.
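The last step of that plan could look roughly like the sketch below. It assumes the crowdsourced answers for one video have already been boiled down to a start time, an end time (both in seconds) and a crop rectangle in pixels; the function names, that instruction format and the `mp4v` codec are assumptions for illustration, not necessarily the script used for Exercise Library.

```python
def frame_range(start_sec, end_sec, fps):
    """Convert a trim window in seconds to (first_frame, last_frame_exclusive)."""
    if end_sec <= start_sec:
        raise ValueError("end must come after start")
    return int(round(start_sec * fps)), int(round(end_sec * fps))


def trim_and_crop(src, dst, start_sec, end_sec, crop):
    """Copy the frames between start_sec and end_sec, cropped to (x, y, w, h)."""
    import cv2  # imported lazily so frame_range() works without OpenCV installed

    x, y, w, h = crop
    cap = cv2.VideoCapture(src)
    fps = cap.get(cv2.CAP_PROP_FPS)
    first, last = frame_range(start_sec, end_sec, fps)
    cap.set(cv2.CAP_PROP_POS_FRAMES, first)  # jump to the real start point
    out = cv2.VideoWriter(dst, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    for _ in range(last - first):
        ok, frame = cap.read()
        if not ok:  # the file ended before the requested end point
            break
        out.write(frame[y:y + h, x:x + w])  # keep only the cropped region
    cap.release()
    out.release()
```

Something like `trim_and_crop("squat_front.mp4", "squat_front_edited.mp4", 1.5, 9.0, (100, 50, 1280, 720))` would then process one shot (the file names are made up). Note that OpenCV only writes video frames, so any audio track would have to be handled separately.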
But Scott, You’re Not Crazy Enough
I am crazy enough.
What is Mechanical Turk?
Mechanical Turk is a crowdsourcing marketplace hosted by Amazon. You can use it to have people take surveys, maybe describe a picture…basically anything that isn’t morally questionable. And by morally questionable, I mean driving traffic to your site or something. Labeling pornographic images is fine. Machine Learning. Read about it.
Without looking at other forums, the site would lead you to believe that you primarily use Amazon’s user interface to create tasks, but if you’re reading this blog, then you, like me, are far more interested in their API and doing as much as possible with programming.
What the Docs Don’t Tell You
Before going off on wild tangents while I explain code, and to help anyone who possibly stumbled here in the troubleshooting process: Amazon’s documentation is not very thorough, and it took some trial and error to get things working. Some helpful things to know if you’re using an external website to perform tasks:
- When POSTing back to Amazon to complete a Human Intelligence Task (HIT), you MUST make it straight from the client. Server side POSTs will be rejected.
- Amazon’s documentation tells you that you just need to post the “assignmentId”. This is false; you need the “hitId” and the “workerId” as well.
- Limiting a user to only one task is not a built-in feature. You need to add your own logic to enforce that rule.
- I believe that when you have a group of HITs, all HITs in that group will be locked as long as one person is viewing them. I could be wrong, but if true, that might slow down task execution.
I could break this out into code and a demo separately, but you sort of lose context, so let’s do it all at once.
Suppose I want to farm out some arbitrary task. It turns out that this task can be pretty much anything, but if you want that sort of power and flexibility, the way to do it is to create an “external question,” which basically means that an iFrame will load your separate website, and your website will POST back to Amazon.
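For reference, an external question is just a small XML document naming the page to load and the height of the iFrame. A minimal sketch of building one in Python follows; the helper name and the default frame height of 600 are assumptions for illustration.

```python
from xml.sax.saxutils import escape

# Schema that MTurk expects on the ExternalQuestion root element.
EXTERNAL_QUESTION_XSD = (
    "http://mechanicalturk.amazonaws.com/"
    "AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd"
)


def external_question_xml(url, frame_height=600):
    """Return the XML telling MTurk which page to load inside the iFrame."""
    return (
        '<ExternalQuestion xmlns="{xsd}">'
        "<ExternalURL>{url}</ExternalURL>"
        "<FrameHeight>{height}</FrameHeight>"
        "</ExternalQuestion>"
    ).format(xsd=EXTERNAL_QUESTION_XSD, url=escape(url), height=int(frame_height))


print(external_question_xml("https://mturk-demonstration.herokuapp.com/"))
```

The URL is escaped because a query string containing `&` would otherwise produce invalid XML.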
So let’s set up a really simple demo. There’s a development sandbox where you have an infinite budget, and then you can easily switch over to a production environment. I tested everything initially, but I’ll walk through the example in production.
Sign up on Mechanical Turk as a requester. There are tons of resources on the internets about making money online and signing up as a worker. But we’re the producers, the big ballers that drop mad nickels on the economy and make things happen, paying out $2 in just a few minutes distributed across dozens of people. So we’ll talk about that angle. You pretty much just authenticate with Amazon, ensure you have an Amazon Developer Console, and add some funds to your account. The same can be done in the Sandbox.
To paint a picture of what I’m doing, I basically crowdsourced an arbitrary task. In this case, I took some awkward and ridiculous pictures of my roommate from back at the Academy, and asked workers to describe the man. See below:
Let’s go ahead and get additional context by scrolling down further:
For the code, there are really two parts to this. In this case, for the sake of demonstration, below is a quick script that will generate some tasks. This is only usable because I’ve already written the server side code (bear with me).
Credits to this blog for the code sample that got me started.
All that’s happening here is that we’re creating Human Intelligence Tasks that simply redirect to my website: https://mturk-demonstration.herokuapp.com/. Also, there’s no guarantee that the link I just posted will live forever; I just threw together a quick Heroku app.
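A task-generating script along those lines might look like the following sketch. It assumes the boto3 library and AWS credentials configured in the usual way; the title, description, reward, lifetime and assignment count are made-up values, and `build_hit_request` and `create_hit` are hypothetical helper names.

```python
from xml.sax.saxutils import escape

SANDBOX_ENDPOINT = "https://mturk-requester-sandbox.us-east-1.amazonaws.com"
QUESTION_XSD = (
    "http://mechanicalturk.amazonaws.com/"
    "AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd"
)


def build_hit_request(url, reward="0.05", max_assignments=50):
    """Assemble the keyword arguments for boto3's create_hit call."""
    question = (
        '<ExternalQuestion xmlns="{xsd}"><ExternalURL>{url}</ExternalURL>'
        "<FrameHeight>600</FrameHeight></ExternalQuestion>"
    ).format(xsd=QUESTION_XSD, url=escape(url))
    return {
        "Title": "Describe the man in these pictures",  # assumed wording
        "Description": "Look at a few photos and describe the man in them.",
        "Keywords": "image, description",
        "Reward": reward,  # US dollars, passed as a string
        "MaxAssignments": max_assignments,  # how many workers may answer
        "LifetimeInSeconds": 24 * 60 * 60,  # how long the HIT stays listed
        "AssignmentDurationInSeconds": 10 * 60,  # time allowed per worker
        "Question": question,
    }


def create_hit(url, sandbox=True):
    """Create one HIT and return its ID (this performs a real API call)."""
    import boto3  # imported lazily; only needed when actually calling Amazon

    kwargs = {"region_name": "us-east-1"}
    if sandbox:
        kwargs["endpoint_url"] = SANDBOX_ENDPOINT
    client = boto3.client("mturk", **kwargs)
    response = client.create_hit(**build_hit_request(url))
    return response["HIT"]["HITId"]
```

Calling `create_hit("https://mturk-demonstration.herokuapp.com/")` would post to the sandbox; passing `sandbox=False` switches to production, where real money is spent.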
Server Side Code
Here’s my server side (Django) code:
A few things to note that I have commented:
- Host endpoints vary based on production/sandbox
- You’ll need the assignment ID, worker ID, and HIT ID in order to POST back to Amazon (despite what the docs say)
- Assignment ID will have the value “ASSIGNMENT_ID_NOT_AVAILABLE” if the worker is just viewing the task (it would be a good idea to hide the submit button or something so they don’t do unnecessary work and get frustrated)
- If your task happens to be like the one I described where it’s the same for everyone, you’ll want to guard against the same worker executing the same task repeatedly.
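The points above can be sketched as one framework-agnostic helper that a Django view could call with `request.GET`. This is a minimal sketch, not the code from the demo app: `task_context` and `seen_workers` (a set of worker IDs that have already answered, which in practice would come from a database) are hypothetical names.

```python
PREVIEW_ASSIGNMENT_ID = "ASSIGNMENT_ID_NOT_AVAILABLE"

# The form must be POSTed to a different host in sandbox and in production.
SUBMIT_URLS = {
    "sandbox": "https://workersandbox.mturk.com/mturk/externalSubmit",
    "production": "https://www.mturk.com/mturk/externalSubmit",
}


def task_context(query, seen_workers, sandbox=False):
    """Build template context from the query parameters MTurk appends to the URL."""
    assignment_id = query.get("assignmentId", PREVIEW_ASSIGNMENT_ID)
    worker_id = query.get("workerId", "")
    # A worker who is only previewing the task has no real assignment yet.
    is_preview = assignment_id == PREVIEW_ASSIGNMENT_ID
    return {
        "submit_url": SUBMIT_URLS["sandbox" if sandbox else "production"],
        "assignment_id": assignment_id,
        "hit_id": query.get("hitId", ""),
        "worker_id": worker_id,
        "is_preview": is_preview,  # the template can hide the submit button
        # Guard against the same worker doing the same task repeatedly.
        "already_done": (not is_preview) and worker_id in seen_workers,
    }
```

The view would render the task page with this context, show a "you have already done this one" message when `already_done` is true, and add the worker ID to the stored set once the worker submits.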
Client Side Code
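The client-side piece boils down to an HTML form that the worker’s browser submits directly to Amazon, carrying the three IDs as hidden fields. Here is a minimal sketch, rendered from Python to keep everything in one language; the `description` field name and the markup are assumptions for illustration, not the demo app’s actual template.

```python
from html import escape

FORM_TEMPLATE = """<form method="POST" action="{submit_url}">
  <input type="hidden" name="assignmentId" value="{assignment_id}">
  <input type="hidden" name="hitId" value="{hit_id}">
  <input type="hidden" name="workerId" value="{worker_id}">
  <textarea name="description"></textarea>
  <button type="submit"{disabled}>Submit</button>
</form>"""


def render_form(submit_url, assignment_id, hit_id, worker_id):
    """Render the form that the browser (not the server) POSTs to Amazon."""
    preview = assignment_id == "ASSIGNMENT_ID_NOT_AVAILABLE"
    return FORM_TEMPLATE.format(
        submit_url=escape(submit_url),
        assignment_id=escape(assignment_id),
        hit_id=escape(hit_id),
        worker_id=escape(worker_id),
        # Disable submission while the worker is only previewing the task.
        disabled=" disabled" if preview else "",
    )
```

Because the form’s `action` points at Amazon’s submit URL, the POST goes straight from the client, which is the behavior described in the troubleshooting notes earlier.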
Putting It All Together
When I post a task as a requester, I can see it has been created:
From Amazon, I can use their frustrating interface to see all of my batches which are counter-intuitively at 0, and then click the not-so-obvious “Manage HITs Individually” to get some actual data:
Then for whatever reason, I can only review tasks individually, which kind of defeats the purpose of crowdsourcing:
So we need some more code. One more sample for how to manipulate HITs with Python:
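Reviewing in bulk could look something like this sketch, assuming a boto3 MTurk client is passed in (for example `boto3.client("mturk", region_name="us-east-1")`). `parse_answers` and `approve_all` are hypothetical helper names, and approving everything blindly is a simplification.

```python
import xml.etree.ElementTree as ET


def parse_answers(answer_xml):
    """Turn MTurk's QuestionFormAnswers XML into {field_name: free_text}."""
    answers = {}
    for answer in ET.fromstring(answer_xml):
        # Strip the XML namespace so tags read "QuestionIdentifier", "FreeText".
        fields = {child.tag.split("}")[-1]: (child.text or "") for child in answer}
        answers[fields.get("QuestionIdentifier", "")] = fields.get("FreeText", "")
    return answers


def approve_all(client):
    """Print and approve submitted assignments on every HIT in the account."""
    for page in client.get_paginator("list_hits").paginate():
        for hit in page["HITs"]:
            # Note: this fetches only the first page of assignments per HIT.
            submitted = client.list_assignments_for_hit(
                HITId=hit["HITId"], AssignmentStatuses=["Submitted"]
            )
            for assignment in submitted["Assignments"]:
                print(assignment["WorkerId"], parse_answers(assignment["Answer"]))
                client.approve_assignment(AssignmentId=assignment["AssignmentId"])
```

Each form field posted back by the external page shows up as one answer, keyed by the field’s name.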
This post, of course, would not be complete without the results of the crowdsourcing. All in all, I spent $3. If that. I’m not sure if all the tasks got completed. But here are the responses that I got from all the workers for the above task:
- Eclectic. Mustachioed. Interesting. ‘Murican.
- The man in the picture is of average height and weight but muscular build from working out. He has dark hair and sometimes facial hair but not always. He has lived a life of adventure. He is not afraid to take risks and can be the life of the party but tires at the end of the day just like the rest of us. He interacts well with different people including children. He has many varied interests from cars to vacationing. He is confident and proud.
- Diverse. Looks like he knows how to have fun but also how to be serious. Soldier. Silly. Strong.
- Sociable, outgoing, and always busy/occupied
- This man is jolly type and hard worker. he enjoys the nature everthing.Children like him somuch because always like as freind to children.
- Outgoing and fun.
- 1. Suburb Perv
2. Phelps Wannabe
3. Cool Vet and Prince look-a-like
4. Career Day Vet
5. Leprechaun Pimp
6. 4 hrs 3 minutes and 30 seconds clean Guy
- This man knows how to live life. He is at one with himself and knows what it means to be a real man and a hero.
- A fun guy that enjoys having a good time.
- It’s like if the “Most Interesting Man in the World” became a hipster that tries too hard
- This man is looking handsome. He is brave and honest. He looks like an army soldier.
- Active and sense of humor.
- Looks like he wants to promote bikini, so he approaches a manufacturer to give him a chance to promote their bikini but he refuses and instead offer him to promote his company in school and mall with different type of costumes.End of the day he is exhausted and taking a nap on the table.
- Good handsome guy serving his nation with pride.
- Hardworking solider who enjoys his free time.
- The man is athletic, outgoing, a patroit, and enjoys vintage automobiles.
- outgoing, a soldier, doesn’t take himself to seriously, good sense of humor
- Goofy, oddball, patriot, soldier, friendly, off beat, unique, strange, funny.
- good look
- Adventurous extrovert with well rounded social life.
- This person appears to be someone who likes to have fun and doesn’t take himself too seriously. However, he also seems to have a serious side that is dedicated to the causes in which he believes.
- he is a man of many disguises, seems to have a sense of humor and likes to stand out in a crowd.
- Sleepy. Furry. Half-naked. Touching a sweet car.
- he seems to be a military man who wears what he wants to wear regardless of what people will think of him
- An active, social, healthy and handsome man
- Soldier. Silly. Strong. Knows how to be serious but how to have fun.
- super style
- CRAZY, JOKESTER WHO ENJOYS MAKING OTHERS LAUGH, WHILE STILL HAVING A SERIOUS SIDE AND SERVING OUR COUNTRY
- Soldier from the south in some interesting getup.
- Adventurous, funny, caring, selfless
- extrovert, social, funny, helpful, service oriented, stylish, hard working
- He is a sexy and versatile man!
- He is the all american type of guy.
- This dude is a soldier who makes time to talk to kids but definitely knows how to enjoy his down time to the fullest. Also, he gets sleepy sometimes.
- Very outgoing. Not afraid about what others might think of him.
- This man is part of the military. He is in good physical shape. These photos are not of the same man. The man in the green fuzzy suit is not the same man on the beach.
- He is a veteran of the armed forces. When he’s not serving our country he likes to have fun by restoring vintage cars, hanging out at the beach and going to parties. He is a charitable man who likes to help out with children at their school and help out at fundraisers. Sometimes he pushes himself too hard and gets tired.
- adventurous, fun
- Party animal, fun-loving, good person. Bumpkin, hard worker, fun.
- He seems like a ridiculous man. He does silly things and wears weird clothes.
- He seems fun,with a great personality and a humanitarian.
- funny, energetic, fun, happy, exciting, positive, partier
- This is Tim. He’s not afraid to take risks and enjoy life by posing in bizarre pictures. He stays out late and hits up the nightlife and is known for wearing odd clothes on occasion. Tim has an above-average respect for the armed forces. His active social life allows him to be in constant contact with the public.
- This man is responsible yet outgoing. He loves to have fun but he knows how to control himself.