These days, most experimental consumer researchers (not to mention social psychologists, political scientists, behavioral economists, etc.) have replaced or supplemented student subject pools with Amazon Mechanical Turk (MTurk) workers to conduct their research. Using MTurk survey-takers for research is convenient, cheap, and quick. (It is also often exploitative, but that’s a different blog post).

Source: Zol Korabi by Rolo Natkys Flickr Licensed Under CC BY 2.0

It’s no wonder that tens of thousands of psychology studies are now run each year using MTurk workers as survey-takers. (When I searched MTurk while writing this blog post, 319 surveys were available for me to fill out as an MTurk worker, and I don’t even have the MTurk Master qualification!)

Given that a significant amount of social science research today relies on MTurk workers, it is certainly worth knowing more about these people. This information has a bearing on the validity of the research being conducted. There are many important questions to ask, such as:

  • Who are these people?
  • Are they answering surveys in a serious way?
  • Can we be confident that anything we find with this group actually applies to the broader population? In other words, are they representative?
  • How valid are inferences made from their answers? What things should we be concerned about?

Each of these questions is worth digging into. However, in this post, I want to consider a different question that I was very curious about: “Just how many survey-takers are there on Amazon Mechanical Turk?”

Why is this question important?

Drawing inspiration from a study I will discuss a bit later in this post, it’s helpful to think of the Amazon MTurk site as a pond, each researcher as a fisherman or fisherwoman, and each MTurk worker as a fish. Every time a researcher posts a study, they are casting a net to catch several hundred fish. They catch and release the fish, but in the process, they extract some “essence” out of the fish in the form of answers to survey questions, changing the fish from its pristine condition.

At the moment, nets are being cast in the MTurk pond thousands of times each week by (mostly) academics, and increasingly by professional pollsters. So whether there are enough fish, and whether we are all catching mostly the same fish, are significant questions.

A large pool of respondents has three advantages: (1) a greater chance of finding diverse characteristics, and a sufficient number of respondents with a particular characteristic, say, heavy TV watchers or hiking enthusiasts; (2) more “fresh,” uncontaminated respondents who will take the task of answering the survey seriously and do it with thought and honesty (i.e., people who haven't already answered the general self-control scale or the social and economic conservatism scale a dozen times); and (3) a greater chance of a sample that is representative of the population at large (although this is not guaranteed).

Straight from the horse’s mouth – The MTurk site?

The logical first place to look for the answer is the MTurk site itself. You would expect Amazon to provide this information, updated daily, for two reasons. First, this is crucial information for Amazon’s customers (the academic and commercial researchers) who are paying them dearly to use the site. Without question, publishing it would increase the quality of research done using the site. And second, Amazon already knows how many MTurk workers use the site, so it would be very easy (and cheap) for them to extract and publish the number. But unfortunately, they don't.

Their claim “Access more than 500,000 Workers from 190 countries” is pretty much meaningless because it is imprecise and hasn’t changed in years. Plus it’s not clear how many of these alleged half a million people actually take surveys or have come and gone from the site.

As an aside, it’s mind-blowing, and a real shame, how many reputable media outlets, such as the Guardian and the Wall Street Journal, and even research organizations like the Pew Research Center, lazily report this dodgy number. While this is not quite "fake news", it does seem like "lazy news" to me because it can distort the rest of the story that is being reported.

Not cool, Amazon and big media! Ok, so now we need other sources to find the number.

Stewart et al. study, Judgment and Decision Making, 2015.

Psychology professor Neil Stewart and a team of researchers from seven different labs spread across different countries have conducted the most comprehensive and impressive study (in terms of size and duration) that I could find on this question. They pooled the MTurk data collected by all their labs from January 2012 to March 2015 and applied a cool statistical method, capture-recapture, that ecologists normally use to figure out the population of a particular animal or bird in the wild. They concluded that the average researcher using MTurk has access to a pool of around 7,300 respondents at any given time. They also found that 26% of MTurk survey-takers “retire” every quarter and are replaced by new workers. Also noteworthy: their labs used mostly US-based respondents. I was struck by their comparison:

“Thus the population that the average laboratory can reach is only a few times larger than the active participant pool at a typical university (course credit pools tend to have quite high uptake), with a turnover rate that is not dissimilar to the coming and going of university students.” (p. 485)

So the MTurk survey-takers are like a university student subject pool on steroids!
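For readers curious about how counting fish by catching them twice works, here is a minimal Python sketch. To be clear, this is not the analysis from the Stewart et al. paper: it uses the simplest two-sample version of capture-recapture (Chapman's correction of the Lincoln-Petersen estimator), which assumes nobody joins or leaves the pond between the two studies. Their study had to deal with workers arriving and retiring, so their model is necessarily more elaborate. The function names and worker IDs below are made up for illustration. The second function just does the arithmetic on the 26% quarterly turnover figure.

```python
def chapman_estimate(first_sample: set, second_sample: set) -> float:
    """Estimate population size from two samples of worker IDs.

    Uses Chapman's bias-corrected Lincoln-Petersen estimator, which
    assumes a closed population (a simplification for MTurk).
    """
    n1 = len(first_sample)                  # workers "caught" in study 1
    n2 = len(second_sample)                 # workers "caught" in study 2
    m = len(first_sample & second_sample)   # workers caught in both
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


def fraction_remaining(quarterly_turnover: float, quarters: int) -> float:
    """Share of an initial cohort still active after some number of quarters,
    assuming the same fraction leaves every quarter."""
    return (1 - quarterly_turnover) ** quarters


# Hypothetical worker IDs: two studies of 400 workers each, 20 in common.
study_a = {f"W{i}" for i in range(0, 400)}
study_b = {f"W{i}" for i in range(380, 780)}

print(round(chapman_estimate(study_a, study_b)))   # estimated pool size
print(round(fraction_remaining(0.26, 4), 2))       # share left after one year
```

The intuition: the fewer repeat fish you see in the second net, the bigger the pond must be. And if 26% of workers really do leave every quarter, only about 30% of today's workers would still be around a year from now.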

Some of their other conclusions are equally thought-provoking, to say the least. Around 65% of their survey-takers participated in multiple studies conducted by the same lab (I got this value by eyeballing Figure 5 in their paper). Now it is very likely that these different studies were about different topics, but this is still not ideal. And just over 50% of participants took part in studies from multiple labs. This is the fish being caught, released, and caught again, and released again, and caught again, and on and on.

I could find two other estimates of older provenance and/or using what I thought were less reliable methods:

  • A 2011 study in Computational Linguistics by Karen Fort, Gilles Adda and K. Bretonnel Cohen estimated the number of survey-takers to be between 15,059 and 42,912. This was based on survey estimates from a set of MTurk workers and adjusted using expert opinion. Plus they included non-US respondents as well (a fair number of workers are from India).
  • An anonymous commenter on Reddit indicated that they were able to get 6,000 unique responders in one week by having no conditions or restrictions on who could participate in their study.

So what to make of these numbers?

To give some perspective, the online panel company YouGov claims that it has an online panel of 4 million panelists covering 37 countries, and a US panel of 1.8 million panelists. Similarly, SurveyMonkey claims to have an online panel of 30+ million people, and Slice Intelligence to have 4.2 million people in its online panel. Obviously, these are just claims, much like Amazon’s “Over 500,000…” claim, and it is impossible to verify any of them.

Coming back to Amazon MTurk, it seems to me that there are not that many fish and there are far too many nets cast into the MTurk pond. This is certainly not good news if you are a psychological researcher because a vast majority of the MTurk workers have taken dozens, if not hundreds, of other surveys within a short amount of time. It is quite likely that many of them are in the middle of a marathon survey-taking session. This can create all kinds of problems if the researcher is trying to produce a psychological manipulation like a mood induction, or a specific mindset, etc. (More about this in a future post).

Too many fisher folk, not enough fish.

So this is the image we are left with. Imagine a pond with about 10,000 fish swimming around at any given time. And then imagine tens of thousands of fishermen and fisherwomen lined up around the pond and casting their nets thousands of times every week, catching the fish, probing them, and releasing them back, catching, probing, then releasing, catching, probing, then releasing on and on. It’s no wonder a quarter of the fish retire every three months. They are likely burnt out from the multitudinous probing of the fisher folk. This is what using MTurk workers to do social science research is like circa early 2017. Not a pretty sight.

PS:

After I wrote this post, Kristy Milland (Twitter handle: @TurkerNational) sent me a link to a study she conducted (see her comments below). Over a six-week period in 2015, she got 25,293 unique US-based participants, and a total of 30,002 participants (most of the rest were from India), to complete her HIT. Note this includes all MTurk workers, not just survey-takers. The link to Kristy's study results can be found here.

About Me

I teach marketing and pricing to MBA students at Rice University. You can find more information about me on my website or follow me on LinkedIn, Facebook, or Twitter @ud.

You are reading

The Science Behind Behavior
