What a “Minimal Turing Test” Says About Humans
What word would you say to convince someone you're not a robot?
Posted Sep 21, 2018
In 1950, the computer scientist Alan Turing asked “Can machines think?” and proposed a test: Could a computer convincingly imitate a human in written conversation? Now two cognitive scientists have proposed a simplified version of the test, not to challenge artificial intelligence but to explore what we humans think makes us special.
In their “Minimal Turing Test,” people and machines get only one word to convince a human judge that they’re alive. What would you say? They conducted an online survey, described in the November issue of the Journal of Experimental Social Psychology, which I also covered for Science. About a thousand participants offered four hundred different words, with the most common being love (14%), compassion (3.5%), human (3.2%), and please (2.7%). The others fell into the categories of affect (e.g., happiness), faith and forgiveness (Jesus), food (banana), robots and animals (dog), life and death (family), and bodily functions and profanities (penis).
See below for a chart of words used more than once, with circle size indicating popularity. Color indicates category. Position indicates a word’s “embedding,” an algorithmic measure of its meaning such that similar words are near each other.
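For readers curious what "near each other" means in practice, here is a minimal sketch of comparing word embeddings with cosine similarity. The three-dimensional vectors are invented purely for illustration. Real embeddings have hundreds of dimensions, and the article does not say which embedding method the researchers used.

```python
import math

# Hypothetical toy embeddings, made up for this example only.
toy_vectors = {
    "love": [0.9, 0.1, 0.0],
    "compassion": [0.8, 0.2, 0.1],
    "banana": [0.0, 0.1, 0.9],
}

def cosine_similarity(u, v):
    """Return the cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Words with similar meanings should score closer to 1.
sim_close = cosine_similarity(toy_vectors["love"], toy_vectors["compassion"])
sim_far = cosine_similarity(toy_vectors["love"], toy_vectors["banana"])
print(sim_close > sim_far)
```

With these made-up numbers, love lands much closer to compassion than to banana, which is the kind of relationship the chart's layout is meant to convey.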
Forty-seven percent of people offered a word related to the mind. Of those, 15% named one related to thinking and doing (such as judgement) and 85% named one related to sensing and feeling (such as grief). People seem to believe that computers are smart but would have little use for words describing subjective experience. (Previous research on the “uncanny valley of the mind” shows that when computers do talk about sensation and feelings, it feels creepy.)
How effective are these choices? The researchers took the top word from each category: love, please, mercy, compassion, empathy, banana, alive, human, robot, and poop. Two thousand online participants each saw a random pairing and guessed which was provided by a human (though both were). Love aside, there was no correlation between a word’s popularity in the first task and its convincingness in the second, indicating submitters’ failure to predict how words will be received. The winningest word was poop. In the figure below, percentages indicate how often a row word beats a column word.
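A table of "row beats column" percentages like that one can be tallied from individual judgments. The sketch below is one simple way to do it, not the researchers' actual analysis code, and the judgments in it are fabricated for illustration rather than taken from the study.

```python
from collections import Counter

def win_rates(judgments):
    """Map (row_word, column_word) to the share of matchups the row word won.

    judgments is an iterable of (chosen, rejected) pairs, one per judge.
    """
    wins = Counter()
    for chosen, rejected in judgments:
        wins[(chosen, rejected)] += 1

    rates = {}
    for a, b in list(wins):
        total = wins[(a, b)] + wins[(b, a)]  # a missing pair counts as 0
        rates[(a, b)] = wins[(a, b)] / total
        rates[(b, a)] = wins[(b, a)] / total
    return rates

# Made-up judgments, for illustration only.
example_judgments = (
    [("poop", "robot")] * 3
    + [("robot", "poop")] * 1
    + [("love", "banana")] * 2
)
rates = win_rates(example_judgments)
print(rates[("poop", "robot")])
```

Each pairing is counted in both directions, so a word that never won a given matchup still shows up with a rate of zero.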
The researchers—John McCoy, now at the University of Pennsylvania, and Tomer Ullman, at MIT—write that if they’d included more words in the second task that evoke emotions rather than merely describe them, such as profanities, those words might have been judged human too. Would silicon suspect the visceral disgust some people feel at the word moist? (It will after reading this article.)
Some fun words offered by lone participants in the first task: smurf, smegma, ginormous, yolo, noob, oops, lol, omg, frienemie, coexist, hitler. Some really caught the moment: captcha, terminator, huh?, f*ck off. When asked whether that last entry was really one word, McCoy said, “As meta-judges of this whole process, we decided to allow it, since it seemed like an appropriate reaction.”
The researchers believe their test highlights people’s intuitions about what separates humans from machines, and that it could be used to test other stereotypes. What word do people think a woman or an old person would say? But interpretation is complicated by the fact that respondents must think about how other people will think that they think.
I told the researchers their test seems like a particularly noisy way of asking what qualities supposedly separate humans from machines, given that responses are filtered through recursive mental modeling and other processes. Couldn’t they just ask people to name a uniquely human attribute or concern? McCoy said it’s “not actually that obvious” how best to elicit such judgments. They suspect that “the competitive pressure of asking the question as we do will cause some people to communicate deeper, non-obvious attributes that separate humans and machines” (like bootylicious) “since obvious attributes may lead to defeat by a smart robot.”
Indeed, some felt the competitive pressure. In the second task, Ullman told me, one participant commented, “Man, this was really hard. I felt like I was in some short Asimov story!” The researchers looked at the word pairing this person had seen: robot and human.
McCoy, J. P., & Ullman, T. D. (2018). A Minimal Turing Test. Journal of Experimental Social Psychology, 79, 1-8.