Intelligent Social Robots Must Have a "Theory of Mind"
Recognizing other minds is essential for intelligent social interaction.
Posted Nov 19, 2018
Corporate giants like Google, Facebook, and IBM are collectively investing billions of dollars in artificial intelligence (AI), bringing together some of the world’s brightest minds who claim that new techniques can create machines that think independently and creatively.
So with all this effort, money, and hype, where are the smart robots in our society? Where are the AI assistants, co-workers, and companions? Where are all the cool droids and humanoids that science fiction promised us?
In order to build AIs with human-like intelligence—AIs who can interact socially, who are able to work with us to achieve goals, and who are behaviorally and intellectually similar to beloved characters from Star Trek and Star Wars—we must first create one fundamental feature almost entirely missing from their current design. This feature is what cognitive scientists call a “theory of mind.”
Theory of mind refers to the ability to attribute mental states such as beliefs, desires, goals, and intentions to others, and to understand that these states are different from one’s own. Computers equipped with a theory of mind would recognize you as a conscious agent with a mental world of your own, rather than something purely mechanistic and inanimate.
A theory of mind makes it possible to understand emotions, infer intentions, and predict behavior. The ability to detect others’ minds is critical to human cognition and social interaction; it allows us to build and maintain relationships, communicate effectively, and work cooperatively to achieve common goals. In fact, research has shown that having a sophisticated theory of mind may be a large part of why humans have cognitive skills that seem infinitely more powerful than those of our genetically similar primate relatives. This ability is so important that when it is disrupted, as we see in some cases of autism, essential mental functions like language learning and imagination become impaired.
Recognizing other minds comes effortlessly for humans, but it is no easy task for a computer. We often forget that minds are not directly observable and are, objectively speaking, invisible. As the German mathematician and philosopher Gottfried Leibniz famously argued, if you could blow the brain up to the size of a mill and walk about inside, you would not find consciousness. It is a somewhat peculiar fact of nature that consciousness—despite the clarity and lucidity of our first-person sensory experience—is an intangible abstraction whose entire existence must be inferred.
As such, programming a computer to understand that an electrified piece of meat has a rich inner subjective world is a challenging computational task, to say the least. However, it’s not an impossible one.
But if having a theory of mind, or the ability to “mentalize,” as it is also called, is so important, then why hasn’t achieving this milestone been a central focus of the geniuses working in Silicon Valley or the biggest machine-learning companies around the world?
Don’t get me wrong: Today’s most famous AIs are extremely powerful and skilled machines that are able to “learn” in exciting and novel ways. We’ve been charmed by IBM’s Jeopardy champ Watson, who has since moved on to diagnosing cancer. Google’s AlphaGo recently defeated the world’s best Go player at a board game so complex that it has more possible positions than atoms in the known universe. And the animal-like, creepy-as-hell Boston Dynamics robots can run through forests and get up after being knocked down. Despite how impressive these computer programs are, none of them know you exist—they can’t even fake that they do. And that’s because techniques like deep learning aren’t sufficient for mentalizing. Today’s greatest AIs may be able to do some very sophisticated things, but they don’t have the basic features of a theory of mind.
This is because we haven’t got the faintest idea of how to make machines with minds like ours. Even Watson, the automated Jeopardy King, doesn’t have the intentionality of a fruit fly. And until neuroscientists understand the physical mechanisms underlying qualitative, subjective experience, theory of mind will likely remain an unsolved programming and engineering problem.
Without some sort of radical breakthrough in design, sentient machines will remain science fiction. It is highly improbable that a computer with a mind or a theory of mind will just suddenly appear due to increases in processing power and speed. Unfortunately for those working in machine learning, it seems that mind reading abilities need to be programmed the hard way.
While the task is admittedly difficult, the good news is that a small but diligent group of social robotics researchers has been chipping away at the problem for some time. Roboticist Brian Scassellati, who is also a professor of cognitive science and mechanical engineering at Yale, pioneered an approach that he laid out in his 2001 MIT dissertation, “Foundations for a Theory of Mind for a Humanoid Robot.” Scassellati and colleagues suggest that the best way to begin creating robots that can mentalize like humans is to mirror the development of theory of mind in children.
Both clinical and lab research shows that social interaction begins with the formation of basic mechanisms of attention. This is because the focus of attention parallels the focus of the mind. Because the direction of our gaze signals the location of attention, the orientation of our eyes can reveal our mental state.
With this in mind, it’s not surprising that children gain the ability to detect faces and eyes soon after birth: This is a crucial step in engaging another human socially. In a similar fashion, social robots need to automatically detect eye-like stimuli. By engaging the eyes, a robot can enter into an interaction with a human that it can learn from.
A robot with a theory of mind is useless if the human it serves doesn’t feel like they’re engaging with another conscious being. Because of this, eye contact is also a big deal: it informs the human that the robot recognizes them as an intentional agent. This means that the best robots must be able to display roughly the same attention mechanisms as humans.
Another important skill that emerges early in childhood is the ability to follow another’s eye gaze, which appears around the six-month mark. Gaze-following is essential to understanding the minds of others, as the space where another’s attention is directed holds information about what a person is pondering. Think about it: When we are walking through a shopping mall, our gaze is often directed toward things we find appealing or intriguing. Likewise, when we are out in the world and have a destination in mind, we first direct our gaze in the direction of that location.
Gaze-following is also required for a key function known to psychologists as joint attention. This refers to the shared focus of two individuals on one object and is established when someone visually follows the gaze of another to a specific point in space. Joint attention is crucial to social and language learning in humans and for completing any task that requires a shared focus. For example, if someone simply says to you, “What is that?” or “Go over there,” the words only make sense in relation to what that person is looking at. For this reason, robots designed to interact with humans, whether it’s to assist us physically or just to keep us company, should be able to establish joint attention.
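To make the idea concrete, here is a minimal sketch of how a robot might resolve joint attention geometrically. All names and the simple ray-versus-object geometry are my own illustration, not drawn from any particular robotics system: the robot casts a ray along the human’s estimated gaze direction and asks which known object lies closest to that ray.

```python
import math

def attended_object(eye_pos, gaze_dir, objects, max_angle_deg=10.0):
    """Return the name of the object the gaze ray points at most directly.

    eye_pos  -- (x, y, z) position of the human's eyes
    gaze_dir -- unit-length (x, y, z) gaze direction
    objects  -- dict mapping object name -> (x, y, z) position
    Returns None if nothing falls within max_angle_deg of the gaze ray.
    """
    best_name, best_angle = None, max_angle_deg
    for name, pos in objects.items():
        # Vector from the eyes to the candidate object.
        to_obj = tuple(p - e for p, e in zip(pos, eye_pos))
        norm = math.sqrt(sum(c * c for c in to_obj))
        if norm == 0:
            continue
        # Angle between the gaze direction and that vector.
        cosine = sum(g * c for g, c in zip(gaze_dir, to_obj)) / norm
        angle = math.degrees(math.acos(max(-1.0, min(1.0, cosine))))
        if angle < best_angle:
            best_name, best_angle = name, angle
    return best_name

# The human looks along the +x axis; the cup sits there, the book does not.
objects = {"cup": (1.0, 0.0, 0.0), "book": (0.0, 1.0, 0.0)}
print(attended_object((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), objects))
```

Once the robot knows which object the gaze singles out, it can direct its own cameras there—establishing the shared focus that phrases like “What is that?” depend on.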
While these basic social cues and others such as pointing gestures and head nods are critical to the foundation of a theory of mind, equally important is the ability to recognize basic emotional expressions. Not only are such expressions direct indicators of another’s emotional state, but when they are combined with gaze information, the result can be quite revealing. It is conceivable that an observant robot could create a mental model of a human over time—including information about their desires, dislikes, and fears—if it continuously catalogs the emotions being expressed when someone’s gaze is directed at certain objects, scenes, or other people.
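As a toy illustration of that cataloging idea—with class and method names invented for this sketch, not taken from any published system—a robot could simply tally which emotions a person expresses while gazing at each target, then read preferences off the tallies:

```python
from collections import Counter, defaultdict

class MentalModel:
    """Toy catalog pairing a person's gaze targets with expressed emotions."""

    def __init__(self):
        # gaze target -> Counter of emotions observed while looking at it
        self.observations = defaultdict(Counter)

    def observe(self, gaze_target, emotion):
        """Record one (gaze target, facial expression) pairing."""
        self.observations[gaze_target][emotion] += 1

    def dominant_emotion(self, gaze_target):
        """The emotion most often shown toward this target, or None."""
        counts = self.observations.get(gaze_target)
        if not counts:
            return None
        return counts.most_common(1)[0][0]

    def likes(self):
        """Targets whose dominant recorded emotion is joy."""
        return [t for t in self.observations
                if self.dominant_emotion(t) == "joy"]

model = MentalModel()
model.observe("dog", "joy")
model.observe("dog", "joy")
model.observe("spider", "fear")
print(model.likes())                     # inferred desires
print(model.dominant_emotion("spider"))  # inferred fear
```

A real system would need robust gaze estimation and emotion recognition to feed such a model, but the bookkeeping itself—accumulating emotion-at-target observations over time—is as simple as this.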
While creating a robot with a complete theory of mind on par with that of the average adult human would require much more than the above, these skills provide a firm foundation on which to build. One example of advancements in this field is Kismet, a moderately anthropomorphized humanoid robot designed for social interaction with humans. Kismet can speak, recognize faces, purposefully direct its eye gaze, and display some pretty adorable emotional facial expressions. Although it can imitate certain human movements and some social cues, its primitive attention system fails to follow gaze and establish joint attention. While Kismet is cleverly designed to compensate for its theory-of-mind shortcomings, its limitations are obvious. Still, the robot is proof that the blueprint for social robots with mentalizing abilities exists and is awaiting its evolution.
Megacorporations eager to integrate artificial intelligence into society should pay attention. A computer without a basic theory of mind is doomed to perform grunt work, as it can never effectively engage humans or work cooperatively. On the other hand, robots with the appearance of a mind are more functional from a user experience standpoint—science clearly shows that we are more willing to interact with, and are more trusting of, robots that look and behave like humans.
Society could benefit greatly from this technology. For example, socially assistive robots could care for the sick and elderly and help foster social, emotional, and cognitive growth in children. By being our companions, robots with a theory of mind could comfort us when no one else is around to do so. On a deeper level, they could enhance our spirituality by forcing us to ask what it means to be sentient beings in a physical world. With such tremendous potential, it seems to me that we should all be welcoming the social AI revolution with open arms.
This article was originally published at Quartz.