In the Beginning There Were No Words

A new perspective on the evolution of language.

Key points

  • Words had to evolve before a language, but the origin of words is a mystery.
  • An infant’s ability to learn words depends on her non-verbal relations with her mother.
  • Our ancestors’ practice of raising infants with extended families was crucial for learning language.

In 1871, Darwin hypothesized that “the difference in mind between man and the higher animals…is one of degree and not of kind.” In the case of language, attempts to define that degree have been an embarrassment for the theory of evolution. Indeed, the quest to explain the evolution of language has been referred to as “the hardest problem of science” (Christiansen & Kirby, 2003).

It’s hard for many reasons. The most obvious is that humans are the only species to use language. Its essential features are absent in all forms of animal communication. How could language have evolved if it were not selected from animal communication?

A second reason is that language is too complex to have evolved full-blown. It is better to ask, how did words and grammar evolve? But while it’s clear that words evolved before grammar, it’s unclear how the emotional signals that animals use to communicate could give rise to even the simplest words.

A more subtle reason is that, before an infant speaks her first words, she must learn to communicate nonverbally with her caretaker (usually, her mother). Such communication, in which mother and infant share emotions and attention, is called intersubjectivity. To solve the “hardest problem,” we therefore have to understand intersubjectivity, how it evolved and how it develops.

Linguists have paid scant attention to intersubjectivity. As Noam Chomsky has famously argued, it is grammar, not words, that makes language a unique form of communication. He also argued that a universal grammar, one that applies to all languages, was caused by a mutation that occurred about 80,000 years ago (Berwick & Chomsky, 2016).

The Origin of Words

This is where I part company with Chomsky. It is true that grammar allows a speaker to create an infinite number of meanings from a finite number of words. That is one reason why language is so special. Yet none of those meanings could be created without words and, as Chomsky acknowledges, the origin of words is a mystery.

Chomsky deserves credit for revolutionizing the study of language by looking for its underlying structure. But to understand its evolution we must start at the beginning. Before attempting to understand the complex problem of a universal grammar, we must explore the nonverbal foundations of language and the origin of words.

Ironically, Chomsky himself recognized the importance of intersubjectivity. In what was perhaps an unguarded moment, he remarked that infants need “triggering events to learn a language,” in particular, “a stimulating loving environment in which their natural capacities will flourish. A child that is raised in an orphanage . . . may be very restricted in his abilities. In fact, it may not learn language properly” (Chomsky, 1988, pp. 172-173, italics added).

That insight was, unfortunately, not enough to focus Chomsky’s interest on the significance of a “stimulating loving environment.” Chomsky believes that learning language is like learning to walk. “Learning language just happens.” Throughout his career, Chomsky minimized the social function of language in favor of its contribution to thought. Thinking may be an important function of language in adults but, as we now know, language, unlike walking, would never develop without social interaction.

The Importance of the Maternal Bond and Joint Attention

The “stimulating loving environment” to which Chomsky refers is the relation between an infant and her mother. During the first few months, human infants and their mothers bond by taking turns at sharing gaze and affect. It is those bonds that comprise intersubjectivity. Toward the end of her first year, an infant also learns to share her mother’s attention to external objects. That relation is called joint attention. Both relations, which were topics of previous blogs, are uniquely human.

Because intersubjectivity is nonverbal and rhythmic, it is difficult to measure. But, as would be obvious to anyone who has watched an infant interact with her mother, it is plain to see. Infants can’t say “I love you,” “I’m happy,” “I’m sad,” and so on. Yet their facial expressions and body language reveal those and other emotions.

Intersubjectivity is best measured by microanalysis, a technique developed by Beatrice Beebe (2016), who uses it to detect subtle changes in affect, many of which are too fleeting to be seen in real time. The tiny behaviors revealed by microanalysis, such as rapid shifts of gaze, head, and hand and the opening and closing of the mouth, are often as short as 250 milliseconds.

Source: The mother-infant interaction picture book: Origins of attachment, used with permission from B. Beebe

While an infant and her mother are playing with each other, their behavior is recorded by separate video cameras. After the videotapes are time-synced, they are assessed by experienced raters who quantify the degree of affect that the infant and her mother each express. Those measures have been shown to be highly correlated in infants as young as three months. Figure 2 shows two examples of a 4-month-old infant sharing affect with her mother. Note the time code, which specifies, to the nearest 30th of a second, when each frame occurred.
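For readers who like to see the arithmetic, here is a minimal sketch of the two ideas in that paragraph: a time code that resolves to a 30th of a second, and a correlation between the two sets of affect ratings. It is only an illustration, not the procedure Beebe's raters use. The function names, the 1-to-5 rating scale, and the numbers are all made up, and Pearson's r is assumed here simply because it is the most familiar measure of correlation.

```python
import math

# Assumption: one video frame every 1/30 of a second, matching the
# "nearest 30th of a second" resolution of the time code.
FRAMES_PER_SECOND = 30


def frame_to_timecode(frame_index):
    """Convert a frame count into a minutes:seconds:frames time code."""
    seconds, frames = divmod(frame_index, FRAMES_PER_SECOND)
    minutes, seconds = divmod(seconds, 60)
    return f"{minutes:02d}:{seconds:02d}:{frames:02d}"


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length series of ratings."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical affect ratings (invented 1-5 scale), one per synced time point.
infant_affect = [2, 2, 3, 4, 4, 5, 4, 3]
mother_affect = [2, 3, 3, 4, 5, 5, 4, 4]

print(frame_to_timecode(95))  # frame 95 is 3 seconds and 5 frames in
print(round(pearson_r(infant_affect, mother_affect), 2))
```

A value near 1 would mean that the infant's and the mother's ratings rise and fall together, while a value near 0 would mean they vary independently.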

During the first few months, infants and their mothers also take turns in vocalizing. Instead of words, infants coo, grunt, whimper, and make other sounds. Such turn-taking is called protoconversation because it follows the form of actual conversation.

Pointing: A Developmental Milestone

Rhythmic exchanges of affect and turn-taking while vocalizing, which occur only in humans, are important features of development. Still, those dyadic relations are a far cry from language. At around 6 months they are supplemented by triadic relations in which mothers and infants share attention to objects of mutual interest, a critical stepping stone towards language.

Around the end of the first year, infants begin to point to objects, not to request them, but to attract their caretaker’s attention. Pointing functions as the infant’s first act of reference, the equivalent of saying “that”. While pointing, an infant often smiles at her caretaker and shares eye gaze. These social responses are evidence that an infant and her caretaker have a “meeting of minds,” or a common ground in which to share the object in question.

Pointing has been called “the royal road to language” (Butterworth, 2003). A month or two after an infant begins to point to objects, she begins to name them. When, for example, she points to a dog, her caretaker responds “dog” and the infant imitates that utterance. The advantage of the utterance “dog” over pointing is that it is a more precise form of reference. If, for example, the dog was next to a tree, an infant’s pointing in their direction would not distinguish the dog from the tree.

I have argued that, from birth, human infants embark on a uniquely human trajectory of interpersonal relations with their caretakers. Only human infants experience intersubjectivity and develop words. What’s missing from this account is the origin of intersubjectivity. In particular, what aspects of our ancestors’ behavior gave rise to the high degree of cooperation that is a crucial feature of intersubjectivity? To answer that question, we need to identify the selection pressures that favored increases in social engagement.

According to Hrdy (2009), the subject of an earlier post, the evolutionary origins of intersubjectivity can be found in differences in the child-rearing practices in apes and humans. Chimpanzee mothers, for example, do not allow other members of their group access to their infants for approximately six months. By contrast, human infants are raised by cooperative child-rearing from the moment they are born, a practice in which a mother’s care of her infant is supplemented by members of her immediate family, so-called “alloparents.” Although the mother is still the primary source of care, sisters, brothers, aunts, fathers, and grandmothers, even non-kin, also care for newborns.

Since infants have to rely on alloparents, they need to assess the alloparents’ emotions and intentions. Infants try to elicit care by scrutinizing alloparents’ faces, and by smiling and vocalizing, often in response to an alloparent’s overtures. In contrast to apes, human infants have to share their emotions with their mothers and alloparents from the moment they are born.

There is compelling evidence that cooperative child-rearing was practiced by Homo erectus, an ancestor who evolved about 1.8 million years ago. A needy Homo erectus infant would have to interpret not only its mother’s commitment but also the moods and intentions of alloparents who might also help, a challenge that apes never experience.

Hrdy argues that by crying, smiling, vocalizing, or gesturing, those infants who were best at engaging in the nonverbal communication that defines intersubjectivity would be the best cared for and fed. Such novel selection pressures favor a very different type of ancestor, one that Hrdy refers to as “emotionally modern.”

To return to the “hardest problem,” we have seen that infants produce their first words by their first birthday. Why they do so was, until recently, a mystery. Linguists like Chomsky argued that it “just happened,” the result of an innate language acquisition device. But they’ve overlooked the unique and complex social progression that all infants experience.

At birth, emotionally modern infants are primed to form nonverbal intersubjective relations with their caregiver, first dyadically then triadically. That sequence of events leads to the production of words. Although there’s still much to be learned about each stage of that sequence, we have, for the first time, a roadmap for the development of language.


Beebe, B., Lachmann, F., & Cohen, P. (2016). The mother-infant interaction picture book: Origins of attachment. Norton Press.

Berwick, R. C., & Chomsky, N. (2016). Why Only Us. MIT Press.

Butterworth, G. (2003). Pointing is the royal road to language for babies. In S. Kita (Ed.), Pointing: Where language, culture and cognition meet (pp. 9-34). Lawrence Erlbaum Associates.

Chomsky, N. (1988). Language and Problems of Knowledge: The Managua Lectures. MIT Press.

Christiansen, M., & Kirby, S. (Eds.). (2003). Language Evolution. Oxford University Press.

Hrdy, S. B. (2009). Mothers and Others: The Evolutionary Origins of Mutual Understanding. Belknap Press of Harvard University Press.