Counting Down the Months of the Year
Here's how language creates meaning.
Posted December 31, 2021 | Reviewed by Davia Sills
- Language seems like computer code: arbitrary sequences of symbols.
- Because of this, the grounding of linguistic symbols in the perceptual world seems to be needed.
- If the language system itself creates meaning, that is a valuable insight for the psychological sciences and computational sciences alike.
How do we extract meaning from language? And could computers understand language the same way humans do? These questions are as important for the psychological sciences as they are for artificial intelligence. After all, if computers understand language the same way humans do, that would shed light on human cognition as well as on artificial intelligence. Conversely, if there are no similarities between human and computational language understanding, we may have to conclude that artificial intelligence needs to scratch itself behind the ears and realize that it may never come close to human performance.
A well-known illustration in the psychological sciences emphasizes that the way humans understand language must be different than the way computers do. The illustration goes like this. Imagine you are locked up in a room with no windows. One side of the room has a small hole labeled “Input,” whereas the other side has a similar hole labeled “Output.” Through the input hole, you receive words of a language you do not speak. All you have is a codebook that translates input to output. Your task is to find the input in the codebook, translate it to output, and slide the answer through the output hole in the wall.
Let me make this a bit more concrete. Imagine you receive “10111011111” as input. You look through your codebook, and under the entry “10111011111,” you find “0000100011.” This is what you return as output. Do you now understand the code that has been provided to you? Clearly not, anybody would argue. All you have done is translate code into other code. But this is exactly what computers do. They translate code into other code!
You may object that computer code is not language. I agree, but likely for different reasons. Let’s use language. Imagine you receive as input the word “maart,” and you look up the word in the foreign dictionary. The entry “maart” returns “derde maand van het jaar; lentemaand.” This is what you return as output. The question now is whether you understand the word “maart” by translating it into another series of foreign words. Clearly not, anybody would argue.
Scientists who have used this example of language understanding argue that humans must be doing something different than computers. Instead of translating code into other code, humans ground symbols in the outside world (or in their perceptual representation of the outside world). A dog is not a dog because of its translations into other symbols (e.g., “huisdier”), but because it gets its meaning from pictures of dogs, sounds of dogs, their smell, and our other perceptual experiences of them.
How language creates meaning
There is one important aspect missing from this argument. It assumes that 10111011111 is as arbitrary as “maart” and “dog.” But that is a mistake. The language system is not arbitrary. It certainly has arbitrary components in the relation between the sound of a word and its meaning, between word order and meaning, and between word context and meaning. But the language system as a whole (sound, word order, and context) is not entirely arbitrary.
Let me illustrate this by taking the months of the year. I could have taken the months of the year in any language to illustrate my point, but it may be easier to demonstrate this in English. Let’s take the 12 months of the year and compute how often they appear together in the English language. More concretely, let’s take 1,024,908,267,229 words of the English text that Google has available and compute how often the months of the year appear together within a five-word window. The results are presented below.
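To make the counting step concrete, here is a minimal sketch in Python. It is not the procedure that was run on the Google corpus; it only illustrates what "appearing together within a five-word window" could mean for a plain list of words. The function name month_cooccurrences, the choice to match capitalized month names only, and the example sentence are all assumptions made for illustration.

```python
from collections import Counter

# Capitalized forms only. Words such as "May" and "March" remain ambiguous
# (they are also a modal verb and a verb); this sketch does not resolve that.
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]
MONTH_SET = set(MONTHS)


def month_cooccurrences(tokens, window=5):
    """Count how often two different months fall inside the same window.

    `tokens` is a list of words; `window` is the window size in words.
    Returns a Counter keyed by frozenset({month_a, month_b}).
    """
    counts = Counter()
    for i, first in enumerate(tokens):
        if first not in MONTH_SET:
            continue
        # Tokens i+1 .. i+window-1 share a window of `window` words with token i.
        for second in tokens[i + 1 : i + window]:
            if second in MONTH_SET and second != first:
                counts[frozenset((first, second))] += 1
    return counts


# Hypothetical example sentence, not taken from the corpus.
sentence = "The store is closed from January to March and reopens in April"
example_counts = month_cooccurrences(sentence.split())
for pair, count in example_counts.items():
    print(sorted(pair), count)
```

In the example sentence, "January" and "April" are more than five words apart, so that pair is not counted. Run over a very large body of text, counts like these fill a 12-by-12 table of month pairs.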
At first glance, these counts do not make us much wiser about the meaning of the months of the year. Unless we look a little deeper. When we apply a simple statistical technique that reduces this matrix to a one-dimensional solution, we find something interesting, as presented in the table below. The loadings of the months in that one-dimensional solution correlate strongly with the actual order of the months of the year (r = .85, p < .001 if you insist).
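For readers who want to see what such a one-dimensional solution might look like in practice, here is a hedged sketch in Python. The text above does not name the statistical technique, so this example swaps in classical multidimensional scaling, with the dominant axis found by power iteration; that choice is an assumption. The co-occurrence matrix is a made-up toy in which months that are close in the calendar co-occur more often. It is not the Google data, and the way counts are turned into dissimilarities is likewise an illustrative assumption.

```python
import math


def counts_to_dissimilarities(counts):
    """Turn co-occurrence counts into dissimilarities (assumed conversion):
    the more often two months co-occur, the smaller their dissimilarity."""
    n = len(counts)
    top = max(counts[i][j] for i in range(n) for j in range(n) if i != j)
    return [
        [0.0 if i == j else float(top + 1 - counts[i][j]) for j in range(n)]
        for i in range(n)
    ]


def one_dimensional_loadings(dissimilarities, iterations=200):
    """Classical MDS reduced to one axis.

    Double-centers the squared dissimilarities and runs power iteration,
    which converges to the eigenvector whose eigenvalue is largest in
    absolute value (the first MDS axis for Euclidean-like input).
    """
    n = len(dissimilarities)
    squared = [[d * d for d in row] for row in dissimilarities]
    row_means = [sum(row) / n for row in squared]  # matrix is symmetric
    grand_mean = sum(row_means) / n
    centered = [
        [-0.5 * (squared[i][j] - row_means[i] - row_means[j] + grand_mean)
         for j in range(n)]
        for i in range(n)
    ]
    vector = [1.0] + [0.0] * (n - 1)  # arbitrary, non-informative start
    for _ in range(iterations):
        product = [sum(centered[i][j] * vector[j] for j in range(n))
                   for i in range(n)]
        norm = math.sqrt(sum(value * value for value in product))
        if norm == 0.0:
            raise ValueError("degenerate matrix: no dominant axis found")
        vector = [value / norm for value in product]
    return vector


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy counts (invented): months i and j co-occur 12 - |i - j| times.
toy_counts = [[0 if i == j else 12 - abs(i - j) for j in range(12)]
              for i in range(12)]
loadings = one_dimensional_loadings(counts_to_dissimilarities(toy_counts))
calendar_order = list(range(1, 13))
# The sign of an MDS axis is arbitrary, so compare the absolute correlation.
print(round(abs(pearson(loadings, calendar_order)), 3))
```

Because the toy matrix is built directly from calendar distance, it recovers the order almost perfectly, which says nothing about real text. Real co-occurrence counts are much noisier, which is why the correlation reported above is strong rather than perfect.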
What does this suggest? Well, it demonstrates that the months of the year can, to a large extent, be ordered by only taking their context into account. And that sheds light on how humans understand language and how computers could understand language.
The language system, contrary to computer code, is not some arbitrary combination of 1s and 0s. Instead, it is an organized system from which language users can extract meaning. That does not mean that I am denying that symbols may need to be grounded to become meaningful. It may be useful to know that “maart” is a month, just as it is to know that “March” (its translation from Dutch) is a month. But once you know the 12 words are months, you can basically bootstrap the meaning of the words through the language system. And that sheds light on human cognition and artificial intelligence alike.
Brants, T., & Franz, A. (2006). Web 1T 5-gram Version 1. Linguistic Data Consortium.
Louwerse, M. (2021). Keeping those words in mind: How language creates meaning. Prometheus Books.