Why It's So Hard to Understand Each Other

Linguistic research explains our trouble talking through a mask.

Posted Oct 02, 2020

This year has been the harbinger of many new things for most of us. Unfortunately, most of them have been bad: COVID-19, police brutality, economic uncertainty, rioting, and looting.

But we’ve also come up with a few good things, like learning how to communicate across platforms and space in ways we never have before (enter the Zoom revolution), discovering the freedom to not wash our hair every day (my personal favorite), and learning how work and home life can be fitfully but productively integrated (cue barking dog here). Still, one new area we seem to be having a bit of trouble with is making masks part of our daily ritual.

There are many reasons people may be reluctant to wear masks, but one I hear about a lot is that wearing one makes talking more difficult and, in turn, affects speech comprehension. With masks come muffling and, even more problematic, no mouth visibility, something that has a significant impact on how well we can decipher what someone says. While masks may keep us safe, from a linguistic perspective, they open up a can of comprehension worms, to say the least.

In contrast, we seem to do fine on the phone, where we don’t even get the benefit of eye contact or body language, so what gives? Why do masks seem to make it more difficult to get across what we are trying to say? Luckily, we can turn to speech science for an answer and perhaps even some advice on how to achieve a little more mask-speech harmony.

The Mouth/Ear Connection

It may seem strange that you would need to see someone’s mouth to understand what they are saying. After all, we’ve communicated over the telephone without visuals for years and no one has yet declared phones the gateway to communication hell.

But we tend to talk to people we know fairly well on phones and have a specific topic and context to draw upon that we can use to fill in the blanks. Also, because we have no expectation of visual information on the phone, our brains are forced to rely only on auditory information for the task of speech processing.

When face to face, we instead integrate multi-sensory cues, including both auditory (hearing) and visual information, into how we process speech sounds. To see how this works, check out experiments on the McGurk effect, which illustrate how receiving incongruent information in audio and visual cues alters what sound a listener will report hearing.

In this effect, first noted by psychologists Harry McGurk and John MacDonald, listeners who watch a video of someone’s mouth making a "g" sound while hearing audio of a "b" sound report hearing a "d" sound, seemingly synthesizing the two sounds into an intermediate one.

What the McGurk effect really illustrates is that visual cues are part of what we call the "speech chain," the connection between how a speaker produces speech and how a listener receives it. When available, visual information assists a listener in processing what sounds a speaker is making. When wearing a mask that obscures the mouth, we are forced to rely only on auditory information.

The takeaway here is that seeing lips does indeed help us decipher what people are saying, as anyone with hearing loss who relies on lip-reading knows. But we all utilize this lips-to-ear connection when conversing. 

To see this in action, say "me" vs. "you." Notice that when saying "me," your lips are visibly spread, whereas when saying "you," your lips are pursed and rounded, giving cues to which vowel sound is being made. While we don’t absolutely need to have visual information to help us decode the speech stream, it definitely helps.

Masks, by cutting off visual cues, force us to use just auditory cues. This alone is not a problem, since we are certainly able to hear people over phones. It becomes a problem when the auditory signals available are also degraded by an inability to move our mouths normally to produce speech sounds or by layers of fabric that dampen the acoustic frequencies of a sound due to sound wave absorption. Both problems, of course, are posed by wearing masks.

The Science Behind the Mask

The potential for hearing problems associated with masks is not new, and there has in fact been research investigating the impact of mask-wearing on comprehension. After all, degraded hearing is not just a problem for those of us ordering a venti latte at Starbucks in the days since COVID-19 arrived; it has long been a critical issue for doctors and nurses, for whom miscomprehension can be the difference between life and death.

Acoustic waveform of 'mask' spoken with and without a mask.
Source: Ian Clayton, used with permission

In a 2008 study, Mendel et al. compared the comprehension of speech through masks for both hearing-impaired and non-impaired subjects (in both noisy and quiet conditions). While they found that the acoustics of the speech were significantly altered when masks were worn, they did not find any significant effect of mask-wearing on listeners’ comprehension. Noise, instead, was the significant factor.

In a more recent follow-up study, Atcherson et al. (2017) found that while normal-hearing groups performed well whether speakers wore a mask or not, hearing-impaired listeners did better when transparent masks were used, allowing them to integrate audio-visual information. In other words, when we have some hearing difficulty, we rely on other modalities in addition to aural information.

However, both of these studies looked only at surgical masks in a medical setting, and researchers hypothesized that mask-wearers in such experimental settings may have spoken louder as compensation for wearing a mask. In today’s world, masks come in a variety of forms ranging from last ski season’s balaclavas to homemade fabric designs, and we wear them in our noisy real lives. Aside from contributing to bank tellers’ unease, do such non-medical masks wreak havoc with our hearing?

In a 2013 study, forensic linguists Nicole Fecher and Dominic Watt examined how various types of masks differed in terms of their effects on mouth movement, sound absorption, and listener comprehension. Somewhat surprisingly, none of the mask types, even those that obscured the mouth the most, obstructed motor movement (in other words, people’s mouths still moved normally), though the more a mask allowed a view of a speaker’s face, and, in particular, their lips, the greater listeners’ accuracy in perception. But, overall, comprehension was still quite high.

Fecher and Watt found that, without access to lips, we seem to shift to using other visual cues, such as the movements of the cheeks and chin, to assist in decoding what sounds we hear. They hypothesize that tighter-fitting masks that display more of the facial architecture may help provide such visual cues. Since this study was performed in pre-pandemic days, homemade fabric masks were not among those studied, and the thickness and loose fit of such masks may contribute to greater problems with comprehension.

Unmasking the Take-Away

In effect, the results of the few scientific studies of how mask-wearing affects speech production and speech perception do not indicate that our days of easy comprehension are behind us. Instead, they suggest that we are still able to talk fairly well and hear fairly well despite our masked state, particularly in quiet contexts.

So, why do we still feel like we struggle to understand each other behind our masks? Probably because, as some of these experiments suggested, noisy conditions are a factor. Also, these studies did not require subjects to be safely socially distant. Many of the places where we converse, like outdoors and in stores with background noise, are not conducive to speech comprehension even in the best of circumstances, and the greater distance between us requires sound waves to travel farther.

As a result of this combination of more space and less face, the loss of visual information in our speech processing has a potentially greater effect. Throw in any undiagnosed high-frequency hearing loss, which many of us less-than-spring-chickens often have, and it is a recipe for hearing issues. The strategies we might have unknowingly been relying on to compensate for this loss (i.e., lip-reading) are no longer available to us.

But the good news is that masks, in and of themselves, do not seem to have a strong effect on comprehension, provided we minimize noise and have normal hearing. It is also important that they fit so that they do not obstruct our motor movement, and even better if they display at least some of our face for listeners to use as compensatory visual cues. Transparent masks also seem to help. But if you are still struggling to comprehend what people are saying behind the mask, it might be the universe telling you that it is time for a trip to an audiologist.

References

Atcherson, S. R., L. L. Mendel, W. J. Baltimore, et al. 2017. The effect of conventional and transparent surgical masks on speech understanding in individuals with and without hearing loss. J Am Acad Audiol 28(1), 58–67.

Fecher, N., and D. Watt. 2013. Effects of forensically-realistic facial concealment on auditory-visual consonant recognition in quiet and noise conditions. In International Conference on Auditory-Visual Speech Processing (AVSP 2013).

McGurk, H., and J. MacDonald. 1976. Hearing lips and seeing voices. Nature 264, 746–748.

Mendel, L. L., J. A. Gardino, and S. R. Atcherson. 2008. Speech understanding using surgical masks: a problem in health care? J Am Acad Audiol 19(9), 686–695.