Our sense of hearing is truly remarkable, all the more so when you understand the rich layers of information that exist within even a single spoken sentence. Of course, this is something most of us take for granted until we discover what life is like for people who don’t have the benefit of fully functioning hearing.
I have no direct experience with cochlear implants (CIs) – electronic devices that partly compensate for severe hearing impairment. But listening to a simulation of the sound produced is salutary. It is rather like hearing things underwater: fuzzy and with an odd timbre, yet still conveying words and some other identifiable sounds.
It is a testament to the adaptability of the human brain that this information can be recognizable even when the characteristics of the sound are so profoundly altered. Some people with CIs can appreciate and even perform music.
But there are serious limitations to what people with CIs can hear. And understanding these limitations could provide insights into how sound is processed in people with normal hearing – insights that can help us identify what can potentially go wrong and how it might be fixed. That is evident in a trio of papers buried in the little-known but infallibly fascinating Journal of the Acoustical Society of America – a publication whose scope ranges from urban noise pollution to whale song and the sonic virtues of cathedrals.
These three papers examine what gets lost in translation in CIs. Much of the emotional content of speech, as well as some of its semantic information, is conveyed by the rising and falling of the voice, known as prosody. In the English language, prosody can distinguish a question from a statement (at least before the rising inflection became fashionable). It can also tell us whether the speaker is happy, sad or angry.
But CIs cannot fully convey the pitch of sounds, or their “spectrum” of sound frequencies, and this means that users may find it harder to tell whether someone’s voice is happy rather than sad, for instance. So users rely more on visual than on audio information to gauge a speaker’s emotional state.
In the first study, Takayuki Nakata of Future University Hakodate in Japan and his co-workers verified that Japanese children who are born deaf but use CIs are significantly less able to identify happy, sad, and angry voices than normal hearers of the same age. They went further than previous studies, however, by showing that these difficulties prevent children from communicating emotion through prosody in their own speech. This suggests two things: first, that we learn to communicate emotion by hearing and copying; and second, that CI users face an additional burden in that other people are less likely to understand their emotions.
Difficulties in hearing pitch can create even more severe linguistic problems. In tonal languages such as Mandarin Chinese, changes in pitch may alter the semantic meaning of a word. So CI users may struggle to distinguish such tones even after years of using a device, and hearing-impaired Mandarin-speaking children who start using CIs before they can speak are often scarcely intelligible to adult listeners – again, they cannot learn to produce the right sounds if they cannot hear them.
To understand how CI users might perceive these language tones, Damien Smith and Denis Burnham of the University of Western Sydney in Australia tested normal hearers with audio clips of spoken Mandarin altered to simulate CIs. The results were surprising.
Both native Mandarin speakers and English-speaking subjects did better in identifying the four Mandarin tones when the CI-simulated voices were accompanied by video footage of the speakers’ faces. That is not so surprising. But all subjects did better with visuals alone – and in this case non-Mandarin speakers did better than Mandarin speakers.