In 2004, a sudden brainstem stroke left the then 30-year-old Ann Johnson completely paralyzed. Years of physical therapy let her regain enough muscle control to express emotion on her face and breathe independently, but the muscles that control her speech never recovered. Her daughter, 13 months old at the time of the stroke, has grown up never hearing her mother speak. But now, 18 years later, a new brain-computer interface (BCI) technology is giving Ann her voice back.
On August 23, 2023, a group of researchers led by Dr. Edward Chang, chair of neurological surgery at the University of California, San Francisco, published a paper in Nature describing findings from their clinical study testing this BCI device on Ann. A BCI, simply put, is a technology that lets people control machines with their thoughts: it records brain signals and translates them into commands for output devices such as cursors or prostheses. The study found that this device can rapidly decode the neural signals associated with speech in a paralyzed person and accurately generate text, synthetic speech audio, and facial movement on a digital avatar.
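For readers curious what "using brain signals to control output devices" looks like computationally, here is a minimal, purely illustrative Python sketch of the basic BCI loop: record multi-channel neural features, decode them into a command, and apply that command to a device (here, a 2D cursor). The 16-channel features and the linear decoder are invented stand-ins, not the study's method.

```python
import numpy as np

# Toy BCI loop: record neural features -> decode intent -> move a device.
# The 16-channel features and the linear decoder W are invented stand-ins.
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 16))  # maps 16 neural channels to a 2D cursor velocity

def decode_cursor_velocity(neural_features: np.ndarray) -> np.ndarray:
    """Decode one time step of multi-channel activity into a cursor velocity."""
    return W @ neural_features

cursor = np.zeros(2)
for _ in range(50):                      # 50 decoding time steps
    features = rng.normal(size=16)       # placeholder for recorded brain activity
    cursor += 0.01 * decode_cursor_velocity(features)  # integrate velocity
print("final cursor position:", cursor)
```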
Study co-author Gopala Anumanchipalli and co-lead author Kaylo Littlejohn explain that earlier findings from Chang's lab laid the groundwork for this device. In 2019, researchers in the lab first established that neural activity could be used to synthesize speech. They then showed that a speech neuroprosthesis, an electrode array implanted on the brain's surface to record electrical activity, could decode full words from the brain activity of a paralyzed person. In a subsequent study, they used the same neuroprosthesis to build a spelling interface with a much larger vocabulary (over 1,000 words) that decoded the participant's brain activity directly into text in real time with high accuracy.
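To make the spelling-interface idea concrete, here is a hedged toy sketch of vocabulary-constrained text decoding: a classifier (stood in for here by random probabilities) emits a distribution over letters at each attempted character, and the decoder scores candidate words from a fixed vocabulary against the whole sequence. The vocabulary, probabilities, and function names are all invented for illustration; the actual interface was far more sophisticated.

```python
import numpy as np

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
VOCAB = ["hello", "help", "world", "words"]  # toy vocabulary, invented

def word_log_prob(word: str, letter_probs: np.ndarray) -> float:
    """Score a word by summing log-probabilities of its letters, step by step."""
    return sum(float(np.log(letter_probs[i, ALPHABET.index(c)]))
               for i, c in enumerate(word))

rng = np.random.default_rng(1)
n_steps = 5                                               # attempted characters
letter_probs = rng.dirichlet(np.ones(26), size=n_steps)   # stand-in classifier output

candidates = [w for w in VOCAB if len(w) <= n_steps]
best = max(candidates, key=lambda w: word_log_prob(w, letter_probs))
print("decoded word:", best)
```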
This recent study aimed to generate speech audio and corresponding digital facial movements in addition to text. The team placed a paper-thin rectangle of 253 electrodes over the region of Ann's brain responsible for speech, where it could intercept the signals that would otherwise control the tongue, jaw, larynx, and other muscles involved in producing speech. For weeks, they trained artificial intelligence (AI) algorithms by having Ann silently attempt to repeat sentences drawn from a vocabulary of over 1,000 words until the computer could recognize the neural patterns associated with each sound. Rather than recognizing complete words, the AI system decoded the signals associated with phonemes, the subunits of speech that combine to form words. The computer needed to recognize only 39 phonemes to decode any word, which made decoding faster and more accurate and let Ann communicate at nearly the rate of a typical speaker. Once decoded, the signals were converted into audio waveforms and played back as speech. The audio resembled Ann's own voice, a feature made possible by training on a past recording of her speaking. Ann says that hearing the voice for the first time was like "hearing an old friend."
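The arithmetic behind that design choice is worth spelling out: a classifier choosing among 39 phoneme classes faces a far easier problem than one choosing among 1,000-plus whole words, and phoneme sequences can then be mapped to any word in the vocabulary. The Python sketch below illustrates the idea with a made-up linear classifier and a toy pronunciation dictionary; the 253-electrode feature size is borrowed from the article, but everything else is an invented stand-in.

```python
import numpy as np

N_PHONEMES = 39   # the small phoneme inventory mentioned in the article
N_FEATURES = 253  # one feature per electrode, for illustration only

rng = np.random.default_rng(2)
W = rng.normal(size=(N_PHONEMES, N_FEATURES))  # stand-in linear classifier

def classify_frame(features: np.ndarray) -> int:
    """Pick the most likely of 39 phoneme classes for one window of activity."""
    return int(np.argmax(W @ features))

# Toy pronunciation lookup: phoneme-index sequences -> words (invented).
PRONUNCIATIONS = {(3, 17, 22): "hello", (8, 30): "hi"}

# Synthetic features aligned with phonemes 3, 17, 22 so the demo decodes cleanly.
frames = W[[3, 17, 22]]
phonemes = tuple(classify_frame(f) for f in frames)
print("phonemes:", phonemes, "-> word:", PRONUNCIATIONS.get(phonemes, "<unknown>"))
```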
The device also used the discrete code translated from the brain signals that drive facial muscles during speech to control a high-quality digital avatar, animating the facial expressions Ann herself would have made while speaking. The avatar, which realistically resembled Ann, could express happiness, surprise, and sadness at high, medium, and low intensities, and could also perform non-speech articulatory movements.
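One plausible way to picture the avatar step: each discrete code decoded from speech-motor signals indexes a set of facial-animation parameters (often called blendshape weights in computer graphics), which a renderer applies frame by frame. The mapping below is entirely hypothetical, invented only to illustrate how expressions might be scaled by intensity as the article describes.

```python
# Hypothetical mapping from decoded articulatory codes to facial-animation
# ("blendshape") weights; every code, name, and value here is invented.
BLENDSHAPES = {
    0: {"jaw_open": 0.7, "lip_round": 0.1, "brow_raise": 0.0},  # open-vowel pose
    1: {"jaw_open": 0.2, "lip_round": 0.8, "brow_raise": 0.0},  # rounded-vowel pose
    2: {"jaw_open": 0.0, "lip_round": 0.0, "brow_raise": 0.6},  # surprise cue
}

def frame_weights(code: int, intensity: float) -> dict:
    """Scale one code's blendshape weights by expression intensity (0 to 1)."""
    return {name: w * intensity for name, w in BLENDSHAPES[code].items()}

decoded_codes = [0, 1, 2]  # stand-in for codes decoded from brain activity
for code in decoded_codes:
    print(frame_weights(code, intensity=0.5))  # medium of the three intensity tiers
```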
The development of a technology that enables a paralyzed person to rapidly and accurately generate speech audio while simultaneously controlling a realistic digital avatar is groundbreaking. Ann explains that she finally feels her life has meaning again: "Being a part of this study has given me a sense of purpose. I feel like I am contributing to society … this study has allowed me to really live while I'm still alive!" Next, the researchers plan to shrink the delay between the user's intended speech and the avatar's output until the exchange feels like real-time conversation. Although there is room for improvement, this device is a revolutionary step toward improving the quality of life of people affected by paralysis. As Ann puts it, "I want patients … to see me and know their lives are not over now."