The Brain That Speaks Again
Three landmark studies have transformed brain-computer interfaces from promising experiments into tools that restore real conversation to people who cannot speak — at speeds once thought impossible.
There is a particular kind of loneliness in losing your voice. Not just the voice that makes sound, but the voice that lets you say I need help or I love you or yes — the one that connects your inner world to everyone else’s.
For millions of people living with ALS, brainstem strokes, and similar conditions, that connection is severed. They remain completely conscious, their minds fully intact, but trapped behind a body that no longer obeys. Some communicate using eye-tracking systems that laboriously spell out words letter by letter. The fastest achieve about 14 words per minute. Natural conversation runs at about 160 words per minute.
In the past two years, a series of landmark studies has fundamentally changed what seems possible. The results are not incremental improvements. They are the kind of leap that, in retrospect, marks a before and after.
Two Papers, One Week, A Different World
In August 2023, the journal Nature published two papers back-to-back in the same issue. They came from different research groups, used different approaches, and focused on different participants. But both crossed the same threshold: brain-computer interfaces (BCIs) that decode speech from neural activity fast enough, and accurately enough, to sustain a real conversation.
The first paper came from Francis Willett, Krishna Shenoy, Jaimie Henderson and colleagues at Stanford, working within the BrainGate2 clinical trial [1]. Their participant, referred to as T12, was a woman with ALS who could no longer speak intelligibly. The researchers implanted four tiny microelectrode arrays in two regions of her cortex: area 6v (ventral premotor cortex) and area 44 (part of Broca’s area, the brain region long associated with speech production). Together, these arrays recorded from 256 electrodes simultaneously.
The system worked by detecting the firing patterns of neurons as T12 attempted to speak: not as she imagined speech, but as she genuinely tried to produce it with her articulators, even though ALS had disrupted the pathway that normally carries those signals from brain to throat. A recurrent neural network converted these spiking patterns into phoneme probabilities in real time, which a language model then assembled into words.
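The hand-off from phoneme probabilities to words can be sketched in a few lines. This is a toy illustration, not the Stanford team's decoder: it assumes the network emits one probability vector per time step with a "blank" symbol at index 0, picks the most likely symbol in each frame, merges repeats, and looks the result up in a hypothetical one-word lexicon. A real system would weigh many candidate sequences with a language model rather than use a lookup table.

```python
# Toy phoneme inventory; index 0 is a "blank" (no phoneme) symbol.
# Both the inventory and the blank convention are assumptions for this sketch.
PHONEMES = ["_", "HH", "EH", "L", "OW"]

# Hypothetical lexicon mapping phoneme sequences to words.
LEXICON = {("HH", "EH", "L", "OW"): "hello"}


def greedy_phonemes(prob_frames):
    """Take the most likely symbol per frame, merge repeats, drop blanks."""
    decoded = []
    previous = None
    for frame in prob_frames:
        best_index = max(range(len(frame)), key=frame.__getitem__)
        symbol = PHONEMES[best_index]
        if symbol != previous and symbol != "_":
            decoded.append(symbol)
        previous = symbol
    return decoded


def lookup_word(phonemes):
    """Stand-in for the language model: an exact lexicon match or a placeholder."""
    return LEXICON.get(tuple(phonemes), "<unknown>")


# Made-up probabilities for six time steps (each row sums to 1).
frames = [
    [0.10, 0.70, 0.10, 0.05, 0.05],  # HH
    [0.10, 0.60, 0.20, 0.05, 0.05],  # HH again, merged with the frame above
    [0.10, 0.10, 0.70, 0.05, 0.05],  # EH
    [0.80, 0.05, 0.05, 0.05, 0.05],  # blank
    [0.10, 0.05, 0.05, 0.70, 0.10],  # L
    [0.10, 0.05, 0.05, 0.10, 0.70],  # OW
]

decoded = greedy_phonemes(frames)
word = lookup_word(decoded)
print(decoded, word)
```

The two-stage split is the point of the sketch: the first stage only has to say which sounds were attempted, and the second stage uses knowledge of the language to turn sounds into plausible words.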
The result: the system decoded T12’s attempted speech at 62 words per minute, 3.4 times faster than the previous record, with a word error rate of 9.1% on a 50-word vocabulary and 23.8% on a 125,000-word vocabulary. That last result was, to the authors’ knowledge, the first successful demonstration of large-vocabulary decoding from brain signals. A machine could read her attempted speech against a vocabulary as large as a standard English dictionary.
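Word error rate is the figure of merit throughout these studies, so it helps to see how it is usually computed. The sketch below uses the standard definition: the minimum number of word substitutions, insertions, and deletions needed to turn the decoded sentence into the intended one, divided by the number of intended words. The function name and the example sentences are invented for illustration; the papers' own evaluation code may differ in detail.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous_row[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous_row = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current_row = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current_row.append(min(
                previous_row[j] + 1,                      # deletion
                current_row[j - 1] + 1,                   # insertion
                previous_row[j - 1] + substitution_cost,  # match or substitution
            ))
        previous_row = current_row
    return previous_row[-1] / len(ref)


# Invented example: one substituted word and one dropped word out of six.
wer = word_error_rate("i would like some water please", "i would like sum water")
print(f"{wer:.1%}")
```

On this scale, the 23.8% large-vocabulary figure means roughly one word in four needed correcting, while 9.1% means fewer than one in ten.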
The second paper, from Sean Metzger, David Moses, Edward Chang and colleagues at UCSF [2], focused on someone in a different situation. Their participant was a 47-year-old woman who had experienced a pontine brainstem stroke 18 years earlier, leaving her with complete paralysis and the inability to speak or move her limbs.
Instead of penetrating microelectrodes, this team used a high-density electrocorticography (ECoG) array — 253 electrodes arranged across the surface of the brain like a fine mesh, resting on the cortex rather than piercing it. Placed over speech motor cortex and the superior temporal gyrus, this array captured distributed activity across a wider area.
The UCSF system achieved 78.3 words per minute — more than five times faster than the participant’s existing assistive device, which had given her 14.2 words per minute. But the researchers didn’t stop at text.
They also synthesized her voice.
Using deep-learning models trained on articulatory gestures decoded from her neural activity, the system generated audible speech — a synthetic voice derived from the patterns of how her brain tried to move her vocal tract. And alongside that audio, they animated a photorealistic facial avatar that moved in real time as she attempted to speak, including expressions of emotion decoded separately from neural activity.
She was not just communicating. She was present.
What the Brain Remembers
One of the most remarkable aspects of these results is that both participants had been unable to speak for years. The woman in the UCSF study had her stroke in 2005 — her brain had not successfully produced speech in nearly two decades.
And yet the neural representations were still there.
When the researchers mapped her brain activity during attempted speech, they found that her sensorimotor cortex still encoded articulatory movements — the positions and movements of the lips, tongue, jaw, and larynx — organized much like in healthy speakers. The motor cortex had maintained these representations in the absence of output. The instructions were intact; only the road to the throat was broken.
This is neurologically striking. The brain’s representation of speech movement is not simply pruned away when it goes unused. It persists, perhaps reinforced by inner experience of attempted speech, or preserved through mechanisms we don’t yet fully understand. For the field of speech neuroprosthetics, this is essential: it suggests that the signals worth reading may still be present in many patients, even after years without speech.
The New England Journal of Medicine, One Year Later
In August 2024, a third study appeared in the New England Journal of Medicine, from Nicholas Card, Maitreyee Wairagkar, and colleagues, again within the BrainGate2 trial [3]. The participant: a 45-year-old man with ALS who had developed severe dysarthria and virtual quadriplegia five years after diagnosis.
The surgical team implanted four microelectrode arrays — 256 electrodes total — into the left ventral precentral gyrus, the region of motor cortex responsible for speech articulation. Twenty-five days after surgery, on the very first day of use, the system achieved 99.6% accuracy on a 50-word vocabulary. Calibration required just 30 minutes of cortical recordings.
By the second day, with only 1.4 additional hours of training, the system extended to a 125,000-word vocabulary with 90.2% accuracy. Over the following 8.4 months, the neuroprosthesis maintained 97.5% accuracy, and the participant used it for more than 248 cumulative hours of conversation — at a pace of approximately 32 words per minute — in unscripted, self-paced dialogue with family and friends.
Words decoded from his brain appeared on a screen and were then vocalized by text-to-speech software designed to sound like his pre-ALS voice. Being heard in his own voice again was part of the design.
Context: How Far This Field Has Come
To understand how significant these results are, it helps to know where the field stood just three years earlier.
The 2021 baseline was the study by Moses, Metzger and colleagues in the New England Journal of Medicine [4], also from UCSF’s Chang laboratory. In that work, the system decoded sentences attempted by a person with anarthria following a brainstem stroke at 15.2 words per minute, with a word error rate of 25.6%, from a 50-word vocabulary. At the time, this was celebrated as a breakthrough. It was.
The 2023 papers raised decoding speed four- to fivefold over that baseline. The Stanford system expanded the vocabulary from 50 to 125,000 words, and the UCSF system added voice synthesis and facial animation. And by 2024, word error rates had dropped to the point where a participant could use the system for nearly 250 hours of natural conversation over eight months.
That is not incremental. That is transformation.
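The speed comparisons can be checked with simple arithmetic. The short script below uses only the words-per-minute figures quoted in this article; the variable and function names are made up for the example.

```python
# Words-per-minute figures as quoted in this article.
NATURAL_WPM = 160.0
BASELINE_2021_WPM = 15.2
SYSTEMS = {
    "Stanford 2023": 62.0,
    "UCSF 2023": 78.3,
    "BrainGate2 2024": 32.0,
}


def speedup(new_wpm, old_wpm):
    """How many times faster the new rate is than the old one."""
    return new_wpm / old_wpm


def fraction_of_natural(wpm, natural_wpm=NATURAL_WPM):
    """Decoding speed as a share of natural conversational speed."""
    return wpm / natural_wpm


for name, wpm in SYSTEMS.items():
    print(
        f"{name}: {speedup(wpm, BASELINE_2021_WPM):.1f}x the 2021 baseline, "
        f"{fraction_of_natural(wpm):.0%} of natural conversation"
    )
```

The two 2023 systems come out at roughly 4.1 and 5.2 times the 2021 rate, which is where the "four- to fivefold" summary comes from.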
What Still Needs Work
These are clinical trials, not products. There are important caveats.
Each study involves a single participant. Demonstrating that a BCI works for one person is not the same as demonstrating it works across the range of neurological conditions, brain anatomies, and disease stages that cause communication loss. Replication across participants — including people earlier in the course of ALS, or with different stroke locations, or with more extensive motor impairment — is essential.
The implanted systems require neurosurgery and carry infection risks. For penetrating microelectrodes, there are also open questions about the long-term stability of the electrode-brain interface. ECoG arrays rest on the brain’s surface and may be more stable, but the surgical access is still significant.
The UCSF team required their participant to insert brief pauses between words during attempts to speak, helping the decoder segment the stream — an accommodation that healthy speakers don’t make. Getting to fully naturalistic, continuous speech without such aids remains an open problem.
And 32 to 78 words per minute, while transformative compared to letter-by-letter eye-tracking, is still about one-fifth to one-half the speed of natural conversation.
The Distance That Remains
None of those caveats diminish what has happened.
The limiting factors are now clearer and more tractable than they’ve ever been. More electrodes mean lower error rates: the relationship in the Stanford data appears log-linear, with each doubling of the electrode count roughly halving the error rate. Denser, more stable arrays would help. So would decoders that generalize better across days without daily retraining, and language models that handle spontaneous, unscripted conversation as fluidly as prompted sentences.
These are hard engineering problems. They are not fundamental barriers.
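To make the electrode trend concrete, here is a back-of-the-envelope projection. It is a sketch under loud assumptions, not a result from any of the papers: it takes the rule of thumb that each doubling of electrode count halves the error rate, starts from the article's 23.8% figure, and uses 256 electrodes as the starting count purely for illustration, since the article does not say how many of the implanted electrodes contributed to decoding. Whether the trend holds beyond the range actually tested is unknown.

```python
import math


def projected_wer(base_wer, base_electrodes, electrodes, factor_per_doubling=0.5):
    """Extrapolate word error rate, assuming a fixed factor per doubling.

    The halving-per-doubling rule is an approximation quoted in the text,
    not a guarantee; treat the output as illustrative only.
    """
    doublings = math.log2(electrodes / base_electrodes)
    return base_wer * factor_per_doubling ** doublings


# Illustrative starting point (an assumption): 23.8% error at 256 electrodes.
for count in (256, 512, 1024):
    print(count, round(projected_wer(23.8, 256, count), 2))
```

Under these assumptions, two more doublings would bring the large-vocabulary error rate down to about 6%, which is why electrode count is treated as an engineering lever rather than a hard limit.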
There is a moment in one of the supplementary videos from the Stanford study where T12 reads a sentence on the screen, attempts to say it, and words appear on the display beside her. The words are not quite right every time — speech BCIs still stumble. But you can see the sentence forming. You can see her trying. You can see the system listening.
That gap — between trying to speak and being heard — has narrowed enormously. And for the people on the other side of it, that change is not a statistic.
Footnotes
1. Willett FR, Kunz EM, Fan C, et al. A high-performance speech neuroprosthesis. Nature. 2023;620(7976):1031–1036. DOI: 10.1038/s41586-023-06377-x. PMID: 37612500.
2. Metzger SL, Littlejohn KT, Silva AB, Moses DA, et al. A high-performance neuroprosthesis for speech decoding and avatar control. Nature. 2023;620(7976):1037–1046. DOI: 10.1038/s41586-023-06443-4. PMID: 37612505.
3. Card NS, Wairagkar M, Iacobacci C, et al. An accurate and rapidly calibrating speech neuroprosthesis. New England Journal of Medicine. 2024;391(7):609–618. DOI: 10.1056/NEJMoa2314132. PMID: 39141853.
4. Moses DA, Metzger SL, Liu JR, et al. Neuroprosthesis for decoding speech in a paralyzed person with anarthria. New England Journal of Medicine. 2021;385:217–227. DOI: 10.1056/NEJMoa2027540. PMID: 34260835.