There are a host of things that can rob someone of their ability to speak – but for some, brain-computer interfaces may be the key to restoring communication, Meta researcher Jean-Rémi King told TIME.
“By placing an electrode on the motor areas of a patient’s brain, we can decode the activity and help the patient communicate with the rest of the world,” King said.
Already, a brain implant has restored a paralyzed patient’s ability to communicate. Rather than having to point to individual letters or words, the neuroimplant translates his thoughts directly into words.
Philip O’Keefe, an Australian with ALS, has a brain-computer interface chip that allows him to translate his thoughts into text, opening up a whole world of electronic communication, including Twitter. Perhaps most impressively, a patient whose ALS progressed to complete locked-in syndrome also received an implant that enabled communication.
“But obviously it’s extremely invasive to put an electrode in someone’s brain,” King said.
(In O’Keefe’s case, it’s worth noting that the implant went through his jugular, so he didn’t need open-brain surgery, although it was nonetheless major surgery.)
“So we wanted to try using non-invasive recordings of brain activity. And the goal was to build an AI system that could decode brain responses to spoken stories.”
King and his colleagues at the Facebook Artificial Intelligence Research (FAIR) Lab have started doing just that, creating deep learning AI that can decode speech from brainwaves — to some extent.
The datasets the team worked with contain brain recordings of 169 healthy volunteers, taken while they listened to audiobooks in Dutch and English, totaling more than 150 hours.
Because the goal is to decode speech non-invasively, the team used data recorded by measuring the brain’s electrical activity, known as electroencephalography or EEG, and its magnetic activity, known as magnetoencephalography or MEG.
Both are recorded via sensors outside the skull, which was one of the researchers’ main challenges, King told TIME: the data are “noisy,” limited by the sensors’ distance from the brain and by the effects of skin, skull, water, and so on, on the signals.
All that noise is made even harder to cut through because, well, we’re not 100% sure what we’re looking for.
“The other big problem is more conceptual in that we don’t really know how the brain represents language to a large extent,” King said.
The AI analyzed the audiobooks and the brain recordings together to spot patterns between heard words and brain waves.
It is this decoding problem that the team wants to hand over to AI, by having it match brain activity to a stimulus – in this case, what a subject hears.
Without AI, it “would be very difficult to say, ‘OK, this brain activity means this word, or this phoneme, or an intention to act, or whatever,'” King said.
Speech decoding: After cutting those hours of recordings into three-second segments, they passed both the audiobook audio and the brain recordings to the AI, which analyzed them to spot patterns.
The team held out 10% of the data to test their model, New Scientist reported, using the patterns learned from the remaining 90% to try to identify the words heard in brain recordings it had never seen.
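As a rough illustration of the segmenting and hold-out steps described above, here is a minimal Python sketch. It is not the team's code: the 100 Hz sample rate, the non-overlapping windows, the random shuffle, and the function names `cut_into_segments` and `hold_out_split` are all assumptions made for the example, and the article does not say exactly how the researchers drew their 10% test set.

```python
import random

def cut_into_segments(samples, sample_rate_hz, segment_seconds=3.0):
    """Split a 1-D recording into non-overlapping fixed-length segments."""
    size = int(sample_rate_hz * segment_seconds)
    # Any trailing partial segment is dropped.
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]

def hold_out_split(segments, test_fraction=0.1, seed=0):
    """Shuffle the segments and set aside a fraction for testing."""
    rng = random.Random(seed)
    shuffled = list(segments)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Toy "recording": 60 seconds at a hypothetical 100 Hz sample rate.
recording = [float(i) for i in range(60 * 100)]
segments = cut_into_segments(recording, sample_rate_hz=100)
train_segments, test_segments = hold_out_split(segments)
print(len(segments), len(train_segments), len(test_segments))
```

The point of the split is that the test segments never influence training, so accuracy on them reflects how well the learned patterns generalize.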
“After training, our system performs what is called ‘zero-shot’ classification: given a snippet of brain activity, it can determine from a large number of new audio clips which one the person actually heard,” King wrote in the Meta blog. “From there, the algorithm deduces the words the person likely heard.”
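The idea of picking, from many candidate clips, the one that best matches a snippet of brain activity can be sketched in a few lines of Python. This is only a toy: it assumes some trained model has already turned both the brain snippet and each audio clip into vectors of the same length, and it uses cosine similarity as the matching score. The vectors, clip names, and the choice of cosine similarity are all hypothetical and are not taken from the Meta system.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_candidates(brain_vector, audio_vectors):
    """Return candidate clip names ordered from best to worst match."""
    scores = {name: cosine_similarity(brain_vector, vec)
              for name, vec in audio_vectors.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical embeddings; real ones would come from trained models.
brain_vector = [0.9, 0.1, 0.2]
audio_vectors = {
    "clip_a": [0.1, 0.9, 0.0],
    "clip_b": [0.8, 0.2, 0.1],
    "clip_c": [0.0, 0.1, 0.9],
}
ranking = rank_candidates(brain_vector, audio_vectors)
print(ranking)  # ['clip_b', 'clip_c', 'clip_a']
```

Because the candidates are simply ranked by similarity, the clips do not need to have appeared during training, which is what makes this kind of matching "zero-shot."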
Specifically, the AI relied on its 793-word vocabulary to create ten-word lists of its best guesses, New Scientist reported, roughly decoding speech.
According to their preprint, using three seconds of MEG data the AI placed the correct word among its top ten guesses 72.5% of the time, and got it on the first guess 44% of the time; with EEG data, the correct word was in the top ten 19.1% of the time.
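For readers unfamiliar with "top ten" versus "first guess" scoring, the short Python sketch below shows how such figures are computed. The guess lists and true words are invented for illustration and use three guesses per item instead of ten; they have nothing to do with the study's data.

```python
def top_k_accuracy(ranked_guesses, true_words, k):
    """Fraction of items whose true word appears in the first k guesses."""
    hits = sum(1 for guesses, truth in zip(ranked_guesses, true_words)
               if truth in guesses[:k])
    return hits / len(true_words)

# Invented example: each inner list is ordered from best guess to worst.
ranked_guesses = [
    ["dog", "cat", "door"],
    ["house", "mouse", "horse"],
    ["sea", "tree", "see"],
    ["run", "sun", "fun"],
]
true_words = ["dog", "horse", "tree", "moon"]

top1 = top_k_accuracy(ranked_guesses, true_words, k=1)
top3 = top_k_accuracy(ranked_guesses, true_words, k=3)
print(top1, top3)  # 0.25 0.75
```

Top-k accuracy is always at least as high as top-1 accuracy, which is why the top-ten figure for MEG is well above the first-guess figure.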
What’s next: Thomas Knopfel, a professor at Imperial College London, told New Scientist that the system would need more refinement before it could be practically useful for decoding speech, and he is skeptical that EEG and MEG, being non-invasive, could ever provide the granular detail needed for accuracy.
“It’s about flows of information,” Knopfel told New Scientist. “It’s like trying to stream an HD movie over old-fashioned analog phone modems. Even under ideal conditions, with someone sitting in a dark room with headphones on, just listening, there are other things going on in the brain. In the real world, this becomes completely impossible.”
However, advancements in technology may change that: a new form of MEG called OPM is pushing the limits of what can be learned from the outside.
For his part, King told TIME that they currently decode speech only to the extent of telling what people heard in the scanner; the work is not intended for product design yet, only for basic research and proof of principle.