Inner Speech is the Future of Thought-to-Text BCI
Inner speech can be characterized as your speech production areas preparing to speak while your auditory cortex listens in, tagged by a corollary discharge signal that marks it as self-generated. When that signal fails, as in some individuals with schizophrenia, the voice sounds like someone else's. When it works, you get the experience of thinking in words, an experience that around 75% of people report regularly having.
I'm drawn to the subject for its BCI applications. Paralyzed participants in a recent study preferred inner speech over attempted speech for communication, not because it decoded better, but because it required less effort.
Background on Development and Cognition
Vygotsky proposed that inner speech emerges through internalization of social dialogue, with self-regulation, i.e. verbal mediation of your own behavior and cognition, as its primary function (Vygotsky, 1934). Contemporary research adds cognitive scaffolding (organizing complex thoughts), metacognition (reflecting on mental states), and social cognition (perspective-taking through internalized dialogue).
The link to working memory is now well-established. Baddeley and Hitch's phonological loop model (Baddeley & Hitch, 1974) has two components:
- a phonological store with 1–2 second decay, and
- an articulatory rehearsal process—essentially inner speech.
Three lines of evidence support the connection. Articulatory suppression (repeating irrelevant words aloud to block inner speech) impairs verbal memory. Longer words are harder to remember, suggesting time-based decay during rehearsal. And phonologically similar words ("cat," "mat," "bat") get confused more easily, indicating storage in sound-based form.
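The mechanics of the model are concrete enough to sketch. Below is a toy simulation (my own illustration, not a model from Baddeley & Hitch) of a phonological store whose traces decay in roughly two seconds unless the articulatory loop refreshes them; the word-length effect falls out because longer words slow the rehearsal cycle. All timings and the all-or-nothing decay rule are invented for illustration.

```python
# Toy phonological-loop sketch: an item survives recall only if the
# rehearsal loop can cycle back to refresh it before its ~2 s trace decays.

def recall_fraction(n_items: int, speak_time_s: float, decay_window_s: float = 2.0) -> float:
    """Fraction of a list still recoverable given one rehearsal cycle."""
    cycle = n_items * speak_time_s      # time to rehearse the whole list once
    if cycle <= decay_window_s:
        return 1.0                      # every trace refreshed before it decays
    return decay_window_s / cycle       # only items reached within the window survive

# Word-length effect: five short words fit in the window, five long ones don't.
print(recall_fraction(5, 0.3))  # short words (~0.3 s each) -> 1.0
print(recall_fraction(5, 0.7))  # long words (~0.7 s each)  -> ~0.57
```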

Inner speech and the phonological loop share neural substrates and developmental timing, but they're not identical. Inner speech has functions beyond rehearsal, and its phenomenology is richer than phonological coding alone.
Articulatory suppression increases switch costs, which are the time and error penalties when alternating between task rules (Emerson & Miyake, 2003; Miyake et al., 2004). In Go/NoGo tasks, blocking inner speech increases errors; people press when they shouldn't. The effect is strongest when quick responses are required and "Go" trials dominate. Inner speech provides a verbal brake.

For planning tasks like the Tower of London, evidence is mixed. Children show clear benefits; adults, less consistently. Planning is fundamentally visuospatial, so verbal strategies may be supplementary.
Self-regulation links to inner speech across development too. Children with better-developed inner speech perform better on inhibition and delayed gratification tasks. In sports psychology, self-talk enhances performance and motivation. Curiously, interrogative self-talk ("Will I?") produces better task engagement than declarative ("I will") (Senay et al., 2010; Dolcos & Albarracín, 2014).
Reading comprehension requires inner speech for difficult texts
Multiple lines of inquiry demonstrate that silent reading activates inner speech mechanisms to hold sentence structure in working memory, integrate new information with what came before, preserve prosody for interpretation, and simulate character voices.
A 2011 eye-tracking study of silent limerick reading found disruption when rhyme patterns mismatched readers' regional accents (Northern versus Southern English), indicating that inner speech reflects personal voice characteristics (Filik & Barber, 2011). Responses are slower for phonetically long words than for short words during silent reading, even when the words are orthographically matched, with effects stronger in slower readers.
Meanwhile, blocking inner speech impairs reading comprehension but not listening comprehension. The effect is stronger for less proficient readers, more difficult texts, and second-language readers. A neuroimaging study found that direct speech in fictional text activates voice-selective auditory cortex more than indirect speech (Yao et al., 2011), suggesting readers simulate characters' voices.
Interestingly, second-language reading shows progression from overt subvocalization and literal translation to fluent reading with minimal inner speech as proficiency develops (Kato, 2009). Skilled readers modulate inner speech based on text difficulty, reading purpose, and genre. While speed reading programs often claim to eliminate subvocalization, complete elimination is impossible and some subvocalization aids integration and deep processing of complex material.
Thinking ≠ Inner Speech
For all this talk of inner speech, it is not the only mode of thinking. Descriptive Experience Sampling (DES; Hurlburt et al., 2013), in which participants are randomly signaled throughout the day to report their current inner experience, reveals five frequent phenomena (categories can co-occur within a single moment, so the percentages sum to more than 100%):
- inner speech (~26%)
- visual imagery (~34%)
- unsymbolized thinking (~22%)
- feelings (~26%)
- sensory awareness (~22%)
Furthermore, individual variability is massive. Inner speech frequency ranged from 0% to 75% across individuals, with about 25% of participants reporting no inner speech at all.
Unsymbolized thinking refers to "the experience of an explicit, differentiated thought that does not include the experience of words, images, or any other symbols." These thoughts are directly present in consciousness without accompanying words or images, yet are fully articulate and meaningful. People can readily verbalize them afterward, but do not experience them verbally in the moment. This phenomenon challenges the assumption that all thinking must involve words or images.
Visual-spatial thinking represents another major mode. Research suggests approximately 30% of people strongly use visual/spatial thinking, 45% use both visual and verbal modes, and 25% think predominantly in words. Brain imaging studies show that visual thinkers have greater activation in primary visual cortex during spontaneous thinking, while verbal thinkers show more frontal language area activation. Functional distinctions emerge between the two modes, wherein visual thinking excels at spatial reasoning, pattern recognition, and holistic processing, while verbal thinking excels at sequential reasoning, analytical tasks, and explicit rule-following.
An interesting asymmetry exists. When people engage in verbal thinking, they often involuntarily generate visual imagery (Pexman et al., 2017), but visual thinking does not necessarily evoke inner speech. This suggests visual imagery may be more fundamental or automatic than verbal representation.
The emerging view is that inner speech capacity is universal across humans, but actual use and phenomenology are highly variable across individuals and contexts.
Development follows Vygotskian progression but with substantial individual variation
Inquiry into inner speech dates back centuries, but systematic study began in the early twentieth century. Behaviorist John Watson proposed that children learn to talk, then learn to talk quietly, then silently - just volume reduction (Watson, 1913). Vygotsky disagreed: in his 1934 work "Thought and Language," he proposed that inner speech emerges through internalization of social dialogue - and transforms in structure and function as it internalizes, not merely in volume (Alderson-Day & Fernyhough, 2015).
Vygotsky's developmental sequence has three stages (Vygotsky, 1978; Berk, 1992):
- social speech (age 2+) is used for external communication with others
- private speech (age 3+) - audible self-directed talk - helps children guide their own thinking and actions
- inner speech (age 7+) emerges when private speech becomes fully internal and transforms into silent thought used for planning, reasoning, and abstract thinking.
Vygotsky placed private speech at the center of his theory as the critical transitional process between speaking with others and thinking for oneself (Alderson-Day et al., 2020). He argued that language doesn't just express thought but actually forms it - thought comes into existence through words rather than merely being expressed by them (Vygotsky, 1987).
Research from 2020-2025 has largely confirmed Vygotsky's developmental sequence while revealing important nuances. Private speech, which is the audible self-directed talk children use during activities, typically emerges around ages 2-3 and peaks between ages 4-7, with incidence rates highest around age 5. During this peak period, children use private speech for self-regulation, planning, and task guidance, and its frequency increases when they face cognitive challenges.
Internalization transforms speech into thought
The internalization process begins around age 7, following a progression from audible speech to whispers to inaudible muttering to fully silent inner speech. Phonological similarity effects, a hallmark of verbal rehearsal in working memory, were traditionally thought to emerge around age 7. However, recent studies found evidence for these effects earlier (Jarrold & Citroen, 2013), suggesting age-related differences may reflect gradual improvements in recall capacity rather than a qualitative strategy shift. Some evidence indicates the phonological loop functions as a language-learning tool from as early as 18 months.
Task-relevant private speech correlates with better task performance, and private speech frequency increases with task difficulty. The relationship appears bidirectional: private speech aids performance, but effective use also requires sufficient attentional control. Studies using the Tower of London planning task found correlations between private speech use and phonological similarity effects, suggesting domain-general verbal mediation.
Adult inner speech shows massive individual variation
In adulthood, inner speech becomes increasingly sophisticated and may compensate for cognitive aging. Computational modeling predicted that as executive functions decline with age, inner speech contribution actually increases to offset the loss (Granato et al., 2022). Your internal voice picks up the slack for other cognitive systems.
However, massive individual differences exist. Descriptive Experience Sampling studies, where participants report their mental experience at random moments throughout the day, found inner speech occurred in only 20-26% of sampled moments on average. But the range across individuals was striking - from 0% to 75%. Some people think in words constantly; others almost never do. The recent recognition of anendophasia—the absence or minimal experience of inner speech—challenged long-held assumptions that everyone has an internal monologue (Nedergaard & Lupyan, 2024).
Even among those who do experience inner speech, it takes different forms. A survey found that 77% of adults report dialogic inner speech (back-and-forth conversations with yourself), 82.5% report evaluative or motivational self-talk ("you've got this" or "that was stupid"), 36.1% report condensed inner speech (fragmentary, abbreviated thoughts), and 25.8% sometimes hear other people's voices in their inner speech (McCarthy-Jones & Fernyhough, 2011).
Neural substrates reveal dual mechanisms and surprising motor involvement
The past decade has fundamentally revised our understanding of inner speech's neural architecture. While earlier work established the involvement of Broca's area (left inferior frontal gyrus, BA 44/45) and Wernicke's area (left superior temporal gyrus, BA 22), a comprehensive meta-analysis of inner speech studies identified a dual-mechanism framework (Pratts et al., 2023).

The first is a corollary discharge mechanism. When your brain plans any movement, it sends a copy of that command to sensory areas - a "heads up" about what to expect. This is why you can't tickle yourself; your brain predicts the sensation before it happens. For inner speech, motor planning regions activate as if preparing to speak, and that predictive copy creates the sensation of "hearing" your own voice internally. You're essentially eavesdropping on your brain's speech preparation.
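To make the corollary discharge idea concrete, here is a minimal sketch (my illustration, not a model from the cited work) of an efference copy run through a forward model: subtracting the prediction from the incoming signal leaves little surprise for self-generated input, which is exactly the self-tagging that reportedly fails in some hallucinations.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward_model(motor_command):
    """Corollary discharge: predict the sensory consequence of a command.
    Identity mapping here; real forward models are learned and imperfect."""
    return motor_command

command = rng.normal(size=8)                     # planned (inner) articulation
sensation = command + 0.05 * rng.normal(size=8)  # what the command would produce

# Self-generated: subtract the prediction -> small residual -> tagged "mine".
residual = sensation - forward_model(command)
print(np.linalg.norm(residual))   # small

# External speech: no efference copy to subtract -> large residual -> "not mine".
external = rng.normal(size=8)
print(np.linalg.norm(external))   # large
```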
The second is a perceptual simulation mechanism, which engages speech perception regions (temporal auditory areas) - your brain recreating the experience of hearing speech using the same regions that process actual sounds.
Critically, both mechanisms activate concurrently during inner speech, but the balance between them shifts with the kind of inner speech: deliberate inner speech (rehearsing what you'll say) leans on the motor/corollary discharge system, while spontaneous thoughts or imagining someone else's voice rely more on perceptual simulation.
Core language areas activate concurrently, not sequentially
Traditional models of language processing centered on Broca's area in the left frontal cortex (speech production and planning) and Wernicke's area in the left temporal cortex (speech comprehension). Early theories suggested a sequence - Broca's area plans the speech and Wernicke's area monitors it. But fMRI studies consistently show both regions activating together during inner speech. The arcuate fasciculus, a major white matter highway connecting these areas, enables rapid bidirectional communication. MEG studies have found hints of temporal ordering (Tian & Poeppel, 2010, 2013), but the relationship appears more like a conversation than a relay race.
Inner speech also recruits regions beyond this classic language network, including the supplementary motor area for sequencing and cognitive control, the anterior insula for articulatory planning, and the supramarginal gyrus for integrating sensory and motor information.
When inner speech becomes dialogic, as in an internal back-and-forth with yourself or imagined others, the brain recruits social cognition networks as well. These include the precuneus and posterior cingulate cortex (self-referential processing) and the right temporoparietal junction (theory of mind). Having a conversation in your head, it turns out, activates some of the same circuitry as having a conversation with another person (Alderson-Day et al., 2016).
Motor cortex represents inner speech more robustly than expected
A recent discovery disrupted the classical view of language regions once again.
Researchers implanted electrodes in ventral premotor cortex (area 6v) and area 55b - regions associated with movement planning, not language - in participants with paralysis, and achieved real-time decoding of imagined sentences directly from motor cortex (Kunz et al., 2025).

The accuracy was sufficient for real-time communication with large vocabularies. Notably, participants preferred inner speech over attempted speech because it required less effort - a practical advantage for real-world use.

A complementary 2024 study deepens the picture. Researchers recorded single neurons in the supramarginal gyrus, part of the inferior parietal lobule, and found that the same neurons fire whether someone is speaking aloud, hearing speech, or just thinking in words (Wandelt et al., 2024). Inner speech, it seems, shares neural machinery with both production and perception, which may explain why motor cortex carries such rich information about imagined language.
The default mode network debate remains unresolved
The role of the default mode network (DMN) in inner speech remains contentious. In 2023, neuroscientist Vinod Menon proposed that the DMN integrates memory, language, and semantic representations to create an "ongoing internal narrative" central to sense of self, with potential origins in childhood self-directed speech through Vygotskian internalization (Menon, 2023). This would position the DMN as the neural substrate for our continuous inner dialogue, broadcasting integrated representations to create subjective continuity.
However, neurolinguist David Kemmerer published a critical review (Kemmerer, 2025) identifying five major challenges to this hypothesis:
- Inner speech doesn't actually originate from self-directed overt speech
- Massive individual differences mean any narrative isn't ongoing for everyone
- Rodents and primates have DMN but lack language
- Inner speech often has condensed rather than narrative form
- Only a couple of neuroscientific studies support DMN engagement during inner speech
Further complicating Menon's model, a study analyzing 1,717 participants found that verbal-predominant thought profiles were associated with segregation of language networks rather than DMN integration (Cremona et al., 2025).
An alternative view suggests the DMN contributes to semantic processing by coordinating activity across cortical regions to create embodied situation models (Fernandino & Binder, 2024), which may support inner speech without requiring it. The DMN integrates information over slower timescales than other brain systems, with activation patterns persisting during narrative processing and transitioning at event boundaries.
The emerging consensus suggests DMN involvement may be specific to spontaneous rather than deliberate inner speech, relating more to mind-wandering and self-referential thought than to volitional verbal thinking. Clinically, the DMN drives inner voice activity during unfocused states. Meditation helps regulate this network, while emerging neuromodulation techniques using transcranial focused ultrasound can target the posterior cingulate cortex to modulate DMN connectivity and alter subjective self-experience (Lord et al., 2024).
Inner speech is viable for BCI
The dominant approach in speech BCIs is decoding attempted speech, using the motor commands your brain generates when you try to speak, even if no sound comes out. This works even for paralyzed patients who can't produce intelligible sounds; the motor cortex still fires as if preparing to speak.
The Kunz study clarified that attempted and inner speech evoke similar patterns of neural activity, but attempted speech produces stronger signals (Kunz et al., 2025). The two exist on a continuum differentiated by signal magnitude rather than occupying orthogonal neural subspaces.
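A quick numerical illustration of that distinction (made-up vectors, not data from the study): two activity patterns on a continuum point in the same direction and differ only in gain, so their cosine similarity is 1, whereas patterns in orthogonal subspaces have cosine similarity 0.

```python
import numpy as np

attempted = np.array([3.0, 1.0, 2.0, 0.5])   # population pattern, strong signal
inner = 0.4 * attempted                      # same pattern at lower magnitude
unrelated = np.array([1.0, -3.0, 0.0, 0.0])  # a pattern in an orthogonal subspace

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(attempted, inner))      # 1.0 -> continuum: direction shared, gain differs
print(cosine(attempted, unrelated))  # 0.0 -> orthogonal: nothing shared
```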
Current high-performance systems use intracortical arrays consisting of 256 electrodes implanted directly into cortex. The Kunz study placed electrodes in ventral premotor cortex (area 6v) and area 55b, not Broca's area. This paid off: motor planning regions outperformed classical language areas for decoding, consistent with the corollary discharge mechanism described earlier.
For attempted speech, a 2023 study achieved 62 words per minute, 3.4 times faster than previous records, with 90.9% accuracy on a 50-word vocabulary and 76.2% on 125,000 words (Willett et al., 2023). A 2024 follow-up reached 99.6% accuracy after 25 days and 97.5% in open conversation after 8 months (Card et al., 2024).
Inner speech decoding is slower and less accurate: 67-86% for 50 words, 46-74% for 125,000 words. But participants preferred it. Attempted speech requires continuous effort because you're always "trying" to talk. Inner speech feels like thinking. For all-day use, that matters.
Lastly, we must consider privacy. If a system can decode your inner speech - which you might now understand as the application of language to your thoughts - who controls access? The Kunz study implemented password-protected decoding: a mental passphrase that must be recognized before the system activates, with recognition accuracy exceeding 98%. Like "Hey Siri" for brain-computer interfaces, the system only listens and records when you want it to. General-purpose thought-to-text BCIs should be expected to ship with these safeguards.
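As a sketch of how such a gate might be wired (hypothetical structure and names; the paper's actual implementation may differ), the key property is that while the system is locked, the only computation touching neural data is the passphrase detector:

```python
from typing import Callable, Optional

class GatedDecoder:
    """Hypothetical 'mental passphrase' gate in front of a speech decoder."""

    def __init__(self, passphrase_prob: Callable[[list[float]], float],
                 decode: Callable[[list[float]], str],
                 threshold: float = 0.98):
        self.passphrase_prob = passphrase_prob  # classifier: window -> P(passphrase)
        self.decode = decode                    # the full inner-speech decoder
        self.threshold = threshold              # high bar: a false unlock leaks thought
        self.unlocked = False

    def process(self, window: list[float]) -> Optional[str]:
        if not self.unlocked:
            # Locked: run only the detector; decode nothing, store nothing.
            if self.passphrase_prob(window) >= self.threshold:
                self.unlocked = True
            return None
        return self.decode(window)              # unlocked: normal decoding
```

The design mirrors wake-word detection: a small always-on classifier gates a larger decoder that otherwise never runs on your neural data.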
References
- Vygotsky, L. (1934). Thought and Language.
- Baddeley, A., & Hitch, G. (1974). Working memory. Psychology of Learning and Motivation, 8, 47-89. doi:10.1016/S0079-7421(08)60452-1
- Emerson, M., & Miyake, A. (2003). The role of inner speech in task switching: A dual-task investigation. Journal of Memory and Language, 48, 148-168. doi:10.1016/S0749-596X(02)00511-9
- Miyake, A., Emerson, M., Padilla, F., & Ahn, J. (2004). Inner speech as a retrieval aid for task goals: The effects of cue type and articulatory suppression. Acta Psychologica, 115(2-3), 123-142. doi:10.1016/j.actpsy.2003.12.004
- Senay, I., Albarracín, D., & Noguchi, K. (2010). Motivating goal-directed behavior through introspective self-talk: The role of the interrogative form of simple future tense. Psychological Science, 21(4), 499-504. doi:10.1177/0956797610364751
- Dolcos, S., & Albarracín, D. (2014). The inner speech of behavioral regulation: Intentions and task performance strengthen when you talk to yourself as a You. European Journal of Social Psychology, 44, 636-642. doi:10.1002/ejsp.2048
- Filik, R., & Barber, E. (2011). Inner speech during silent reading reflects the reader's regional accent. PLoS ONE, 6(10), e25782. doi:10.1371/journal.pone.0025782
- Yao, B., Belin, P., & Scheepers, C. (2011). Silent reading of direct versus indirect speech activates voice-selective areas in the auditory cortex. Journal of Cognitive Neuroscience, 23(10), 3146-3152. doi:10.1162/jocn_a_00022
- Kato, S. (2009). Suppressing inner speech in ESL reading: Implications for developmental changes in second language reading fluency. Modern Language Journal.
- Hurlburt, R., Heavey, C., & Kelsey, J. (2013). Toward a phenomenology of inner speaking. Consciousness and Cognition, 22(4), 1477-1494. doi:10.1016/j.concog.2013.10.003
- Pexman, P., Dancy, C., & McGregor, K. (2017). Verbal thinking and visual imagery. Cognition.
- Watson, J. (1913). Psychology as the Behaviorist Views It. Psychological Review, 20(2), 158-177. doi:10.1037/h0074428
- Alderson-Day, B., & Fernyhough, C. (2015). Inner Speech: Development, Cognitive Functions, Phenomenology, and Neurobiology. Psychological Bulletin, 141(5), 931-965. doi:10.1037/bul0000021
- Vygotsky, L. (1978). Mind in Society: The Development of Higher Mental Processes.
- Berk, L. (1992). Children's Private Speech: An Overview of Theory and the Status of Research. Private Speech: From Social Interaction to Self-Regulation.
- Alderson-Day, B., Mitrenga, K., Wilkinson, S., McCarthy-Jones, S., & Fernyhough, C. (2020). The Emergence of Inner Speech and Its Measurement in Atypically Developing Children. Frontiers in Psychology, 11, 279. doi:10.3389/fpsyg.2020.00279
- Vygotsky, L. (1987). Thinking and Speech.
- Jarrold, C., & Citroen, R. (2013). Reevaluating key evidence for the development of rehearsal: Phonological similarity effects in children are subject to proportional scaling artifacts. Developmental Psychology, 49(5), 837-847. doi:10.1037/a0028771
- Granato, A., Cartoni, E., Da Rold, F., Mattera, A., & Baldassarre, G. (2022). A computational model of inner speech supporting flexible goal-directed behaviour in Autism. Scientific Reports, 12, 14198. doi:10.1038/s41598-022-18445-9
- Nedergaard, J., & Lupyan, G. (2024). Not everybody has an inner voice: Behavioral consequences of anendophasia. Psychological Science, 35(5), 506-520. doi:10.1177/09567976231225925
- McCarthy-Jones, S., & Fernyhough, C. (2011). The varieties of inner speech: Links between quality of inner speech and psychopathological variables in a sample of young adults. Consciousness and Cognition, 20(4), 1586-1593. doi:10.1016/j.concog.2011.08.005
- Pratts, J., Barsics, C., & Desmedt, O. (2023). Bridging phenomenology and neural mechanisms of inner speech: ALE meta-analysis on egocentricity and spontaneity in a dual-mechanistic framework. NeuroImage, 282, 120399. doi:10.1016/j.neuroimage.2023.120399
- Tian, X., & Poeppel, D. (2010). Mental imagery of speech and movement implicates the dynamics of internal forward models. Frontiers in Psychology.
- Tian, X., & Poeppel, D. (2013). The effect of imagination on stimulation: The functional specificity of efference copies in speech processing. Journal of Cognitive Neuroscience.
- Alderson-Day, B., Weis, S., McCarthy-Jones, S., Moseley, P., Smailes, D., & Fernyhough, C. (2016). The brain's conversation with itself: neural substrates of dialogic inner speech. Social Cognitive and Affective Neuroscience, 11(1), 110-120. doi:10.1093/scan/nsv094
- Kunz, E., Willett, F., Avansino, D., Hochberg, L., Henderson, J., & Shenoy, K. (2025). Decoding inner speech for brain-computer interfaces. Cell, 188, 1-15. doi:10.1016/j.cell.2025.06.015
- Wandelt, S., Bjånes, D., Pejsa, K., Lee, B., Liu, C., & Andersen, R. (2024). Representation of internal speech by single neurons in human supramarginal gyrus. Nature Human Behaviour, 8(6), 1136-1149. doi:10.1038/s41562-024-01867-y
- Menon, V. (2023). 20 years of the default mode network: A review and synthesis. Neuron, 111(16), 2469-2487. doi:10.1016/j.neuron.2023.04.023
- Kemmerer, D. (2025). Does the Default Mode Network Mediate an Ongoing Internal Narrative? An Evaluation of Menon's (2023) Hypothesis. Journal of Cognitive Neuroscience, 37(12), 2676-2683. doi:10.1162/jocn_a_02066
- Cremona, S., Gillig, A., Mellet, E., & Joliot, M. (2025). A dynamic framework of brain functional patterns shaped by spontaneous thoughts beyond the default mode network. Scientific Reports, 15, 28389. doi:10.1038/s41598-025-28389-0
- Fernandino, L., & Binder, J. (2024). How Does the "Default Mode" Network Contribute to Semantic Cognition?. Brain and Language, 252, 105405. doi:10.1016/j.bandl.2024.105405
- Lord, B., et al. (2024). Transcranial focused ultrasound to the posterior cingulate cortex modulates default mode network and subjective experience: An fMRI pilot study. Frontiers in Human Neuroscience, 18, 1392199. doi:10.3389/fnhum.2024.1392199
- Willett, F., et al. (2023). A high-performance speech neuroprosthesis. Nature, 620, 1031-1036. doi:10.1038/s41586-023-06377-x
- Card, N., et al. (2024). An accurate and rapidly calibrating speech neuroprosthesis. New England Journal of Medicine, 391(7), 609-618. doi:10.1056/NEJMoa2314132