
Psychology
Shared computational principles for language processing in humans and deep language models
A. Goldstein, Z. Zada, et al.
This study, led by researchers at Princeton University and Google Research, shows that the human brain and autoregressive deep language models (DLMs) converge on shared computational principles when processing natural narratives: both continuously predict upcoming words, register surprise at word onset, and represent words in context, sharpening our understanding of neural language processing.
~3 min • Beginner • English
Introduction
The study investigates whether core computational principles implemented by autoregressive deep language models (DLMs), trained via next-word prediction on natural text, align with mechanisms used by the human brain during natural language processing. Traditional psycholinguistic models emphasize symbolic, rule-based operations and parts of speech, whereas DLMs learn contextual representations (contextual embeddings) to predict subsequent words without explicit linguistic priors. The authors pose the hypothesis that the brain, like autoregressive DLMs, continually predicts upcoming words, evaluates prediction errors upon word onset (surprise), and represents word meaning contextually. They aim to provide behavioral and neural evidence for these shared principles during continuous narrative comprehension, addressing longstanding questions about spontaneous prediction in natural contexts and the neural basis of context-dependent representations.
Literature Review
Prior work has used language models and machine learning to model semantic representations in the brain but did not typically treat autoregressive DLMs as cognitively plausible models of how the brain codes language. Recent theoretical accounts suggest fundamental connections between DLMs and human language processing, including predictive coding frameworks. Empirically, increased neural activity around 400 ms after word onset for unpredictable words (N400-like effects) has been linked to surprisal. Studies using modern language models have quantified surprisal and confidence for upcoming words in natural language and related these to neural measures, showing correlations between post-onset activation and model-based surprise. Additional work indicates that contextual embeddings (for example, from GPT-2) better model neural responses than static embeddings, though prior studies did not directly demonstrate continuous pre-onset prediction during naturalistic listening. The present study builds on these findings by directly testing pre-onset prediction, mapping the coupling between pre-onset confidence (entropy) and post-onset surprise (cross-entropy), and showing how contextual embeddings capture both past context and predictive information reflected in neural signals.
Methodology
Stimuli and tasks: Participants listened to a 30-min spoken narrative from the podcast This American Life ('Monkey in the Middle'), which was manually transcribed and temporally aligned to audio using forced alignment with manual correction.
Behavioral experiment: 300 adult participants (six nonoverlapping groups of 50) on Mechanical Turk performed a sliding-window next-word prediction task over the transcript. A ten-word window was shown; participants typed the next word, saw feedback, and the window advanced by one word until each segment’s words were predicted. This produced 50 predictions per word (final dataset: 5,078 words; 33 omitted due to a technical error). Predictability scores (percent correct per word) were computed. Human predictions were compared with GPT-2’s next-word probabilities across varying context window sizes (2 to 1,024 tokens). Baseline comparisons used 2- to 5-gram models trained on the Brown corpus with various smoothing methods.
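A minimal sketch of the GPT-2 side of this comparison, using the Hugging Face transformers library (function and variable names are illustrative, and the paper's exact tokenization and context handling may differ; note that humans typed whole words while GPT-2 scores BPE tokens):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def next_word_probability(context_words, target_word, max_tokens=1024):
    """Probability GPT-2 assigns to target_word given the preceding words.
    Multi-token words are scored as the product of their token probabilities."""
    # Words mid-sentence carry a leading space in GPT-2's BPE vocabulary.
    target_ids = tokenizer.encode(" " + target_word)
    ids = tokenizer.encode(" ".join(context_words))[-(max_tokens - len(target_ids)):]
    prob = 1.0
    with torch.no_grad():
        for tid in target_ids:
            logits = model(torch.tensor([ids])).logits[0, -1]
            prob *= torch.softmax(logits, dim=-1)[tid].item()
            ids = ids + [tid]
    return prob

# e.g. probability of 'middle' given a short context window
p = next_word_probability("story about a monkey stuck in the".split(), "middle")
```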
Neural experiment (ECoG): Nine epilepsy patients (from an initial ten; one excluded for noise/seizure activity), implanted with intracranial electrodes for clinical purposes, freely listened to the same story without explicit prediction instructions. Coverage: 1,339 electrodes total; better coverage in the left hemisphere (1,106 LH; 233 RH). Data were acquired at 512 Hz (or resampled to 512 Hz), referenced to subdural strips, and localized to cortical surface via coregistered MRI/CT. Preprocessing included despiking, rereferencing (common average or ICA-based), estimating broadband high-gamma power (70–200 Hz excluding line noise), log-transform, z-scoring, and zero-phase temporal smoothing with a 50 ms Hamming window. Temporal leakage from preprocessing was quantified and bounded to at most ~93 ms.
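A sketch of the high-gamma pipeline under common assumptions; the filter design, notch handling, and smoothing details below are illustrative rather than the authors' exact preprocessing:

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert
from scipy.signal.windows import hamming

def high_gamma_power(ecog, fs=512):
    """Z-scored, log-transformed broadband high-gamma (70-200 Hz) power for
    one electrode, smoothed with a zero-phase 50-ms Hamming window.
    Line-noise harmonics within the band (120, 180 Hz) would be notched first."""
    b, a = butter(4, [70, 200], btype="bandpass", fs=fs)
    band = filtfilt(b, a, ecog)                    # zero-phase band-pass
    power = np.log(np.abs(hilbert(band)) ** 2)     # analytic amplitude -> log power
    power = (power - power.mean()) / power.std()   # z-score
    win = hamming(int(0.05 * fs))
    return np.convolve(power, win / win.sum(), mode="same")
```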
Representations and models: Static word embeddings (GloVe; also word2vec 100d) and arbitrary random embeddings (50d) were used for encoding analyses. Contextual embeddings were extracted from pretrained GPT-2 (final hidden layer for the second-to-last token in a 1,024-token sliding window), aligned to spoken-word tokens; dimensionality reduced to 50 where needed for comparisons. GPT-2 provided distributions over next words to compute pre-onset confidence (entropy) and post-onset surprise (cross-entropy of the actual word).
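A sketch of the contextual-embedding extraction described above (window handling and the dimensionality-reduction step are illustrative):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def contextual_embedding(window_text):
    """Final-layer hidden state at the second-to-last token of an
    up-to-1,024-token window (768d for gpt2-small)."""
    ids = tokenizer.encode(window_text)[-1024:]
    with torch.no_grad():
        out = model(torch.tensor([ids]), output_hidden_states=True)
    return out.hidden_states[-1][0, -2].numpy()

# Where 50d vectors are needed for comparison, reduce across all words, e.g.:
#   from sklearn.decomposition import PCA
#   embs_50 = PCA(n_components=50).fit_transform(np.stack(all_embeddings))
```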
Encoding analyses: Linear encoding models predicted neural responses (high-gamma) from embeddings at multiple lags relative to word onset (−2,000 to +2,000 ms in 25-ms steps). Inputs were 200-ms averaged windows; model performance was the correlation between predicted and actual held-out signals using 10-fold cross-validation. Significant electrodes were identified via a phase-randomization permutation test (5,000 permutations), controlling FDR at q<0.01. Group-level lag-wise significance and pairwise model comparisons used permutation tests with FDR correction. Control analyses projected out previous-word embeddings, removed repeated bigrams, and contrasted models using previous versus concatenated previous+current embeddings.
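A minimal sketch of one lag of the encoding analysis, using ordinary least squares and 10-fold cross-validation (the paper's exact windowing, regularization, and fold structure may differ):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

def encoding_correlation(embeddings, neural, onsets, lag_ms, fs=512):
    """Predict the 200-ms-averaged high-gamma response at one lag from word
    embeddings; return the mean held-out Pearson r over 10 folds.
    embeddings: (n_words, n_dims); neural: (n_samples,), one electrode;
    onsets: word-onset sample indices."""
    half = int(0.1 * fs)                        # +/-100 ms around the lag
    centers = onsets + int(lag_ms / 1000 * fs)
    y = np.array([neural[c - half:c + half].mean() for c in centers])
    rs = []
    for tr, te in KFold(n_splits=10).split(embeddings):
        pred = LinearRegression().fit(embeddings[tr], y[tr]).predict(embeddings[te])
        rs.append(np.corrcoef(pred, y[te])[0, 1])
    return float(np.mean(rs))

# Sweep lags = np.arange(-2000, 2001, 25) ms and repeat per electrode.
```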
Behavior–neural alignment: Words were split by prediction accuracy using GPT-2’s top-5 criterion (62% correctly predicted; 38% incorrectly predicted) and alternative top-1 criteria for both humans and GPT-2 (Extended Data). Encoding was computed separately for correctly predicted, incorrectly predicted, and actually perceived words to dissociate pre-onset prediction from post-onset perception.
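The top-5 criterion amounts to checking whether the actual next token falls among GPT-2's five most probable continuations; a sketch (multi-token words glossed over):

```python
import torch

def predicted_top5(model, context_ids, target_id):
    """True if the target token is among GPT-2's five most probable
    next tokens given the context."""
    with torch.no_grad():
        logits = model(torch.tensor([context_ids])).logits[0, -1]
    return target_id in torch.topk(logits, k=5).indices.tolist()
```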
Decoding analyses: A deep convolutional neural network mapped multielectrode and multitime-bin neural data (ten 62.5-ms bins over 625 ms per lag; 160 LH electrodes significant in GloVe encoding, selected within training folds) to arbitrary, static (GloVe), or contextual (GPT-2) embedding spaces. Training used five temporal folds (three train, one dev for early stopping, one test), minimizing MSE. For evaluation, predicted embeddings were compared via cosine distance to embeddings of all word labels; ROC-AUC quantified classification performance, weighted by word frequency. Words with at least five repetitions (69% of transcript) were included. An ensemble of 10 decoders with different initializations improved stability.
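The embedding-space evaluation step can be sketched as follows; the deep CNN decoder itself is omitted, and the frequency weighting shown is one plausible reading of the description above:

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.metrics.pairwise import cosine_similarity

def decoding_auc(pred_embs, true_labels, label_embs):
    """pred_embs: (n_words, d) decoder outputs; label_embs: (n_labels, d)
    embeddings of candidate words; true_labels: (n_words,) int array of the
    true word's label index. Returns frequency-weighted mean ROC-AUC."""
    sims = cosine_similarity(pred_embs, label_embs)   # higher = closer
    aucs, weights = [], []
    for j in range(label_embs.shape[0]):
        y = (true_labels == j).astype(int)
        if y.sum() == 0 or y.sum() == len(y):
            continue                                  # AUC needs both classes
        aucs.append(roc_auc_score(y, sims[:, j]))
        weights.append(y.sum())                       # weight by word frequency
    return float(np.average(aucs, weights=weights))
```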
Statistical measures: Bootstrapping and permutation tests assessed lag-wise significance, with FDR control (q=0.01). Confidence and surprise derived from GPT-2 were related to pre- and post-onset neural signals to test the coupling between prediction and error signals.
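Both GPT-2-derived quantities come straight from the model's next-word distribution; a minimal sketch (natural-log units assumed):

```python
import torch
import torch.nn.functional as F

def confidence_and_surprise(model, context_ids, target_id):
    """Pre-onset confidence = entropy of p(next word | context);
    post-onset surprise = cross-entropy, i.e. -log p(actual word)."""
    with torch.no_grad():
        logits = model(torch.tensor([context_ids])).logits[0, -1]
    logp = F.log_softmax(logits, dim=-1)
    p = logp.exp()
    entropy = -(p * logp).sum().item()    # low entropy = high confidence
    surprise = -logp[target_id].item()    # large when the word is unexpected
    return entropy, surprise
```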
Key Findings
Behavioral next-word prediction: Humans achieved a mean predictability score of 28% (s.e. 0.5%), far above the 6% baseline of always guessing the most frequent word ('the'). Roughly 600 words had predictability above 70%, spanning a range of parts of speech. Human predictability closely matched GPT-2's estimated predictability (r=0.79, P<0.001), and humans and GPT-2 shared the same top prediction 49.1% of the time (regardless of accuracy). Confidence tracked accuracy similarly in both: each was underconfident, exceeding 95% accuracy on words to which it assigned a probability above 40%. Increasing GPT-2's context window improved the correlation with human predictions from r=0.46 (2-word context) to an asymptote of r=0.79 at roughly 100 words of context.
Pre-onset prediction in neural data: Linear encoding using static GloVe embeddings significantly predicted neural responses to upcoming words up to ~800–1,000 ms before word onset, peaking ~150–200 ms after onset in significant LH electrodes (n=160). Arbitrary embeddings also yielded significant pre-onset encoding, indicating predictive information beyond static lexical statistics. Controls removing previous-word information, repeated bigrams, and comparing previous vs previous+current embeddings confirmed that pre-onset neural signals contained information about the identity of the next word above and beyond local context.
Prediction versus perception dissociation: Before onset, encoding tracked the content of predicted words (both correct and incorrect predictions), reflecting listeners’ expectations; after onset, encoding tracked the actually perceived word, not the (incorrect) prediction. This cleanly dissociated pre-onset predictive processes from post-onset comprehension.
Prediction–surprise coupling: Pre-onset activity increased for correctly predicted words, while post-onset activity increased for incorrectly predicted (surprising) words, with a robust effect ~400 ms after word onset. GPT-2-derived entropy (pre-onset confidence) and cross-entropy (post-onset surprise) provided a unified framework linking pre-onset prediction to post-onset error signals in neural activity.
Contextual representations: Contextual embeddings (GPT-2) significantly improved encoding over static embeddings across many electrodes, both before and after word onset. Averaging or shuffling contextual embeddings across occurrences of the same word reduced performance to static levels, indicating that unique, occurrence-specific context is crucial. Concatenating static embeddings for preceding words did not match GPT-2’s performance, suggesting superior compression of contextual information by contextual embeddings.
Decoding word identity: A contextual-embedding decoder classified word identity from neural signals substantially better than static or arbitrary decoders before and after onset. Average ROC-AUC reached ~0.74 at lag +150 ms (window −162.5 to +462.5 ms) for GPT-2 versus ~0.68 for GloVe/arbitrary. Predictive information about the next word’s identity was detectable up to ~1,000 ms before onset. Performance dropped to chance at lags beyond ~2 s.
Discussion
The findings directly address whether human language processing shares core computational principles with autoregressive DLMs. Behaviorally, humans display robust next-word prediction in naturalistic contexts that tracks GPT-2's predictions, particularly with longer context windows. Neurally, the brain exhibits continuous pre-onset prediction of upcoming words, as evidenced by significant encoding up to a second before word onset. The dissociation between pre-onset prediction (reflecting internally generated expectations) and post-onset processing (reflecting the perceived word) aligns with predictive coding theories. Moreover, neural activity after onset scales with model-based surprise around 400 ms, linking pre-onset confidence and post-onset error signals within a unified predictive framework.
Contextual embeddings offer a powerful model of brain representations that integrate multi-timescale context and predictive information, outperforming static embeddings in both encoding and decoding. These results suggest that the brain represents words in a context-specific manner and continuously leverages prior context to anticipate forthcoming input, paralleling DLM objectives. While the computations may be implemented differently (for example, human serial processing vs transformer parallelism), the shared principles provide a biologically plausible framework for investigating the neural basis of language and for bridging machine and human language processing.
Conclusion
The study provides convergent behavioral and intracranial neural evidence that humans and autoregressive deep language models share three core computational principles in natural language processing: continuous context-dependent next-word prediction before word onset, use of pre-onset predictions to compute post-onset surprise (prediction error), and reliance on contextual embeddings to represent words. Contextual embeddings better capture neural representations than static embeddings and enable decoding of word identity even before onset. These results support using autoregressive DLMs as a modeling framework for the neural basis of language. Future research should examine how predictive, context-rich representations interface with broader cognition (for example, generating new thoughts), explore biologically plausible implementations beyond transformers, assess additional learning objectives and timescales in the brain, and test generalization across stimuli, modalities, and populations.
Limitations
- Participant sample and coverage: Neural data come from nine epilepsy patients with clinical electrode placement and stronger left-hemisphere coverage, potentially limiting generalizability and lateralization inferences.
- Stimulus specificity: Findings are based on a single 30-minute narrative; generalization to other genres, languages, and modalities remains to be tested.
- Preprocessing temporal leakage: Although bounded (at most ~93 ms) and controlled for, preprocessing introduces limited temporal uncertainty in pre- versus post-onset effects.
- Decoding constraints: Word classification used only words with at least five repetitions and embedding-space constraints, which may bias performance estimates and exclude rare words.
- Model–brain implementation gap: While principles align, transformers are not biologically implemented; brain mechanisms may realize prediction and contextualization differently.
- Behavioral demographics: MTurk participant demographics (age, gender) were not collected, limiting characterization of behavioral variability.