Semantic encoding during language comprehension at single-cell resolution

M. Jamali, B. Grannan, et al.

Discover the fascinating world of how neurons represent linguistic meaning in our brains! This groundbreaking research, conducted by Mohsen Jamali, Benjamin Grannan, Jing Cai, Arjun R. Khanna, William Muñoz, Irene Caprara, Angelique C. Paulk, Sydney S. Cash, Evelina Fedorenko, and Ziv M. Williams, reveals how individual neurons selectively respond to word meanings, dynamically reflecting context as we comprehend language.

~3 min • Beginner • English
Introduction
The study addresses how individual human neurons represent linguistic meaning during natural language comprehension, a question that has remained unresolved despite extensive neuroimaging work on language networks. Initial processing of linguistic input is modality-specific (auditory or visual) before information is mapped to meaning within a left-lateralized, amodal language-selective network in frontal and temporal cortices. Prior work has debated whether semantic processing is widely distributed across cortex or concentrated within hub regions, and has established that speech processing and meaning access are strongly context dependent. Moreover, human semantic knowledge is structured, but how such structure is instantiated at the cellular level is unknown. Leveraging single-neuron recordings in awake neurosurgical participants, the authors investigate moment-by-moment neural dynamics underlying word and sentence comprehension at the cellular scale, focusing on whether neurons encode word meanings, how context modulates these representations, and how higher-order semantic relations are organized across neuronal populations.
Literature Review
The paper builds on neuroimaging evidence that natural speech elicits semantic maps across cortex and that a language-selective network maps word forms to meanings. It references debates on distributed versus hub-based semantic processing and work showing context effects on lexical ambiguity resolution. Distributional semantic models (e.g., Word2Vec, GloVe) capture human semantic judgments and fMRI responses, suggesting their utility for probing neural meaning representations. Prior human single-unit studies have elucidated phonetic encoding in temporal cortex and task-related neuronal activity but have not established semantic encoding during natural comprehension at single-cell resolution. The authors also draw on theories of structured semantic knowledge and hierarchical relationships, motivating tests of whether neuronal representations reflect such structure.
Methodology
Participants: 13 right-handed, native English-speaking adults undergoing planned intraoperative neurophysiology, awake and language-intact (10 with tungsten microelectrode arrays; 3 with Neuropixels probes). Ages: microarray cohort 33–79 years (8 male, 2 female); Neuropixels cohort 66–70 years (2 male, 1 female).

Recording sites and techniques: Single-neuron recordings targeted the left, language-dominant prefrontal cortex, centered on the posterior middle frontal gyrus and overlapping language-selective and other high-level networks. A total of 287 well-isolated units were recorded: 133 with custom-adapted tungsten microelectrode arrays and 154 with silicon Neuropixels probes, which enable higher-throughput recording.

Stimuli and tasks: Participants passively listened to semantically diverse, naturalistic sentences presented in random order (per-participant mean: 131 ± 13 sentences; 1,052 ± 106 word tokens; 459 ± 24 unique words). Controls included random word lists comprising the same words in scrambled order (reducing sentence context), pronounceable nonwords (e.g., “blicket”), and, in 3 participants, short naturalistic story excerpts thematically and stylistically distinct from the sentences.

Neural data processing: Action potentials were aligned to word onsets at millisecond resolution; analysis windows were typically 100–500 ms post-onset. Spike sorting and unit isolation followed established methods for both microarrays and Neuropixels, and multi-unit analyses served as a robustness check.

Semantic feature space and clustering: Each unique word was mapped to a 300-dimensional pretrained embedding (Word2Vec; GloVe for replication). Spherical clustering and silhouette analyses grouped words into nine semantic domains: actions, states, objects, food, animals, nature, people/family, names, and spatiotemporal relations. Purity and d′ metrics assessed cluster quality and separability.
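The embedding-and-clustering step above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: random vectors stand in for pretrained Word2Vec embeddings, and the specific library calls and parameters are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Stand-in for 300-D pretrained embeddings, one row per unique word
# (the study reports ~459 unique words per participant).
emb = rng.normal(size=(459, 300))

# "Spherical" clustering: L2-normalize rows so Euclidean k-means
# approximates clustering by cosine similarity.
unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)

k = 9  # nine semantic domains in the study
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(unit)

# Silhouette analysis scores how well separated the clusters are.
sil = silhouette_score(unit, km.labels_, metric="cosine")
```

With real embeddings, the silhouette score and cluster purity would guide the choice of k; here the random stand-in data simply exercises the mechanics.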
Selectivity quantification: A selectivity index (SI) quantified tuning to domains (SI = 1: responses confined to one domain; SI = 0: no selectivity). For each neuron, two-tailed rank-sum tests compared responses to one domain versus all others, with FDR correction across the nine domains. Additional analyses varied domain specificity by excluding words far from domain centroids and by random sub-selection of words.

Word vs nonword discrimination: In participants with this control, two-tailed t-tests assessed whether neurons distinguished real words from pronounceable nonwords.

Decoding analyses: Multi-class decoders were trained on the population responses of semantically selective neurons to predict a word’s semantic domain during sentence presentation. Robustness was tested across embedding models (Word2Vec/GloVe), temporal positions within sentences, random subsampling of neurons, and multi-unit activity. Cross-material generalization was assessed by training on sentence data and testing on narratives with partially overlapping vocabularies.

Context dependence tests: (1) A word-list control compared SIs between sentence and random-list presentations while monitoring overall firing rates as an attentional proxy. (2) A homophone analysis contrasted neural differences for phonetically identical words with different meanings (e.g., sun/son) against phonetically different words sharing a domain. (3) A surprisal analysis used an LSTM language model to compute each word’s surprisal from its preceding context and related decoding performance to surprisal.

Population semantic organization: Neural responses to all words were regressed onto the 300-D embeddings to derive per-neuron model weights, which were concatenated into a neuron-semantic transformation matrix. Principal component analysis (PCA) reduced dimensionality, and distances among word projections in neural PC space were compared with embedding-space distances. Similarity was also tested using WordNet synset similarity and raw firing rates.
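The selectivity quantification described above might be computed as follows. The firing rates are synthetic, and the SI formula shown is a common sparseness-style definition (0 for a uniform response, 1 for a response confined to one domain) that may differ in detail from the paper's exact formula.

```python
import numpy as np
from scipy.stats import ranksums

rng = np.random.default_rng(1)
n_domains = 9
# Hypothetical per-word firing rates for one neuron, grouped by domain;
# domain 0 is given an elevated rate to mimic a selective cell.
rates = [rng.poisson(8 if d == 0 else 3, size=40).astype(float)
         for d in range(n_domains)]

# Two-tailed rank-sum test: each domain versus all others.
pvals = np.array([
    ranksums(rates[d],
             np.concatenate([rates[j] for j in range(n_domains) if j != d])).pvalue
    for d in range(n_domains)
])

# Benjamini-Hochberg FDR correction across the nine tests.
order = np.argsort(pvals)
ranked = pvals[order] * n_domains / np.arange(1, n_domains + 1)
fdr = np.empty_like(pvals)
fdr[order] = np.minimum.accumulate(ranked[::-1])[::-1]

# Sparseness-style selectivity index from per-domain mean rates.
means = np.array([r.mean() for r in rates])
si = (n_domains - means.sum() / means.max()) / (n_domains - 1)
```

A neuron would be called selective for the domains whose FDR-adjusted p-values fall below the significance threshold.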
Hierarchical organization was evaluated by correlating population activity differences with cophenetic distances from agglomerative hierarchical clustering of embeddings. Visualization used t-SNE for manifold projection of word representations.
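The cophenetic-distance comparison above can be sketched as follows. The embeddings and neural responses are random stand-ins, and the linkage settings are assumptions rather than the paper's exact choices.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import pdist
from scipy.stats import pearsonr

rng = np.random.default_rng(2)
n_words = 60
emb = rng.normal(size=(n_words, 300))  # stand-in word embeddings
# Hypothetical population responses: a noisy linear readout of the embeddings.
neural = emb @ rng.normal(size=(300, 20)) + rng.normal(scale=5.0, size=(n_words, 20))

# Agglomerative clustering of the embeddings; the cophenetic distance between
# two words is the tree height at which they first merge.
Z = linkage(emb, method="average", metric="cosine")
coph = cophenet(Z)  # condensed pairwise cophenetic distances

# Correlate hierarchical (cophenetic) distances with neural-activity distances.
neural_d = pdist(neural, metric="euclidean")
r, p = pearsonr(coph, neural_d)
```

A reliably positive correlation would indicate that words merging higher in the semantic tree also evoke more dissimilar population activity, the signature of hierarchical encoding reported in the paper.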
Key Findings
- Single-cell semantic selectivity: 14% of microarray neurons (19/133) and 19% of Neuropixels neurons (29/154) responded selectively to specific semantic domains (FDR-corrected P < 0.05), totaling 48/287 neurons across 13 participants. Most selective neurons (84%, 16/19 microarray) preferred a single domain; 16% preferred two.
- Selectivity strength: Mean SI for selective neurons was 0.32 (95% CI 0.26–0.38; microarray) and 0.42 (95% CI 0.36–0.48; Neuropixels). SI increased with greater domain specificity (ANOVA F(3,62) = 8.66, P < 0.001) and was robust to random sub-selection of words (SI ≈ 0.33) and to intuitively curated domain members (SI ≈ 0.30).
- Domain distribution: ‘Actions’ elicited the largest number of selective responses; ‘spatiotemporal relations’ elicited the fewest.
- Word vs nonword: Many neurons distinguished real words from nonwords (27/48 selective neurons; microarray; P < 0.05), and this ability was not limited to selective neurons.
- Decoding of semantic domains: Multi-class decoders trained on semantically selective neurons during sentence listening achieved accuracy significantly above chance: approximately 31 ± 7% (chance ~11%) with Word2Vec embeddings and 25 ± 5% with GloVe (permutation tests, P < 0.05). Across all 13 participants, decoding averaged 36 ± 7% (P < 0.01). Similar results held for Neuropixels (≈29 ± 7%) and for multi-unit activity.
- Cross-material generalization: Models trained on sentence responses generalized to new narrative materials (3 participants, 9 selective neurons) with 28 ± 5% accuracy (P < 0.05), despite new vocabulary.
- Context dependence: SI dropped from 0.34 (CI 0.25–0.43) in sentences to 0.19 (CI 0.07–0.31) in random word lists (signed-rank P = 0.02; microarray), with no change in mean firing rate (P = 0.16). Neuropixels showed a similar drop (0.39 to 0.29; P = 0.035). Neuronal activity differences were larger for homophones (same sound, different meaning) than for non-homophones within the same domain (permutation P < 0.0001; n = 115 cells), indicating encoding independent of phonetic form. Decoding accuracy was higher for low-surprisal (more predictable) words than for high-surprisal words (26 ± 14% vs 10 ± 9%; z = 26, P < 0.0001), replicated with Neuropixels.
- Population semantic geometry: The first five neural PCs explained 46% of variance across all neurons and 81% for selective neurons. Distances among word projections in neural space correlated with embedding-space distances (r ≈ 0.04 over 258,121 pairs; P < 0.0001). Firing-rate differences correlated with embedding cosine distances (r = 0.17, P = 0.02, microarray; r = 0.21, P < 0.001, Neuropixels) and with WordNet synset similarity (r = −0.76; P = 0.001). Population-averaged activity differences correlated with hierarchical cophenetic distances among words (r = 0.38; P = 0.004), indicating encoding of hierarchical semantic relationships.
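The multi-class decoding result can be illustrated with a toy decoder on synthetic population responses. The neuron counts follow the paper, but the tuning structure, noise level, and classifier choice are invented for this sketch.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
n_words, n_neurons, n_domains = 450, 48, 9  # 48 selective neurons, 9 domains
domains = rng.integers(0, n_domains, size=n_words)

# Hypothetical population responses: each domain shifts the neurons by a
# fixed tuning vector, plus trial-to-trial noise.
tuning = rng.normal(size=(n_domains, n_neurons))
X = tuning[domains] + rng.normal(scale=1.5, size=(n_words, n_neurons))

# Multi-class decoder predicting a word's semantic domain; chance is ~11%.
clf = LogisticRegression(max_iter=1000)
acc = cross_val_score(clf, X, domains, cv=5).mean()
```

In the study, significance was established against permutation-shuffled labels rather than the nominal 1/9 chance level alone.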
Discussion
The findings demonstrate that individual neurons in human left prefrontal cortex encode word meanings during natural speech comprehension and that these signals collectively support accurate, real-time decoding of broad semantic categories. This directly addresses the gap left by prior neuroimaging studies by revealing single-cell selectivity for semantic domains, including the capacity to discriminate words from nonwords. Neural responses were not static lexical signatures but dynamically reflected context-dependent meaning: removing sentence structure reduced selectivity, homophones were differentiated by context despite identical phonetic forms, and predictability (lower surprisal) improved decodability. Thus, neuronal representations integrate sentence context to resolve ambiguity and refine meaning, consistent with theories emphasizing context in comprehension. At the population level, the organized geometry of neural responses mirrored semantic structure: distances in neural space tracked distributional semantic distances, and hierarchical relationships among words were reflected in neural activity differences. This suggests a mapping from semantic embedding spaces onto neural population codes that can support robust generalization across materials and potentially facilitate composition of meanings during ongoing speech processing. Collectively, the results imply that focal prefrontal populations can represent complex meanings at a coarse semantic level and maintain a structured, hierarchical semantic organization that could underpin efficient, context-sensitive language comprehension.
Conclusion
This study provides single-cell evidence in humans that neurons in left prefrontal cortex encode word meanings during natural language comprehension. Neurons show selective tuning to semantic domains, differentiate words from nonwords, and dynamically adapt to sentence context. Population activity predicts semantic categories in real time and reflects the hierarchical structure of semantic relationships among words. These insights establish a cellular-scale framework for semantic encoding in humans and a neural mapping from distributional semantic spaces to population responses. Future directions include testing modality independence (e.g., reading), generalization to non-linguistic stimuli (images, videos, non-speech sounds), cross-linguistic and bilingual generalization, comparisons between comprehension and production, exploration of other brain regions (e.g., temporal cortex), and elucidation of finer-grained semantic distinctions and compositional processes from words to phrases and sentences.
Limitations
- Spatial sampling: Recordings were restricted to a focal region of left prefrontal cortex; findings may not generalize to other language/semantic regions (e.g., temporal cortex) or broader networks.
- Participant population: Data were collected intraoperatively from clinical participants, which may introduce selection biases and state-related factors.
- Sample size and selectivity proportion: Only a subset of recorded neurons (48/287) were semantically selective, and the semantic granularity assessed was relatively coarse (nine domains).
- Modality and task constraints: Experiments focused on auditory comprehension with passive listening; generalization to reading, production, and active tasks remains untested.
- Embedding dependence: Analyses rely on distributional embeddings to define semantic space; although cross-validated with WordNet and alternate embeddings, model-based definitions could bias interpretations.
- Temporal resolution of context effects: While context dependence was shown, the detailed dynamics of composition into phrase- and sentence-level meanings were not fully resolved.