Psychology
The neural and cognitive basis of expository text comprehension
T. A. Keller, R. A. Mason, et al.
Learning technical information from text is a cognitive skill crucial for education, employment, and everyday life. This study asks two questions about expository technical text comprehension: (1) Which neurocognitive processes distinguish individuals who are better versus poorer comprehenders of technical texts? (2) Which conceptual and structural properties of technical texts are associated with better or poorer comprehension? Using fMRI during reading of realistic technical passages followed by a comprehension test, the authors aim to identify brain regions and processes related to individual differences in comprehension and to passage-level comprehensibility. The work addresses the need for naturalistic neurocognitive approaches that capture the complexity of real-world text comprehension, and it complements the extensive behavioral literature by linking brain activity to key comprehension processes such as mental model construction and knowledge integration.
Prior behavioral research indicates that expository technical texts differ from narratives in vocabulary familiarity, reliance on prior knowledge, and linguistic complexity, and often require fewer inferences. Comprehension involves multiple component processes from lexical decoding and semantic retrieval to discourse-level coherence building and situation model construction. Individual differences arise from variability in these processes and in domain knowledge, with interactions among prior knowledge, text structure, and processing proficiency. Neuroimaging studies associate language and semantic networks (left-lateralized parietal, temporal, inferior frontal areas; ventral temporal; anterior temporal lobes; hippocampus; ventromedial/ventrolateral PFC) with lexical-semantic processing and conceptual integration, and implicate lateral prefrontal regions in integrating semantic knowledge and handling inconsistencies. Episodic memory involves medial temporal structures and medial prefrontal regions during retrieval. Large-scale meta-analytic resources (e.g., Neurosynth) can support reverse inference about processes from observed activation patterns. Despite the relevance to STEM education and job performance, relatively few neuroimaging studies have targeted expository text, motivating the present investigation that bridges behavioral and neuroscience literatures.
Participants: 31 right-handed, native English speakers (25 females, 6 males; ages 18–35) from Pittsburgh provided usable fMRI data. They were selected from a pool of 265 based on extremes of technical reading comprehension in an online pretest (High: 88–100% correct; Low: 41–72%). Exclusions: 11 (8 excessive motion, 2 fell asleep, 1 anatomical abnormality). All gave informed consent (Carnegie Mellon IRB).
Materials: 24 five-sentence passages (mean ~132 words) describing mechanical devices or general knowledge topics; 16 used in fMRI (Bilge Pump, LiDAR, Refrigeration System, Automatic External Defibrillator, Screw Propeller, Sonar, 3D Printer, Aircraft Carrier Catapult, Bacteria, Acoustics and Cochlear Implants, Fever, Tumors Oncology Cancer, Photography, Intellectual Property, Beverages, Mechanical Engineering of Robots). Eight additional passages were used for pretesting/familiarization.
Procedures: Pre-scan, participants rated topic familiarity (1–7) and completed a handedness questionnaire. In the scanner, after a warmup passage, the 16 test passages were presented three times in different random orders. Each trial: title (1.5 s), fixation (0.5 s), then moving-window phrase-by-phrase presentation (1–4 words/phrase; respecting syntactic boundaries) with durations calibrated by word length/frequency: 300 ms + 16 ms/character + (400 ms – (31.26 × log(word frequency of least frequent word))), plus an intercept. Pauses: 4 s fixation after each of the first four sentences; an 'X' for 6.5 s after the final sentence. During the first two presentations only, stems of two of the eventual four multiple-choice questions were shown post-passage (5–10 s), followed by fixation (3.5 s). Post-scan comprehension test included four 4-option MCQs per passage (two previously cued stems; two novel). Psychometric tests post-scan: Nelson-Denny Reading Comprehension, Reading Span, Raven’s Standard Progressive Matrices, Bennett Mechanical Comprehension (abbrev.).
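The phrase-duration rule in the Procedures can be written out as a small function. This is a minimal sketch, not the authors' presentation code: the function name, the natural-log default, and the `intercept_ms` parameter (defaulting to 0) are assumptions, since the summary specifies neither the log base, the frequency units, nor the intercept value.

```python
import math


def phrase_duration_ms(n_chars, min_word_freq, intercept_ms=0.0, log=math.log):
    """Display duration for one phrase, following the stated rule.

    n_chars       -- number of characters in the phrase
    min_word_freq -- frequency of the least frequent word in the phrase
                     (units not given in the summary; must be > 0)
    intercept_ms  -- the unspecified intercept (assumed 0 here)
    log           -- logarithm to use (natural log is an assumption)
    """
    base_ms = 300.0
    length_ms = 16.0 * n_chars
    # Rarer words (lower frequency) receive more time.
    frequency_ms = 400.0 - 31.26 * log(min_word_freq)
    return intercept_ms + base_ms + length_ms + frequency_ms


# A 10-character phrase whose rarest word has frequency 1 (log = 0 in any base)
print(phrase_duration_ms(n_chars=10, min_word_freq=1.0))  # 860.0
```

Under this rule, longer phrases and phrases containing rarer words stay on screen longer, which is the intent of calibrating by word length and frequency.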
A recall task was administered to 29 participants but not analyzed here.
fMRI acquisition: Siemens Prisma 3T, multiband slice-accelerated BOLD spin-echo EPI sequence; 40 3-mm slices (no gap), TR = 1000 ms, multiband factor = 2, TE = 25 ms, flip angle = 64°, AC-PC orientation covering cortex, FOV 192 × 192 mm, matrix 64 × 64. Motion correction and MNI normalization with SPM.
Data analysis: Mean percentage signal change (MPSC) relative to fixation computed at each gray-matter voxel. Activation per sentence was averaged over a 4-s window capturing the BOLD peak (images 5–8, offset 5 s from stimulus onset) and then averaged across the five sentences.
Participant-level ROI identification: voxelwise correlations between MPSC (averaged over all passages) and comprehension scores; one-sample t-tests (height p < 0.05; cluster extent ≥10 voxels) identified positive/negative correlation clusters; 12 ROIs selected. Stepwise regression (entry/retention p < 0.15; minimum Schwarz Bayesian Criterion for stopping) related ROI activation (independent variables) to mean comprehension (dependent variable), with multicollinearity assessed (variance inflation, tolerance).
Cross-validation: three models trained on each pair of presentations; ROIs selected independently in training folds; tested on left-out presentation.
Passage-level ROI identification: voxelwise correlations between passage mean comprehension (across participants) and MPSC per voxel (p < 0.05, cluster extent ≥10 voxels) yielded 13 ROIs. Stepwise regression and three-fold cross-validation (by presentation pairs) predicted passage comprehensibility from activation in selected ROIs.
Activation changes across repetitions: whole-brain paired t-tests compared overall MPSC between presentation pairs to quantify changes over repeats (threshold p < 0.05; extent ≥10 voxels).
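The MPSC computation described in the data analysis can be illustrated for a single voxel. This is a hedged sketch under stated assumptions, not the authors' pipeline: the function names are hypothetical, the fixation baseline is taken as a precomputed mean, and "images 5–8" is read as counting from 1 at the sentence-onset image (with TR = 1000 ms, one image per second).

```python
from statistics import mean


def percent_signal_change(value, baseline):
    """Signal change relative to the fixation baseline, in percent."""
    return 100.0 * (value - baseline) / baseline


def sentence_mpsc(timeseries, onset_idx, baseline, first_image=5, last_image=8):
    """Mean percent signal change over the peak window for one sentence.

    timeseries -- one voxel's signal, one sample per image
    onset_idx  -- 0-based index of the image at sentence onset
    Images are counted from 1 at onset (an assumption), so the default
    window covers the four images at onset_idx + 4 .. onset_idx + 7.
    """
    start = onset_idx + first_image - 1
    stop = onset_idx + last_image  # slice end is exclusive
    window = timeseries[start:stop]
    if len(window) != last_image - first_image + 1:
        raise ValueError("time series too short for the requested window")
    return mean([percent_signal_change(v, baseline) for v in window])


def passage_mpsc(timeseries, sentence_onsets, baseline):
    """Average the per-sentence values across a passage's sentences."""
    return mean([sentence_mpsc(timeseries, o, baseline) for o in sentence_onsets])


# Toy voxel: baseline of 100 with a 2% rise in the peak window after onset 0
toy = [100.0] * 12
toy[4:8] = [102.0] * 4
print(sentence_mpsc(toy, onset_idx=0, baseline=100.0))  # 2.0
```

The window offset is kept as a parameter because the exact image indexing is not fully specified in the summary.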
Individual differences in comprehension:
- Better comprehension was associated with higher activation during reading in left inferior frontal gyrus (verbal working memory, phonological/semantic/syntactic processes), left superior parietal lobule (spatial processing/imagery), bilateral dorsolateral prefrontal cortex (semantic integration/structure-building), and bilateral hippocampus (encoding/consolidation of new declarative knowledge). Poorer comprehension was associated with greater activation in ventromedial prefrontal cortex and precuneus (episodic/autobiographical retrieval) and right inferior parietal lobule (coherence-related processing).
- Stepwise regression using four ROIs (L VMPFC [negative], L IFG, R DLPFC, R HC) predicted individual participants’ comprehension: F(4,26) = 20.43, MSE = 0.007, p < 0.00001, R² = 0.76 (adjusted R² = 0.72). Standardized weights showed L VMPFC β = −0.49; other selected regions had positive βs. Multicollinearity was low (variance inflation < 1.5; minimum tolerance 0.74).
- Cross-validated prediction of participant comprehension from activation in presentation-specific ROIs showed reliable generalizability: R² = 0.47 (F(4,26) = 5.67, MSE = 0.017, p = 0.00203), R² = 0.49 (F(4,26) = 6.18, MSE = 0.016, p = 0.00123), R² = 0.49 (F(4,26) = 6.23, MSE = 0.016, p = 0.00114); average cross-validated model fit displayed adjusted R² ≈ 0.46.
Passage-level comprehensibility:
- Activation in 13 ROIs correlated with passage comprehensibility. A four-ROI model (L IFG pars opercularis, R temporal pole, L inferior parietal lobule, medial anterior cingulate/dorsomedial PFC) predicted passage difficulty: R² = 0.88 (adjusted R² = 0.84), F(4,11) = 20.02, MSE = 0.003, p = 0.00005. Influential predictors in the full model included M ACC/DMPFC (β = −0.99) and L IFG (β = 0.74); no collinearity issues (variance inflation < 2.0; tolerances > 0.5).
- Cross-validation (leave-one-presentation-out) for passage-level prediction yielded R² = 0.79 (F(4,11) = 10.37, MSE = 0.005, p = 0.00099), R² = 0.73 (F(4,11) = 7.60, MSE = 0.006, p = 0.00344), R² = 0.76 (F(4,11) = 8.57, MSE = 0.007, p = 0.00215); mean predicted comprehensibility across folds correlated with observed comprehension, R² = 0.44 (F(1,14) = 11.08, MSE = 0.01, p = 0.004970).
Psychometrics vs brain measures:
- Psychometric predictors of individual comprehension were modest: Nelson-Denny R² = 0.40 (F(1,29) = 19.37, MSE = 0.017, p = 0.000133); Raven R² = 0.30 (F(1,29) = 12.18, MSE = 0.022, p = 0.001565); Reading Span R² = 0.22 (F(1,29) = 8.39, MSE = 0.022, p = 0.00711); Bennett Mechanical R² = 0.18 (F(1,29) = 6.50, MSE = 0.023, p = 0.016336). Combined model R² = 0.36 (F(3,27) = 4.96, MSE = 0.019, p = 0.00188). Brain activation measures in four regions predicted comprehension substantially better (R² ≈ 0.76). Adding psychometrics to brain measures did not significantly improve prediction (ΔR² = 0.048, F(3,24) = 2.09, p = 0.128), whereas adding brain measures to psychometrics did (ΔR² = 0.47, F(4,24) = 14.99, MSE = 0.007, p < 0.00001).
Text features and familiarity:
- Readability (Coh-Metrix) predictors were weak: only Syntactic Simplicity approached significance for passage difficulty (R² = 0.24, F(1,14) = 4.45, p = 0.053386). Deep Cohesion predicted activation in L IFG (R² = 0.45, F(1,14) = 11.51, p = 0.00438) and modestly in L IPL (R² = 0.21, F(1,14) = 3.67, p = 0.076051), suggesting cohesion-related activation may mediate comprehension.
- Topic familiarity ratings were low (mean 2.83/7, SD = 0.88) and modestly related to comprehension (mean Fisher’s z-transformed r = 0.24, SD = 0.28; t(30) = 4.65, p < 0.000005).
Repetition effects:
- Activation increased from first to second reading in bilateral occipito-temporal cortex, left superior temporal sulcus/anterior middle temporal gyrus, and right temporal pole, then decreased from second to third reading. Right inferior frontal gyrus and medial superior frontal gyrus increased by the third reading, indicating an inverted U-shaped temporal-lobe activation pattern and a shift from semantic to executive processing with repetition.
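The fit statistics reported in the results above are linked by standard regression identities, which makes it possible to check them against one another. The sketch below is an illustrative consistency check using textbook formulas, not the authors' analysis code; the function names are hypothetical, and small discrepancies with the reported values are expected because the published R² values are rounded.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R² for a model with k predictors fit to n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def f_from_r2(r2, n, k):
    """Overall F statistic, with (k, n - k - 1) degrees of freedom."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Participant-level model: 31 participants, 4 ROIs, reported R² = 0.76
print(round(adjusted_r2(0.76, n=31, k=4), 2))  # 0.72
print(round(f_from_r2(0.76, n=31, k=4), 1))    # 20.6

# Passage-level model: 16 passages, 4 ROIs, reported R² = 0.88
print(round(adjusted_r2(0.88, n=16, k=4), 2))  # 0.84
print(round(f_from_r2(0.88, n=16, k=4), 1))    # 20.2
```

The recomputed adjusted R² values match the reported 0.72 and 0.84, and the recomputed F values (about 20.6 and 20.2) are close to the reported F(4,26) = 20.43 and F(4,11) = 20.02.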
The study addresses how neurocognitive processes underpin successful comprehension of expository technical texts. Better comprehenders engaged verbal working memory (L IFG), spatial visualization (L SPL/IPS), semantic integration and executive control (bilateral DLPFC), and episodic encoding (bilateral hippocampus) to construct coherent situation models that integrate new technical information with prior semantic knowledge. Poorer comprehenders relied more on episodic/autobiographical retrieval (VMPFC, precuneus) and word-level processing, which was less effective for mastering technical content. Passage-level comprehensibility depended on processes that access semantic content and integrate it across distributed cortical representations (anterior temporal and ventrolateral prefrontal regions), and on coherence-building processes (right posterior temporal areas) for better-understood texts. The findings reconcile and extend behavioral evidence by providing neural markers of the key processes, offering actionable implications for instruction (e.g., pre-teaching concept meanings, training spatial visualization, and fostering inference generation/self-explanation) and for text design (enhancing cohesion to support semantic integration).
This work advances understanding of individual differences and text-level factors in expository technical comprehension by linking them to specific neural systems for verbal working memory, spatial mental model construction, semantic integration, and episodic encoding/retrieval. Brain activation during reading robustly predicts both who will comprehend better and which passages will be easier or harder to understand, outperforming traditional psychometric and readability measures. Practically, teaching strategies that strengthen maintenance of verbal information, spatial visualization, and semantic integration, alongside engineering texts to explicitly interrelate concepts and improve cohesion, may enhance technical learning from text. Future research could manipulate text cohesion and conceptual structure more systematically, test training interventions targeting identified neural-cognitive processes, broaden participant populations and domains, and integrate multimodal measures (e.g., eye tracking, EEG) to refine predictive models and causal understanding.
The study infers relationships from fMRI correlational data; causal links between activation and comprehension cannot be established. Participants were a selective sample (right-handed, native English speakers; extremes of pretested comprehension) from a single geographic area, limiting generalizability. Comprehension assessment per passage had limited granularity (four 4-option items), potentially constraining correlation estimates. Text materials were not designed to vary widely in cohesion, which may have reduced the sensitivity of readability analyses; Deep Cohesion effects were observed only indirectly via activation. Motion and alertness issues led to exclusions. Cross-validation was conducted across repeated presentations of the same materials, which supports generalizability across sessions but not across wholly new texts or populations.

