Psychology
The neural and cognitive basis of expository text comprehension
T. A. Keller, R. A. Mason, et al.
The study investigates how individuals comprehend expository technical texts, focusing on two questions: (1) which neurocognitive processes distinguish better and poorer comprehenders of technical information from text, and (2) which conceptual and structural aspects of technical texts are associated with better or poorer comprehension. Understanding these processes is crucial given the centrality of reading technical materials (manuals, Wikipedia, handbooks) in education, employment, and daily life. Prior behavioral work has identified many component processes (lexical access, syntactic parsing, working memory, coherence/inference, mental model construction) and highlighted the role of prior knowledge. However, few neuroimaging studies have examined expository text comprehension in ecologically valid settings. This study uses fMRI while participants read realistic technical passages and then complete comprehension tests, relating brain activation during reading to both individual comprehension differences and passage-level comprehensibility.
Behavioral research shows that expository texts differ from narratives in vocabulary familiarity, connection to everyday experience, and syntactic complexity, often requiring fewer inferences but relying more on prior knowledge. Comprehension involves multi-level processes: word decoding and meaning retrieval, syntax, propositional structure, cohesion and coherence building, working memory maintenance, discourse-level integration, retrieval of semantic and episodic knowledge, and construction of situation/mental models integrating new and prior knowledge. Neuroimaging has mapped language and semantic networks (left-lateralized perisylvian regions; ventral temporal; anterior temporal lobes; hippocampus; ventromedial and ventrolateral PFC) and integration/control networks (bilateral lateral PFC). Semantic memory is distributed across neocortex; controlled retrieval/integration engages lateral prefrontal cortex; episodic storage involves medial temporal/hippocampus; episodic retrieval engages medial prefrontal regions and precuneus. Few studies have targeted expository comprehension specifically, creating a gap between behavioral and neural evidence. Large-scale meta-analytic resources (e.g., Neurosynth) aid in linking activation to putative cognitive processes, though causal inference from fMRI remains limited. Repetition of discourse can change processing and brain activation patterns, suggesting dynamic adjustments over readings.
Participants: 31 right-handed native English speakers (18–35 years; 25 female) from the Pittsburgh area with usable fMRI data, recruited from 265 individuals screened online to select high and low technical reading comprehenders. Exclusions: excessive head motion (n=8), falling asleep (n=2), anatomical abnormality (n=1). Informed consent was obtained per CMU IRB.
Materials: 24 five-sentence expository passages (mean length 132 words) describing mechanical devices or general knowledge topics. Sixteen were used in fMRI: Bilge Pump, LiDAR, Refrigeration System, Automatic External Defibrillator, Screw Propeller, Sonar, 3D Printer, Aircraft Carrier Catapult, Bacteria, Acoustics and Cochlear Implants, Fever, Tumors Oncology Cancer, Photography, Intellectual Property, Beverages, Mechanical Engineering of Robots. The eight additional passages were used for screening/familiarization.
Pretest screening: In a 1-hour Zoom session, participants read the 8 screening passages twice (video format) and completed comprehension questions; those with high (88–100% correct) or low (41–72%) comprehension were invited. They also completed the Nelson-Denny Reading Comprehension Test.
Experimental session: Before scanning, participants rated topic familiarity (1–7 scale) and completed a handedness questionnaire. In the scanner, after a warmup passage, the 16 test passages were presented three times in different random orders. Each trial: title 1.5 s, fixation 0.5 s, then moving-window phrase-by-phrase presentation (1–4 words per phrase, respecting syntax; phrases >4 words split into two cumulative segments). Presentation duration per phrase: 300 ms + 16 ms/character + (400 ms − 31.26 × log(frequency of the least frequent word)). Inter-sentence pauses of 4 s with fixation followed each of the first four sentences; after the final sentence, an 'X' was shown for 6.5 s. During the first two presentations, stems of two of the eventual four comprehension questions were shown (5–10 s), followed by fixation for 3.5 s.
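The phrase-duration rule above can be written out as a small function. This is only a sketch: the source does not state the logarithm base, the units of the word-frequency count, or whether spaces count as characters, so the natural log, a raw positive frequency value, and a character count that includes spaces are all assumptions here. The function name and the example phrase are hypothetical.

```python
import math


def phrase_duration_ms(phrase: str, min_word_freq: float) -> float:
    """Presentation time (ms) for one phrase, per the formula in the text.

    Assumptions not confirmed by the source: natural logarithm, character
    count includes spaces, and min_word_freq is a positive frequency value
    for the least frequent word in the phrase.
    """
    if min_word_freq <= 0:
        raise ValueError("min_word_freq must be positive")
    base_ms = 300.0                      # fixed component
    length_ms = 16.0 * len(phrase)       # 16 ms per character
    # Rarer words (lower frequency) lengthen the display time.
    frequency_ms = 400.0 - 31.26 * math.log(min_word_freq)
    return base_ms + length_ms + frequency_ms


# Hypothetical phrase and frequency values, for illustration only.
rare = phrase_duration_ms("the bilge pump", 1.0)
common = phrase_duration_ms("the bilge pump", 1000.0)
print(rare, common)
```

Under these assumptions, a phrase containing a rarer word stays on screen longer than the same-length phrase made of common words, which is the behavior the formula is designed to produce.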
Post-scan assessment: Comprehension was assessed with four 4-alternative multiple-choice questions per passage (two stems previously seen, two new). Post-scan psychometric tests: Reading Span Test, Raven’s Standard Progressive Matrices, Bennett Mechanical Comprehension Test (abbreviated). Twenty-nine participants also performed an in-scanner recall task (not analyzed here).
MRI acquisition: Siemens Prisma 3.0 T; multiband slice-accelerated spin-echo EPI; 40 axial slices, 3 mm thick, no gap; TR=1000 ms; TE=25 ms; multiband factor=2; flip angle=64°; FOV=192×192 mm; matrix=64×64; AC–PC orientation, whole cortex coverage. SPM was used for motion correction and normalization to the MNI template.
Data preprocessing and activation measure: Mean percent signal change (MPSC) relative to fixation was computed at each gray-matter voxel. For each sentence, activation was measured in a 4-s window offset 5 s after sentence onset (images 5–8 at 1 Hz sampling), then averaged across the five sentences per passage.
Analytic strategy:
- Participant-level ROIs: Voxelwise correlations between each participant’s MPSC (averaged over all passages) and their mean comprehension were computed; one-sample t-tests identified clusters with positive/negative correlations (height p<0.05; extent ≥10 voxels). Twelve ROIs selected.
- Participant-level prediction: Stepwise regression (entry/retention p<0.15; stopping via minimum Schwarz Bayesian Criterion) with participants’ activation in the 12 ROIs (averaged over three presentations) predicting participants’ mean comprehension (averaged over 16 passages). Multicollinearity assessed via variance inflation and tolerance. Cross-validation: three models trained on each pair of presentations (ROIs selected within training folds) and tested on the left-out presentation; predictions averaged across folds.
- Passage-level ROIs: Voxelwise correlations between passages’ mean comprehension (across participants) and MPSC; one-sample t-test threshold p<0.05; extent ≥10 voxels; 13 ROIs selected.
- Passage-level prediction: Stepwise regression with activation in the 13 ROIs predicting mean passage comprehensibility; cross-validated across presentation folds analogously to participant-level.
- Readability and familiarity analyses: Coh-Metrix measures (Narrativity, Syntactic Simplicity, Word Concreteness, Referential Cohesion, Deep Cohesion) used to predict passage comprehension and to predict activation in key ROIs; topic familiarity ratings related to comprehension.
- Repetition effects: Whole-brain paired t-tests of overall MPSC compared across presentation repetitions to assess changes over readings (p<0.05; cluster ≥10).
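The leave-one-presentation-out scheme described in the prediction bullets can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it fits ordinary least squares on a fixed set of ROIs and omits the stepwise ROI selection within training folds. The array layout, function names, and synthetic data are all assumptions made for the example.

```python
import numpy as np


def leave_one_presentation_out(activation, scores):
    """Average held-out predictions across presentation folds.

    activation: array of shape (n_presentations, n_units, n_rois)
    scores:     array of shape (n_units,), e.g. mean comprehension
    Simplification: plain OLS on all ROIs, no stepwise selection.
    """
    activation = np.asarray(activation, dtype=float)
    scores = np.asarray(scores, dtype=float)
    n_pres, n_units, _ = activation.shape
    predictions = np.empty((n_pres, n_units))
    for held_out in range(n_pres):
        train_idx = [p for p in range(n_pres) if p != held_out]
        # Train on activation averaged over the two remaining presentations.
        x_train = activation[train_idx].mean(axis=0)
        design_train = np.column_stack([np.ones(n_units), x_train])
        weights, *_ = np.linalg.lstsq(design_train, scores, rcond=None)
        # Test on activation from the held-out presentation.
        design_test = np.column_stack([np.ones(n_units), activation[held_out]])
        predictions[held_out] = design_test @ weights
    return predictions.mean(axis=0)


def r_squared(predicted, observed):
    """Squared Pearson correlation between predicted and observed scores."""
    return np.corrcoef(predicted, observed)[0, 1] ** 2


# Synthetic data only: 31 units, 4 ROIs, 3 noisy presentations.
rng = np.random.default_rng(0)
true_signal = rng.normal(size=(31, 4))
activation = true_signal + 0.3 * rng.normal(size=(3, 31, 4))
comprehension = (0.7 + true_signal @ np.array([-0.05, 0.04, 0.03, 0.03])
                 + 0.02 * rng.normal(size=31))
mean_pred = leave_one_presentation_out(activation, comprehension)
print(round(r_squared(mean_pred, comprehension), 2))
```

Note that the folds here split presentations rather than participants or passages, so the same comprehension scores appear in training and testing; the check is whether weights learned from two readings generalize to activation from the third.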
Two main result domains:
- Individual differences in comprehension:
- Better comprehenders showed higher activation during reading in: left inferior frontal gyrus (L IFG; pars opercularis/triangularis), left superior parietal lobule (L SPL), bilateral dorsolateral prefrontal cortex (L/R DLPFC), and bilateral hippocampus (LHC, RHC), consistent with verbal working memory, spatial imagery/mental model construction, semantic integration, and episodic encoding.
- Poorer comprehenders showed greater activation in ventromedial prefrontal cortex (bilateral VMPFC), left superior frontal gyrus (L SFG), right precuneus, and right inferior parietal lobule, associated with episodic/autobiographical retrieval and episodic knowledge integration.
- Stepwise regression selecting key regions predicting individual comprehension identified L VMPFC (negative weight), L IFG, R DLPFC, and R hippocampus. Model fit: F(4,26)=20.43, MSE=0.007, p<0.00001; multiple R²=0.76; adjusted R²=0.72. L VMPFC had the most negative standardized weight (β≈−0.49), indicating that poorer comprehenders had higher VMPFC activation; L IFG, R DLPFC, and RHC had positive weights.
- Cross-validation across presentation folds showed reliable predictive generalizability: fold R² values ≈0.47–0.49 (F(4,26)≈5.67–6.23; p≈0.002–0.001); mean cross-validated adjusted R²≈0.46.
- Passage-level comprehensibility:
- Thirteen ROIs’ activation correlated with passage comprehensibility. Better-comprehended passages elicited greater activation in regions supporting semantic processing and integration (e.g., L IFG pars opercularis, L temporal pole, L MTG, L/R VLPFC, right posterior STG). More difficult passages elicited greater activation in regions implicated in executive control and integration across distributed representations (M ACC/DMPFC), semantic processing (L STG), spatial visualization (L SPL/precuneus), left inferior parietal lobule, and right temporal pole/anterior STG.
- Stepwise regression selected four ROIs predicting passage comprehensibility: L IFG (pars opercularis), right temporal pole (R TP), left inferior parietal lobule (L IPL), and medial ACC/dorsomedial PFC (M ACC/DMPFC). Model fit: R²=0.88; adjusted R²=0.84; F(4,11)=20.02; MSE=0.003; p=0.00005. Notably, standardized weights in the full model indicated strong influence of M ACC/DMPFC (β≈−0.99) and L IFG (β≈0.74).
- Cross-validated fold fits: R²=0.79 (F(4,11)=10.37, p=0.00099), 0.73 (F(4,11)=7.60, p=0.00344), 0.76 (F(4,11)=8.57, p=0.00215); mean predicted vs observed R² across folds=0.44 (F(1,14)=11.08, p=0.00497).
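The fit statistics reported for the two regression models can be cross-checked against their F values. For an ordinary least-squares model with an intercept, R² = F·df₁ / (F·df₁ + df₂), and adjusted R² follows from R² and the degrees of freedom. The sketch below applies these standard identities; the function names are illustrative, and the inputs are the F values and degrees of freedom quoted above.

```python
def r2_from_f(f_value: float, df_model: int, df_error: int) -> float:
    """R-squared implied by an overall regression F test (OLS with intercept)."""
    return (f_value * df_model) / (f_value * df_model + df_error)


def adjusted_r2(r2: float, df_model: int, df_error: int) -> float:
    """Adjusted R-squared; df_model + df_error equals n - 1."""
    return 1.0 - (1.0 - r2) * (df_model + df_error) / df_error


# Participant-level model: F(4,26) = 20.43
r2_participants = r2_from_f(20.43, 4, 26)
print(round(r2_participants, 2), round(adjusted_r2(r2_participants, 4, 26), 2))

# Passage-level model: F(4,11) = 20.02
r2_passages = r2_from_f(20.02, 4, 11)
print(round(r2_passages, 2), round(adjusted_r2(r2_passages, 4, 11), 2))
```

The implied values round to R² = 0.76 (adjusted 0.72) for the participant-level model and R² = 0.88 (adjusted 0.84) for the passage-level model, which matches the reported adjusted R² figures.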
Comparative predictors:
- Psychometric predictors of individual comprehension: Nelson-Denny (R²=0.40, p=0.000133), Raven’s (R²=0.30, p=0.001565), Reading Span (R²=0.22, p=0.00711), Bennett Mechanical (R²≈0.18; F(1,29)=6.50, p=0.016). Combined psychometrics multiple R²=0.36 (F(3,27)=4.96, p=0.00188). Brain activation predictors (four ROIs) achieved R²=0.76. Adding psychometrics to imaging did not significantly improve prediction (ΔR²=0.048, F(3,24)=2.09, p=0.128); adding imaging to psychometrics did (ΔR²=0.47, F(4,24)=14.99, p<0.00001).
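The comparison of nested models (imaging alone versus imaging plus psychometrics, and the reverse) is the standard incremental F test for a change in R². A minimal sketch of that test statistic follows. The function name is hypothetical, and the example numbers are invented for illustration; they are not the study's values and the sketch does not attempt to reproduce the reported F statistics.

```python
def incremental_f(r2_full: float, r2_reduced: float,
                  n_added: int, n_obs: int, n_predictors_full: int):
    """F statistic for the R-squared gain from adding predictors to a model.

    Returns the F value and its (numerator, denominator) degrees of freedom.
    """
    df_error = n_obs - n_predictors_full - 1
    if df_error <= 0:
        raise ValueError("not enough observations for the full model")
    gain_per_predictor = (r2_full - r2_reduced) / n_added
    unexplained_per_df = (1.0 - r2_full) / df_error
    return gain_per_predictor / unexplained_per_df, (n_added, df_error)


# Invented example: 45 observations, a 2-predictor model (R² = 0.50)
# extended to 4 predictors (R² = 0.60).
f_value, dfs = incremental_f(0.60, 0.50, n_added=2, n_obs=45,
                             n_predictors_full=4)
print(round(f_value, 2), dfs)
```

A large F relative to its degrees of freedom indicates that the added predictors explain variance beyond the reduced model, which is the logic behind the asymmetry reported above.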
Text features and familiarity:
- Readability (Coh-Metrix) prediction of passage comprehension: only Syntactic Simplicity approached significance (R²=0.24, F(1,14)=4.45, p=0.053). Deep Cohesion predicted activation in L IFG (R²=0.45, F(1,14)=11.51, p=0.00438) and marginally in L IPL (R²=0.21, p=0.076), suggesting neural mediation between cohesion and comprehension.
- Topic familiarity was low (mean 2.83/7, SD=0.88) and modestly related to comprehension (mean Fisher z r=0.24 across participants; t(30)=4.65, p<0.000005).
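The familiarity result appears to come from a common two-step procedure: compute a familiarity-comprehension correlation within each participant, Fisher z-transform those correlations, and test the mean z against zero with a one-sample t-test. The sketch below shows that procedure under this assumption; the function name and the input correlations are hypothetical.

```python
import math

import numpy as np


def fisher_z_ttest(correlations):
    """One-sample t-test of Fisher z-transformed correlations against zero.

    Returns (mean z, t value, degrees of freedom).
    """
    r = np.asarray(correlations, dtype=float)
    z = np.arctanh(r)                      # Fisher z-transform
    n = z.size
    standard_error = z.std(ddof=1) / math.sqrt(n)
    t_value = z.mean() / standard_error
    return z.mean(), t_value, n - 1


# Hypothetical per-participant correlations, for illustration only.
example_r = [0.10, 0.35, 0.22, -0.05, 0.41, 0.18]
mean_z, t_value, df = fisher_z_ttest(example_r)
print(round(float(mean_z), 3), round(float(t_value), 2), df)
# Back-transform the mean z to the correlation scale for reporting.
print(round(float(np.tanh(mean_z)), 3))
```

The transform is used because raw correlations are bounded and skewed, whereas z values are approximately normal and can be averaged and tested directly.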
Repetition effects:
- From first to second reading, increased activation in bilateral occipito-temporal cortex, left superior temporal sulcus/anterior middle temporal gyrus, and right temporal pole; decreases from second to third reading in these regions; increases by the third reading in right IFG and medial superior frontal gyrus. Pattern suggests a shift from semantic processing to executive control with repetition.
Findings indicate that successful comprehension of technical expository texts depends on engaging verbal working memory and language processes (L IFG), constructing spatially grounded mental models (parietal regions such as L SPL/IPS), and integrating new information with existing semantic knowledge via lateral prefrontal and hippocampal mechanisms (bilateral DLPFC and hippocampus). Poorer comprehenders relied more on episodic/autobiographical retrieval (VMPFC, precuneus) and word-level/episodic integration processes, which were associated with lower comprehension performance. At the text level, more comprehensible passages elicited greater activation in regions associated with semantic access and discourse coherence, whereas more difficult passages recruited control and integration resources (M ACC/DMPFC) and areas implicated when content conflicts with world knowledge (right temporal pole). These neural patterns address the core research questions by identifying process-specific networks that differentiate individuals and texts, suggesting that facilitating semantic integration, spatial visualization, and coherence building can improve comprehension. The results align with behavioral literature on text cohesion, prior knowledge, and strategy instruction, demonstrating added value from neuroimaging in both predictive power and process insight.
This study advances understanding of expository technical text comprehension by linking individual and passage-level comprehension to distinct neural systems. Better comprehenders exhibit greater activation in networks supporting verbal working memory, spatial mental model construction, and semantic integration/encoding (L IFG, parietal regions, DLPFC, hippocampus), while poorer comprehenders show elevated episodic retrieval-related activation (VMPFC, precuneus). Passage comprehensibility is predicted by activation in semantic access/integration hubs (L IFG, L IPL), control regions (M ACC/DMPFC), and right temporal pole. Brain-based predictors outperform traditional psychometric and readability measures and provide mechanistic insight. Practical implications include pre-teaching key concepts and vocabulary, training visualization and inference/self-explanation strategies, and enhancing text cohesion (explicit causal, temporal, spatial links). Future research could broaden participant populations, manipulate text cohesion and knowledge demands experimentally, integrate longitudinal training interventions, and combine neuroimaging with eye-tracking to refine models of real-world technical reading.
Limitations:
- Causality: fMRI analyses are correlational; direct causal links from neural activity to comprehension cannot be established.
- Sample and selection: N=31, college-aged, right-handed native English speakers, selected from distribution extremes (high/low comprehenders), which may limit generalizability.
- Task ecology: Although designed to be naturalistic, the moving-window presentation and fixed timing may differ from self-paced reading; presentation of question stems during the first two readings could influence processing strategies.
- Measurement resolution: Passage comprehension measured with four multiple-choice items (limited score granularity), potentially constraining variance and reliability at the passage level.
- Text properties: Materials were not expressly designed to vary cohesion broadly; thus, correlations between cohesion and comprehension may be underestimated.
- Model building: ROI selection based on correlation maps and stepwise regression can risk overfitting; cross-validation mitigates but does not eliminate this concern. Cross-validated R² values, while significant, were moderate, indicating room for model improvement.
- Familiarity: Overall topic familiarity was low; modest familiarity–comprehension correlations suggest unmeasured prior knowledge dimensions may contribute.