Education
Neural alignment predicts learning outcomes in students taking an introduction to computer science course
M. Meshulam, L. Hasenfratz, et al.
The study addresses how learners internalize new concepts in authentic educational contexts. Building on multivariate neuroimaging methods that reveal how knowledge is represented in neural activity, the authors note that most prior work focuses on well-established, not newly acquired, concepts. They posit that learning success should manifest as neural alignment—individual learners’ neural representations converging toward canonical patterns observed in experts and shared among peers. Motivated by evidence that shared neural responses across individuals (e.g., during communication or shared viewing) relate to shared understanding and memory, particularly in default mode network (DMN) regions, the authors hypothesize that greater neural alignment during learning will predict better educational outcomes and that topic-specific alignment during assessment will track understanding of specific concepts.
Prior work employing multivariate pattern analysis and representational similarity analysis has elucidated neural representations of objects and categories but less so of newly learned concepts. Studies such as Cetron et al. used fMRI classifiers in controlled STEM settings to decode understanding of physics categories, and Mason & Just showed cortical activation progressions during learning. Research on neural coupling indicates speaker–listener and inter-subject synchronization relate to successful communication and shared understanding; children’s alignment to adult-like responses during math videos predicted math scores (Cantlon & Li). Shared neural patterns, particularly in DMN regions, encode and reinstate memories of shared experiences. Collectively, this literature suggests that thinking alike (shared neural responses) supports comprehension and memory, motivating the focus on neural alignment to capture learning in real-world courses.
Design: Longitudinal fMRI study in a real flipped introductory computer science course (COS 126, Princeton). Students (24 recruited; 20 undergraduates with usable datasets, of which 18 were complete) were scanned six times over a 13-week semester. Experts (5 recruited; 4 complete) were graduate students in computer science, scanned once at semester end.
Stimuli and tasks: Students viewed subsets of lecture videos during scans 1–5 (3–5 segments per scan; ~40 min per scan; 21 total segments; 197 min total). In scan 6, participants viewed five 3-min recap videos (16 min total) and then took a self-paced final exam in the scanner (16 open-ended questions spanning course topics; verbal responses, mean length 31.9 s, SD 24.7). A written baseline exam (“pre”) with the same questions was administered before the semester; all students scored 0 at baseline. Verbal responses were recorded, transcribed, and scored by course staff on a 0–3 scale per question; total scores were normalized to 0–100. Experts completed the same recaps and exam; expert responses scored <2 were excluded to ensure that expert canonical patterns reflected correct answers.
MRI acquisition and preprocessing: 3T Siemens Skyra/Prisma, 64-channel head coil; T2*-EPI (TR=2000 ms, TE=28 ms, FA=80°, 3 mm isotropic, 38 slices). Preprocessing in FSL: slice-time correction, motion correction, linear detrending, high-pass filtering (100 s), 6 mm FWHM smoothing, and coregistration to MNI152. Motion parameters were regressed out. Analyses were run in Python/R with BrainIAK.
Regions of interest and searchlight: Eight anatomically defined bilateral ROIs (Harvard–Oxford atlas, >20% probability): DMN nodes (angular gyrus, precuneus, anterior cingulate cortex), hippocampus, and posterior superior temporal gyrus; control ROIs were early visual (intracalcarine), early auditory (Heschl’s gyrus), and amygdala. The whole-cortex searchlight used cubes of 5×5×5 voxels (15 mm per side).
Neural alignment metrics:
- Alignment-to-class (lectures, recaps, exam): Inter-subject pattern correlation. For lectures/recaps, multi-voxel patterns were extracted in non-overlapping 30 s bins; alignment computed as Pearson correlation between a student’s pattern and the mean pattern of all other students; Fisher z-transform used for averaging. For exam, mean BOLD pattern per question; same-question alignment computed by correlating each student’s question pattern with the class-average pattern for that same question.
- Alignment-to-experts (recaps, exam): Pearson correlation between each student’s pattern and the mean expert pattern (canonical expert response), per 30 s bin (recaps) or per question (exam).
- Knowledge structure alignment (exam): For experts and class, 16×16 similarity matrices (templates) were computed by correlating canonical question patterns with each other. For each student and each question (row), correlations were computed between the student’s question pattern and template patterns of all other questions, yielding per-question alignment scores to class and to experts.
Statistical analysis:
- Between-participants: Correlated each student’s mean alignment during lectures with overall final exam score (Pearson r).
- Within-participants (exam): For each student, correlated per-question alignment (same-question and knowledge-structure measures) with that student’s question scores; mean r across students tested.
- Searchlight significance: one-sided permutation tests (1000 label shuffles) with FDR correction (q=0.05).
Controls: Alternative alignment bin sizes (10 s, 2 s) and temporal inter-subject correlation (ISC) yielded similar patterns. As a response-length control, question scores were residualized by regressing out answer length; alignment–performance effects persisted. A power analysis assessed accumulation across lecture segments; prediction improved with more data.
Additional details: Correlations between alignment-to-class and alignment-to-experts, during recaps and during the exam, were assessed between participants across ROIs and cortex. Intersection analyses identified voxels consistently implicated across tasks and measures.
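The alignment-to-class metric and the between-participants test described above can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' BrainIAK pipeline: the function names (`alignment_to_class`, `permutation_pvalue`), the array layout, and the choice to shuffle exam scores across students are all assumptions made for the example.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation between two 1-D arrays."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

def alignment_to_class(patterns):
    """Leave-one-out inter-subject pattern correlation.

    patterns: array of shape (n_students, n_bins, n_voxels), holding one
    multi-voxel pattern per time bin (e.g. 30 s) for one ROI (assumed layout).
    Returns one value per student: the Fisher-z-averaged correlation
    (converted back to r) with the mean pattern of all other students.
    """
    n_students, n_bins, _ = patterns.shape
    total = patterns.sum(axis=0)
    out = np.empty(n_students)
    for s in range(n_students):
        others_mean = (total - patterns[s]) / (n_students - 1)
        r = np.array([pearson(patterns[s, b], others_mean[b])
                      for b in range(n_bins)])
        z = np.arctanh(np.clip(r, -0.999999, 0.999999))  # Fisher z-transform
        out[s] = np.tanh(z.mean())                       # average in z, back to r
    return out

def permutation_pvalue(alignment, exam_scores, n_perm=1000, seed=0):
    """One-sided permutation test for a positive alignment-score correlation.

    Shuffles exam scores across students (an assumed null model).
    """
    rng = np.random.default_rng(seed)
    observed = pearson(alignment, exam_scores)
    null = np.array([pearson(alignment, rng.permutation(exam_scores))
                     for _ in range(n_perm)])
    p = (np.sum(null >= observed) + 1) / (n_perm + 1)
    return observed, float(p)

if __name__ == "__main__":
    # Synthetic data only: a shared signal plus student-specific noise.
    rng = np.random.default_rng(42)
    shared = rng.standard_normal((12, 125))   # 12 bins, 125 voxels
    noise_sd = rng.uniform(0.5, 3.0, size=20)  # 20 hypothetical students
    data = np.stack([shared + sd * rng.standard_normal(shared.shape)
                     for sd in noise_sd])
    scores = 100 - 25 * noise_sd + rng.normal(0, 5, size=20)
    r, p = permutation_pvalue(alignment_to_class(data), scores)
    print(f"r = {r:.2f}, one-sided p = {p:.3f}")
```

In a searchlight setting the same computation would be repeated per cube, with the resulting p-values then passed to an FDR procedure; that step is omitted here.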
- Learning gains: All students scored 0 at pretest; scores improved significantly by course end (two-sided t-test, t(19)=12.6, p<0.001). Final total scores ranged from 22 to 76 out of 100 (median 53.1, SD 17.1).
- Alignment during lectures predicts outcomes: Between participants, higher alignment-to-class during lectures correlated with higher final exam scores in DMN and memory regions and early sensory cortices. Example ROI correlations (lectures): hippocampus r≈0.75 (p<0.01), angular gyrus r≈0.62 (p<0.01), precuneus r≈0.61 (p<0.01), ACC r≈0.53 (p<0.01), early auditory r≈0.46 (p<0.05), early visual r≈0.41 (p<0.05). Whole-cortex searchlight showed significant clusters in anterior and posterior medial cortex, bilateral angular gyrus, temporal and insular cortices (FDR-corrected). Prediction improved as more lecture data accumulated.
- Class patterns reflect expert patterns: Across ROIs and cortex, alignment-to-class and alignment-to-experts were positively correlated during recaps and during the exam (e.g., ACC recap r≈0.81**, precuneus recap r≈0.67**, early auditory exam r≈0.74**, angular gyrus r≈0.47–0.49**; one-sided permutation tests, corrected).
- Same-question alignment during exam tracks performance within students: Within participants, per-question alignment-to-class and alignment-to-experts positively correlated with question scores, especially in medial cortical regions. Significant effects included ACC and superior temporal ROIs for both measures; additional effects: alignment-to-experts in precuneus; alignment-to-class in hippocampus, angular gyrus, and visual ROIs. Searchlight maps highlighted medial prefrontal and posterior medial DMN regions for both alignment measures (FDR-corrected). Effects were robust after controlling for response length.
- Knowledge structure alignment predicts performance: Within participants, per-question alignment between each student’s knowledge-structure and the class template positively correlated with question scores across hippocampus, ACC, angular gyrus, and temporal ROIs; searchlight again highlighted medial cortical regions (FDR-corrected). Alignment to expert knowledge-structure was qualitatively similar but did not survive multiple-comparisons correction.
- Convergent DMN loci: Intersection analyses across tasks and measures revealed consistent clusters in medial prefrontal, posterior medial, left angular, and superior temporal cortices—overlapping with DMN.
- Sensory areas: Unexpected positive correlations between lecture alignment and exam scores also appeared in early visual and auditory cortices, possibly reflecting top-down attention to relevant lecture details.
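The within-participant exam analyses reported above (same-question alignment, knowledge-structure alignment, and the response-length control) could be sketched as below. This is one possible reading of the summarized method, not the authors' code: the function names, the (questions × voxels) array layout, and the exact way a student's row of cross-question similarities is compared with the template row are assumptions. For alignment-to-class, `template` would presumably be the mean pattern of the other students; for alignment-to-experts, the mean expert pattern.

```python
import numpy as np

def corr_rows(a, b):
    """Pearson correlation of every row of a with every row of b."""
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def same_question_alignment(student, template):
    """Per-question correlation between a student's pattern and the
    canonical pattern for that same question.
    student, template: arrays of shape (n_questions, n_voxels)."""
    return np.diag(corr_rows(student, template))

def knowledge_structure_alignment(student, template):
    """Per-question match between the student's similarity to the other
    questions' canonical patterns and the template's own similarity row
    (assumed interpretation of the knowledge-structure measure)."""
    n_q = student.shape[0]
    template_sim = corr_rows(template, template)  # canonical 16x16 structure
    student_sim = corr_rows(student, template)    # student vs canonical patterns
    out = np.empty(n_q)
    for q in range(n_q):
        others = np.arange(n_q) != q              # exclude the same question
        out[q] = np.corrcoef(student_sim[q, others],
                             template_sim[q, others])[0, 1]
    return out

def residualize(y, x):
    """Residuals of y after regressing out x (with an intercept),
    e.g. question scores with answer length removed."""
    design = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ beta

def within_student_r(alignment, question_scores):
    """Correlation between per-question alignment and per-question scores
    for one student (undefined if the student's scores are all identical)."""
    return float(np.corrcoef(alignment, question_scores)[0, 1])
```

The group-level statistic would then be the mean of `within_student_r` across students, tested against a permutation null as described in the statistical analysis.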
The findings support the hypothesis that successful learning is reflected in neural alignment to canonical representations. During lectures, students whose neural patterns more closely matched the class average (and, by extension, experts) achieved higher final exam scores, particularly in DMN and hippocampal regions implicated in memory encoding and retrieval. During the exam, the similarity of a student’s neural pattern to expert and class canonical patterns for each question tracked that student’s score on that question, indicating concept-specific alignment of neural representations underlies understanding. Knowledge-structure alignment further shows that learning encompasses not only representations of individual concepts but also their interrelations; students whose neural similarity structure across concepts more closely matched the class canonical structure performed better. The tight relationship between alignment-to-class and alignment-to-experts suggests convergence on shared canonical states, though variability across regions and tasks indicates task-dependent factors (e.g., passive viewing vs active recall). The prominence of DMN regions across learning phases underscores their role in constructing, encoding, and reinstating abstract, integrative knowledge. Early sensory alignment effects during lectures may reflect top-down modulation of perceptual processing by attentional engagement. Overall, neural alignment offers a principled, general-purpose approach to quantify and predict learning in real-world educational settings beyond controlled lab tasks.
This study introduces and validates neural alignment as a general approach to predict and assess learning in authentic educational contexts. Alignment-to-class during lectures predicted final exam performance, and during the exam, both same-question and knowledge-structure alignments to class and experts tracked question-by-question performance within students. Convergent effects in DMN and hippocampal regions highlight their central role in forming and reinstating canonical conceptual representations. These results open avenues for neural metrics to complement traditional assessments, enabling fine-grained, concept-level evaluation of understanding. Future research should generalize across courses, domains, and learner populations; disentangle contributing factors to inter-individual alignment; increase expert sample sizes to power-match student–expert comparisons; and leverage advances in language modeling and network science to dynamically track the evolution of knowledge structures over time.
- Generalizability: Single course (introductory CS) at one institution; results may not directly generalize across domains, course types, or educational settings without further validation.
- Sample size: Modest number of participants (20 students, 5 experts; 18 and 4 complete), potentially limiting detection of smaller effects and power in student–expert comparisons.
- Partial exposure: Only a fraction (~3 h of ~21 h) of lecture content was scanned; learning also occurred outside the scanner, complicating lecture-specific inferences.
- Measurement scope: Alignment likely reflects multiple factors (e.g., attentional engagement, prior educational background, familiarity with teaching style); their unique contributions remain to be parsed.
- Power considerations: Some cortical regions required more data to reach stable prediction; more participants or data might reveal additional effects.
- Expert alignment: Knowledge-structure alignment to experts did not survive multiple-comparisons correction, potentially due to limited expert sample size and/or qualitative differences between expert and novice representations.

