Psychology
Neural signatures of emotional intent and inference align during social consensus
M. C. Reddan, D. C. Ong, et al.
The study addresses how the human brain transforms socioemotional signals into conscious inferences about others’ internal states. Social functioning is critical for health and well-being, and hinges on effective emotion signaling and accurate inference. Prior neuroimaging implicates regions such as amygdala, mPFC, TPJ, and precuneus in socioemotional inference, but distributed models specifying how multiple regions interact remain underdeveloped. Moreover, most prior work conflates the signaler’s intent (what they mean to convey) with the observer’s inference, often relying on normative ratings rather than targets’ own self-reports. Using the Stanford Emotional Narratives Dataset (SENDv1), which includes targets’ moment-by-moment self-ratings, the authors aim to dissociate neural signatures of signal intent from those of observer inference in a naturalistic storytelling paradigm. They test whether both components can be predicted from observer brain activity and whether alignment between these components relates to empathic accuracy, thereby informing potential interventions to improve social functioning.
Previous work links empathic accuracy to activity in IPL, premotor cortex, STS, and mPFC, but accuracy is an outcome of processing rather than a process itself, and mechanisms separating perception of intent from inference formation are unclear. Studies frequently use actors or normative datasets without self-reported intent, limiting direct assessment of the target’s intended message. Naturalistic, multimodal stimuli capture rich nonverbal cues (facial expressions, prosody, body language) and reduce demand characteristics, offering ecological validity over static or unimodal stimuli. The SENDv1 dataset provides first-person narratives with self-reported moment-by-moment emotion, enabling disentanglement of target intent from observer inference. The authors situate their work within literature on social cognition, theory of mind, and constructionist views of emotion, and contrast their approach with established brain-based signatures (e.g., PINES, NPS), anticipating unique patterns tied to schema activation and inference formation.
Participants: N=100 healthy, right-handed adults (59 women, 37 men, 4 no response; mean age=25.23, SD=9.96) from the Stanford community. Demographics and SES reported; IRB approved; informed consent obtained.
Design and stimuli: Within-subjects design with 24 narrative videos (1–3 min) from SENDv1 (19 unique targets; 12 negative, 12 positive), presented in three sensory conditions: audiovisual (8 trials; used for model training), audio-only (8 trials), and visual-only (8 trials); the two unimodal conditions were combined as the held-out validation set. Pseudorandomized orders; a practice session preceded scanning.
Ratings: Targets provided moment-by-moment self-ratings of their felt emotional intensity immediately after recording (self-reported intent). Observers, during fMRI, rated moment by moment what they thought the target felt on the same bivalent scale (inference). Ratings were sampled every 0.5 s, downsampled to TR=2 s, range-normalized within participant, then transformed into five valence-independent intensity quintiles (levels 1–5; 5=highest intensity).
fMRI acquisition: 3T GE Discovery MR750, 32-channel head coil. T1w BRAVO (0.9 mm isotropic). Functional BOLD EPI: TR=2 s, TE=25 ms, FA=77°, 46 slices, 2.9 mm isotropic, interleaved, 3× in-plane acceleration. Two runs with a brief break.
Preprocessing: fMRIPrep 1.4.1 (N4 bias correction, skull stripping, FreeSurfer recon-all, ANTs registration to MNI152NLin2009cAsym). Slice timing (AFNI 3dTshift), motion correction (FSL MCFLIRT), BBR coregistration, combined transforms via ANTs. Physiological noise regressors via tCompCor and aCompCor; framewise displacement computed.
First-level GLM: For each participant, three single-trial models were fit with regressors of interest: (1) signal intent quintiles, (2) observer inference quintiles, and (3) empathic accuracy (the difference between normalized inference and intent, converted to 5 levels where 5=highest accuracy).
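The rating transform described above (downsampling 0.5-s samples to the TR grid, range normalization within participant, valence-independent intensity quintiles) can be sketched as follows. This is a minimal illustration on synthetic ratings, not the authors' code; the function name and binning details are our assumptions.

```python
import numpy as np

def ratings_to_quintiles(ratings, tr=2.0, sample_rate=0.5):
    """Downsample 0.5-s ratings to the TR grid, range-normalize within
    participant, and bin valence-independent intensity into quintiles 1-5."""
    step = int(tr / sample_rate)                      # 4 samples per TR
    n_tr = len(ratings) // step
    per_tr = ratings[:n_tr * step].reshape(n_tr, step).mean(axis=1)
    lo, hi = per_tr.min(), per_tr.max()
    norm = 2 * (per_tr - lo) / (hi - lo) - 1          # range-normalize to [-1, 1]
    intensity = np.abs(norm)                          # drop valence, keep intensity
    edges = np.quantile(intensity, [0.2, 0.4, 0.6, 0.8])
    return np.digitize(intensity, edges) + 1          # quintile levels 1..5

rng = np.random.default_rng(0)
levels = ratings_to_quintiles(rng.uniform(-100, 100, size=240))  # 2 min at 0.5 s
```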
Regressors were convolved with the SPM canonical HRF; design matrices included 36 nuisance regressors (CSF/WM signals and derivatives, motion and derivatives, spike regressors). OLS estimation produced whole-brain beta maps per quintile for each rating type. Quality control via VIF; outlier events removed if >3 SD of VIF.
Model training: For each participant, beta maps within each intensity quintile (audiovisual trials) were averaged, yielding 5 maps per model type. Two whole-brain multivariate models were trained: (i) an intent model (predicting target self-rated intensity from observer brain activity) and (ii) an inference model (predicting observer-inferred intensity). Algorithm: LASSO-regularized principal components regression (LASSO-PCR; CANLab toolbox) with leave-one-participant-out cross-validation; λ set automatically; retained coefficients chosen by MSE (lasso number=120). Voxel weight maps were obtained by projecting PC weights back to voxel space. Internal accuracy was assessed via prediction–outcome Pearson correlations.
Feature importance: Bootstrap hypothesis testing (5,000 samples) over voxel weights; FDR q<0.05 for significance; a liberal P<0.01 (uncorrected) for visualization. The NeuroSynth image decoder related unthresholded patterns to the literature; cosine similarity comparisons with established brain signatures (e.g., PINES, NPS, empathic care/distress, social rejection).
Validation and specificity: External validation on held-out unimodal (audio-only and visual-only) beta maps for intent and inference. For each subject, prediction–outcome correlations were computed and averaged; one-sample t-tests (greater than zero) assessed sensitivity; paired t-tests across opposite validation sets assessed specificity (double dissociation). An additional internal, per-stimulus validation replicated the double dissociation.
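The LASSO-PCR step can be approximated with scikit-learn as a stand-in for the CANLab toolbox: PCA on the training maps, LASSO on the component scores, and weights projected back to voxel space, with leave-one-participant-out cross-validation. The data here are synthetic, and the component count and α are illustrative choices of ours, not the paper's settings.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Lasso
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
n_subj, n_levels, n_vox = 20, 5, 500
y = np.tile(np.arange(1, n_levels + 1), n_subj).astype(float)  # intensity 1..5
X = rng.normal(size=(n_subj * n_levels, n_vox))                # quintile-averaged betas
X += 0.5 * np.outer(y - 3, rng.normal(size=n_vox))             # inject a shared signal
groups = np.repeat(np.arange(n_subj), n_levels)

preds = np.zeros_like(y)
for s in range(n_subj):                         # leave-one-participant-out CV
    train, test = groups != s, groups == s
    pca = PCA(n_components=20).fit(X[train])
    lasso = Lasso(alpha=0.1).fit(pca.transform(X[train]), y[train])
    preds[test] = lasso.predict(pca.transform(X[test]))

r, _ = pearsonr(preds, y)                       # prediction-outcome correlation
# Project component weights back to voxel space (last fold, for illustration)
voxel_weights = pca.components_.T @ lasso.coef_
```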
Separability analysis: Linear SVMs (C=1, leave-one-out CV) were trained to classify intent vs. inference beta maps at each intensity level; ROC/AUC assessed; bootstrapping over classifier weights (5,000 samples) identified regions distinguishing intent from inference at high intensity (P<0.05, uncorrected).
Alignment with empathic accuracy: Dot products (pattern expression) between each multivariate model and participant-level empathic accuracy maps for low- vs. high-accuracy segments (audiovisual), plus intercepts, produced model predictions; correlations between the models’ predictions were compared via Fisher’s z-tests and replicated on held-out unimodal validation maps. Separability of accuracy maps from intent/inference maps was confirmed via SVM.
Exploratory analyses: (1) Univariate regression during low-accuracy trials predicted inference pattern expression while controlling for intent expression (FDR q<0.05, cluster size k≥25), identifying regions uniquely associated with inference expression. (2) Functional connectivity: Brainnetome Atlas (272 ROIs) timeseries during audiovisual videos; pairwise distance-correlation matrices; degree centrality per node correlated across participants with overall empathic accuracy (Pearson r), identifying nodes whose centrality relates to accuracy.
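The separability analysis reduces to a linear SVM with leave-one-participant-out cross-validation over paired maps. A sketch on synthetic data follows; the effect size, dimensions, and variable names are ours.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneGroupOut, cross_val_predict
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)
n_subj, n_vox = 30, 300
axis = rng.normal(size=n_vox)                   # latent axis separating map types
intent_maps = rng.normal(size=(n_subj, n_vox)) + 0.3 * axis
infer_maps = rng.normal(size=(n_subj, n_vox)) - 0.3 * axis
X = np.vstack([intent_maps, infer_maps])
y = np.array([1] * n_subj + [0] * n_subj)       # 1 = intent, 0 = inference
groups = np.tile(np.arange(n_subj), 2)          # hold out one participant's pair

scores = cross_val_predict(SVC(kernel="linear", C=1), X, y, groups=groups,
                           cv=LeaveOneGroupOut(), method="decision_function")
auc = roc_auc_score(y, scores)                  # cross-validated ROC AUC
acc = ((scores > 0).astype(int) == y).mean()    # classification accuracy
```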
- Intent model (predicting target self-reported intensity from observer brain): Internal CV within-subject prediction–outcome correlation r=0.65±0.34 SD; t(99)=18.91, P<0.001, CI=[0.58–0.71]; overall Pearson r=0.50, P<0.001; CV MSE=1.50. External validation on held-out intent set: average r=0.19±0.002; t(99)=9.65, P<0.001, CI=[0.15–0.23], Cohen’s d=0.23. Specificity vs inference validation set: average r=0.18±0.002; difference not significant (paired t(99)=0.48, P=0.629). Internal (audiovisual) double dissociation confirmed per-stimulus (paired t(22)=4.63, Cohen’s d=1.13, P<0.001).
- Intent model feature importance: Right visual cortex, right anterior insula, right angular gyrus, left PCC, bilateral precuneus, bilateral superior and inferior frontal gyri (bootstrap 5000, FDR q<0.05). NeuroSynth similarity: resting state, theory of mind, person, social, autobiographical, beliefs, spatial/scene construction, speech, self-referential. Low similarity to emotion induction signatures (PINES cosine=-0.03; NPS=-0.03; social rejection=-0.04); weak similarity to empathic care (0.06) and distress (0.10).
- Inference model (predicting observer inferred intensity): Internal CV within-subject r=0.68±0.30 SD; t(99)=22.72, P<0.001, CI=[0.62–0.74]; overall r=0.53, P<0.001; CV MSE=1.52. External validation on held-out inference set: average r=0.32±0.002; t(99)=12.48, P<0.001, CI=[0.27–0.37], Cohen’s d=1.24. Specificity: performance higher on inference validation than intent validation (paired t(99)=2.77, P=0.007, CI=[0.03–0.16], d=0.33). Additional internal validation per-stimulus confirmed specificity (paired t(22)=2.08, P=0.049, d=0.45).
- Inference model feature importance: Bilateral cerebellar crus, left precuneus, right primary somatosensory cortex (S1), right inferior frontal gyrus, bilateral superior medial frontal gyrus, bilateral lingual gyrus, bilateral temporal pole, bilateral anterior insula (bootstrap 5000, FDR q<0.05). NeuroSynth similarity emphasized somatosensory simulation and bodily action, alongside resting state, person, theory of mind, social, spatial, moral, self-referential, beliefs.
- Dissociability: Cosine similarity between the intent and inference weight patterns=0.29; after thresholding, overlap only in right anterior insula; within bilateral insula masks, cosine=0.22. Linear SVMs classified intent vs. inference maps above chance at all intensity levels (accuracy≈69–70%, AUC=0.68–0.75; all P<0.001). Regions most strongly distinguishing the two at the highest intensity level: dACC, PCC, anterior insula, pallidum, precuneus.
- Alignment predicts empathic accuracy: Correlation between intent and inference model predictions is higher during high-accuracy than low-accuracy periods. Audiovisual: low accuracy r=0.28 (P=0.004) vs high accuracy r=0.64 (P<0.001); Fisher’s z=3.26, P=0.001, Cohen’s q=0.47. Validation (unimodal) replication: low r=0.58 vs high r=0.79; z=2.83, P=0.005, q=0.41. Models can be combined to predict empathic accuracy in held-out data.
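The Fisher r-to-z comparison behind these statistics can be reproduced directly. Assuming roughly 100 observations per condition (our assumption; the actual n is not stated here), the reported audiovisual values are approximately recovered.

```python
import numpy as np
from scipy.stats import norm

def compare_correlations(r1, n1, r2, n2):
    """Fisher r-to-z test for two independent correlations; returns z, p, Cohen's q."""
    q = np.arctanh(r2) - np.arctanh(r1)          # Cohen's q effect size
    se = np.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))    # SE of the z-difference
    z = q / se
    p = 2 * norm.sf(abs(z))                      # two-tailed p-value
    return z, p, q

# Audiovisual values reported above, with n=100 per condition (our assumption)
z, p, q = compare_correlations(0.28, 100, 0.64, 100)
```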
- Exploratory univariate (low-accuracy trials, controlling for intent pattern expression): Positive associations with inference expression in right S1 and right parahippocampal gyrus (PHG); negative associations in left insula and left primary motor cortex (M1) (FDR q<0.05). Pattern dissimilar to finger-tapping maps (cosine=-0.07).
- Exploratory functional connectivity: Degree centrality correlates with overall empathic accuracy at right PHG (r=0.22, P=0.032), right cingulate gyrus area 24 (r=0.24, P=0.019), and right inferior temporal gyrus (r=0.22, P=0.032).
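The distance-correlation degree centrality used in the connectivity analysis can be sketched as follows; this uses the standard biased sample estimator (Székely et al.) and toy timeseries of our own making, not the Brainnetome data.

```python
import numpy as np

def distance_correlation(x, y):
    """Sample distance correlation between two 1-D timeseries (biased estimator)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    a = np.abs(x[:, None] - x[None, :])          # pairwise distance matrices
    b = np.abs(y[:, None] - y[None, :])
    A = a - a.mean(0) - a.mean(1)[:, None] + a.mean()   # double-centering
    B = b - b.mean(0) - b.mean(1)[:, None] + b.mean()
    dcov2 = (A * B).mean()
    dvar = np.sqrt((A * A).mean() * (B * B).mean())
    return np.sqrt(max(dcov2, 0.0) / dvar) if dvar > 0 else 0.0

def degree_centrality(timeseries):
    """Weighted degree per node from the pairwise distance-correlation matrix.
    timeseries: array of shape (n_timepoints, n_nodes)."""
    n = timeseries.shape[1]
    M = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            M[i, j] = M[j, i] = distance_correlation(timeseries[:, i],
                                                     timeseries[:, j])
    return M.sum(axis=0) - 1                     # exclude the self-connection

rng = np.random.default_rng(3)
ts = rng.normal(size=(100, 8))
ts[:, 1] += 0.8 * ts[:, 0]                       # couple nodes 0 and 1
deg = degree_centrality(ts)                      # nodes 0 and 1 gain centrality
```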
The findings demonstrate that two dissociable neural components of socioemotional processing—one related to recognizing a signaler’s intended emotional intensity (intent) and another to the observer’s conscious inference—can be decoded from observer brain activity. Despite partial overlap (right anterior insula), the multivariate patterns are largely distinct and linearly separable. Greater alignment between these patterns during viewing corresponds to higher empathic accuracy, suggesting that observers possess latent neural representations of others’ intended socioemotional intensity that, when engaged alongside inference-related processes, yield more accurate judgments. The intent pattern likely reflects schema activation and multisensory integration of socioemotional cues, while the inference pattern appears to involve mentalizing and somatosensory simulation, consistent with constructionist and simulation theories of emotion and empathy. These results advance models of social signal processing by disentangling intent from inference in naturalistic contexts and highlight potential neural targets for improving social understanding and reducing loneliness and isolation.
This work introduces and validates two multivariate fMRI signatures derived from observer brain activity that predict (1) targets’ self-reported socioemotional intent and (2) observers’ inferences, showing that their alignment tracks empathic accuracy. The signatures are dissociable, generalize to held-out unimodal stimuli, and are distinct from established affective signatures. Together, they support a framework in which schema-based recognition of signal intent and deliberate inference formation jointly contribute to empathic accuracy. Future research should: (i) test out-of-sample generalization across diverse populations, cultures, and tasks; (ii) validate on additional naturalistic audiovisual datasets and clinical groups; (iii) incorporate valence and higher-dimensional socioemotional content over time; (iv) analyze linguistic and story content to link language with neural dynamics; and (v) refine models to disentangle motor responses from inference-related processing, potentially informing interventions to enhance social connection.
- Intent and inference arise simultaneously in this paradigm, complicating complete neural separation despite independent validations.
- Targets’ self-reports may imperfectly reflect actual feelings at recording time (influenced by reflection, mood, social desirability).
- Generalizability to other cultures, age groups, and socioeconomic backgrounds is unknown; sample skewed toward Stanford undergraduates.
- Models focus on a single dimension (intensity) and remove valence to accommodate fMRI sampling limits; including valence did not improve accuracy.
- Button-press motor requirements may confound some activity, though analyses suggest dissociation from finger tapping.
- Story content and language were not analyzed, limiting content-specific inferences; randomization reduced power for content analyses.
- Further validation on other naturalistic datasets and modalities is needed to establish sensitivity and specificity.