Outcome prediction with a social cognitive battery: a multicenter longitudinal study
E. Brunet-Gouet, C. Decaix-Tisserand, et al.
Social cognition in schizophrenia is closely linked to real-world functioning and is considered by many to be a distinct domain mediating between neurocognition and functional outcomes. Prior meta-analyses suggest social cognition can explain more variance in functioning than traditional neurocognitive measures, supporting its targeted assessment. Social cognition encompasses multiple constructs, including emotion perception, attributional style, theory of mind (ToM), social perception, and social knowledge, so a battery of measures is typically required to capture its heterogeneity. Psychometric adequacy of such tools requires appropriate distributional properties, internal consistency, and retest characteristics, yet consensus on optimal measures remains lacking. Beyond internal properties, the clinical value of social cognitive tests is judged by concurrent, incremental, and predictive validity for outcomes such as functioning and quality of life (QoL), ideally beyond neurocognition. Existing studies often rely on cross-sectional or short-term designs, leaving gaps in understanding long-term predictive value and sensitivity to change. This study aims to characterize the psychometric properties of a comprehensive set of social cognitive measures in schizophrenia spectrum disorders, evaluate their sensitivity to change over one year, and assess their cross-sectional and longitudinal associations with functioning and QoL, including incremental validity beyond neurocognition.
The authors review evidence positioning social cognition as a mediator between neurocognition and functional outcomes in schizophrenia. Meta-analyses indicate social cognition often outperforms neurocognition in predicting functioning. The Social Cognition Psychometric Evaluation (SCOPE) project examined several social cognitive measures; while many correlated cross-sectionally with outcomes, only a subset (e.g., ER-40, Hinting Task, IBT) demonstrated predictive value beyond neurocognition. Additional tools (e.g., DACOBS, TASIT Lies) have mixed support. Prior studies largely used cross-sectional or short-term designs (≤4 weeks), limiting insights into long-term prediction and responsiveness. Reasoning abilities correlate with ToM (r≈0.30), potentially confounding ToM–functioning links. Evidence on empathy and QoL is mixed, with both positive and negative associations reported across studies. There is limited information on sensitivity to change for social cognitive measures in schizophrenia, a key property for longitudinal evaluation and intervention studies.
Design and setting: Multicenter longitudinal cohort (NCT02901015) recruiting from seven FondaMental Centers of Expertise in France. Ethical approval obtained; written informed consent collected. Assessments at baseline and 1-year follow-up; same clinician rated both visits for 96.2% of participants.
Participants: Inclusion criteria were DSM-IV-TR schizophrenia or schizoaffective disorder, age 15–65. Exclusions included current psychiatric hospitalization or treatment change within 4 weeks, current substance dependence (except tobacco), recent ECT (<6 months), neurological conditions affecting cognition, significant sensory impairment, inability to attend assessments, or planned change in follow-up within a year.
Clinical assessments: PANSS (five-factor model: positive, negative, disorganization, arousal/excitation, depression) and CGI-Severity. Interrater reliability harmonized via annual meetings.
Social cognition battery (EVACO):
- TREF (Facial Emotion Recognition Task): 54 color face photographs, six basic emotions, nine intensity levels; accuracy averaged across conditions.
- V-Comics (Versailles Intention Attribution Task): Comic strips requiring selection of the most logical ending; conditions include intention attribution and physical causality. Split into two matched versions (V1, V2); average accuracy in intention attribution condition used.
- V-SIR (Versailles-Situational Intention Reading): Eight short videos; participants rate the probability of four intention explanations on 4-point Likert scales; scoring is average distance from healthy controls (higher scores indicate lower ToM performance).
- SPEX-GBA (Goal & Belief Attribution): Silent animations across conditions assessing attribution of goals/beliefs vs physical causality; two versions, with analyses focusing on 12 items for goals/beliefs; sensitivity metrics computed as per Supplementary Information 1.
- QCAE (Questionnaire of Cognitive and Affective Empathy): 31-item self-report; 4-point Likert; mean scores for cognitive and affective subscales.
Neurocognition: Six domains assessed at baseline: processing speed, attention/vigilance, working memory, memory, reasoning, and executive functions. Scores normalized so higher values reflect better performance.
Outcomes: Functioning via Personal and Social Performance (PSP; 0–100, higher is better; Cronbach's α ~0.76; ICC ~0.97). Quality of life via S-QoL 18 (eight domains; higher is better; Cronbach's α 0.72–0.90).
Statistical analysis: Baseline distributions evaluated for normality, outliers, ceiling/floor effects, and internal consistency (Cronbach's alpha). Sensitivity to change assessed with the standardized response mean (SRM) among completers; not computed for tests with alternate versions across visits. Cross-sectional associations at baseline and longitudinal predictions (baseline predictors to 1-year outcomes) analyzed with linear regressions, reporting standardized coefficients (β). Incremental validity tested by adding clinical covariates (PANSS dimensions) and neurocognitive domains separately in multivariable models. Missing data handled via multiple imputation (MICE in R; 50 imputations). A positive β indicates that better social cognition is associated with better outcomes.
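As a rough illustration of the two psychometric indices named in the statistical analysis, the sketch below computes Cronbach's alpha from item scores and the SRM from paired baseline and follow-up scores, using their textbook definitions. It is not the authors' code (the study reports using R); the function names and the toy data are hypothetical, and the sketch ignores confidence intervals and multiple imputation.

```python
from statistics import mean, stdev, variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score columns (one list per item)."""
    k = len(items)
    # Total score per participant: sum across items.
    totals = [sum(scores) for scores in zip(*items)]
    # Sum of the sample variances of the individual items.
    item_var_sum = sum(variance(col) for col in items)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))


def standardized_response_mean(baseline, follow_up):
    """SRM = mean of paired change scores / SD of those change scores."""
    change = [after - before for before, after in zip(baseline, follow_up)]
    return mean(change) / stdev(change)


# Hypothetical toy data: 3 items x 5 participants (not study data).
items = [
    [2, 3, 4, 4, 5],
    [1, 3, 3, 5, 5],
    [2, 2, 4, 5, 4],
]
alpha = cronbach_alpha(items)

# Hypothetical paired scores for 5 completers (not study data).
baseline = [10, 12, 9, 14, 11]
follow_up = [12, 13, 9, 15, 14]
srm = standardized_response_mean(baseline, follow_up)

print(f"alpha = {alpha:.2f}, SRM = {srm:.2f}")
```

Because the SRM divides the mean change by the variability of individual change scores, a test whose scores move in different directions across participants yields a small SRM even when some individuals change substantially.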
- Sample: 143 participants (schizophrenia n=103; schizoaffective n=40); 79 reassessed at 1 year. Predominantly male; mild PANSS symptoms and moderate CGI severity at baseline.
- Neurocognition profile: Most impaired domains were executive functions (−1.0±1.5 SD), memory (−0.9±0.9 SD), and processing speed (−0.8±0.7 SD); less impairment in attention (−0.4±0.5 SD), working memory (−0.5±0.7 SD), and reasoning (−0.4±0.9 SD).
- Psychometrics (distributional properties): Acceptable distributions for TREF, V-SIR, and QCAE; SPEX-GBA measures showed floor effects; V-Comics showed ceiling effects and several low-performance outliers.
- Internal consistency: Adequate for TREF (α=0.76 [0.70–0.82]), V-SIR (α=0.74 [0.68–0.72]), V-Comics V1 (α=0.74 [0.66–0.83]) and V2 (α=0.83 [0.77–0.88]), and QCAE cognitive (α=0.87 [0.83–0.90]). Insufficient for SPEX-GA (V1 α=0.61; V2 α=0.31), SPEX-BA (V1 α=0.52; V2 α=0.38), and QCAE affective (α=0.62 [0.53–0.71]).
- Sensitivity to change (SRM) over 1 year among completers: TREF medium (SRM=0.43); V-SIR very small (SRM=0.01); QCAE affective small (SRM=−0.14); QCAE cognitive small (SRM=−0.08). Several neurocognitive measures showed moderate SRMs; TAP-flexibility global index was large (SRM=0.59).
- Cross-sectional associations with functioning (PSP) at baseline: Significant positive associations for TREF (β=0.32, t=3.7, p<0.001), V-SIR (after sign reversal; β=0.29, t=3.5, p=0.001), and V-Comics V1 (β=0.38, t=3.6, p=0.001). These relations persisted beyond clinical symptoms. TREF remained significant beyond all individual neurocognitive domains; V-SIR lost significance controlling for reasoning (p=0.125) and was marginal with executive functions (p=0.065); V-Comics V1 became marginal controlling for reasoning (p=0.082). All neurocognitive domains and most clinical dimensions were associated with functioning.
- Cross-sectional associations with QoL (S-QoL) at baseline: Among social-cognitive predictors, only QCAE affective was positively associated (β=0.17, p=0.044); QCAE cognitive was marginally negative (β=−0.16, p=0.07). No neurocognitive domain related to QoL. QoL related negatively to PANSS positive, negative, and depression scores. The QCAE affective–QoL link became non-significant when controlling for depression.
- Longitudinal prediction (1-year): Functioning at follow-up was significantly predicted by V-Comics V1 at baseline (standardized association; t=2.6, p=0.013), but this effect generally became non-significant when controlling for neurocognitive domains (processing speed, working memory, reasoning, executive functions) and for baseline functioning. Baseline neurocognition (all domains except memory) and several PANSS dimensions (positive, disorganization, excitation, depression) significantly predicted functioning at 1 year; baseline negative symptoms did not. No social cognition or neurocognition measure predicted QoL at 1 year.
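The attenuation of the ToM associations once a neurocognitive covariate is added can be illustrated with a small sketch. It compares the unadjusted standardized coefficient of one predictor (equal to its Pearson correlation with the outcome) against its coefficient after adjusting for a single covariate, using the closed-form expression for a two-predictor linear model. This is only an illustration of the incremental-validity logic: the data are invented, the helper names are hypothetical, and the study's actual models used more covariates and multiply imputed data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def adjusted_beta(outcome, predictor, covariate):
    """Standardized beta of `predictor` in a two-predictor linear model.

    Uses the closed form beta = (r_yp - r_yc * r_pc) / (1 - r_pc ** 2),
    valid for exactly one predictor plus one covariate.
    """
    r_yp = pearson(outcome, predictor)
    r_yc = pearson(outcome, covariate)
    r_pc = pearson(predictor, covariate)
    return (r_yp - r_yc * r_pc) / (1 - r_pc ** 2)


# Hypothetical scores for 8 participants (not study data).
tom = [1, 2, 3, 4, 5, 6, 7, 8]
reasoning = [2, 1, 4, 3, 6, 5, 8, 7]
functioning = [3, 3, 7, 7, 11, 11, 15, 15]

# With a single predictor, the standardized beta equals the correlation.
unadjusted = pearson(functioning, tom)
# Standardized beta of ToM once reasoning is in the model.
adjusted = adjusted_beta(functioning, tom, reasoning)

print(f"unadjusted beta = {unadjusted:.2f}, adjusted beta = {adjusted:.2f}")
```

In this toy data the predictor and covariate are strongly correlated, so the predictor's coefficient shrinks markedly after adjustment, which is the pattern described above for V-SIR and V-Comics when reasoning is controlled.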
The study demonstrates that within a comprehensive social cognition battery, the TREF exhibits the most robust psychometric profile, including acceptable distribution, internal consistency, and medium sensitivity to change, and shows cross-sectional associations with functioning that persist beyond both symptoms and neurocognition. V-SIR and V-Comics show cross-sectional links with functioning beyond symptoms, but these links are largely attenuated by neurocognitive covariates, especially reasoning, highlighting shared variance between ToM and higher-order cognition. SPEX-GBA displayed suboptimal psychometric characteristics and no outcome associations, arguing against its inclusion in clinical batteries. Empathic dispositions (QCAE affective) relate to QoL cross-sectionally, with effects influenced by depressive symptoms, suggesting subjective QoL may be more closely tied to self-reported affective processes than to performance-based social cognition. Longitudinally, social cognition at baseline weakly predicts future functioning compared with neurocognition and clinical status; the lone significant ToM predictor (V-Comics V1) lost significance after accounting for neurocognition and baseline functioning, suggesting that the causal pathways from social cognition to functioning may be subsumed by neurocognitive processes. These findings support integrating social cognition assessments—particularly emotion recognition—into evaluations of functioning determinants, while recognizing the dominant predictive role of neurocognition over time.
A targeted social cognition battery can inform functioning in schizophrenia spectrum disorders. TREF offers satisfactory psychometrics and meaningful cross-sectional associations with functioning beyond symptoms and neurocognition, supporting its use in clinical practice and research. V-SIR is promising but shows limited sensitivity to change and diminished associations when controlling for neurocognition. Overall, neurocognition and clinical status more consistently predict 1-year functioning than social cognition. QoL appears more closely related to self-reported affective empathy than to performance-based social cognitive or neurocognitive measures. Future research should: (1) evaluate test–retest reliability and sensitivity to change of social cognitive tools; (2) identify minimal, optimally predictive tool combinations across batteries; (3) examine longitudinal causal pathways using designs and analyses that parse neurocognitive and social-cognitive contributions; and (4) consider more homogeneous samples or stratification by expected change in social cognition.
Key limitations include: (1) modest sample sizes for some tools due to split versions (SPEX, V-Comics), potentially reducing power; (2) higher missingness for SPEX and TREF in early centers due to availability; (3) approximately 45% attrition at 12 months (79 of 143 participants reassessed), though baseline comparisons suggested no attrition bias; (4) lack of direct test–retest reliability assessment, which relates to sensitivity to change; (5) heterogeneity in social cognition impairments and their trajectories possibly diluting SRMs; and (6) potential practice effects, although the 1-year interval likely mitigates these.