Identifying multiple sclerosis subtypes using unsupervised machine learning and MRI data

A. Eshaghi, A. L. Young, et al.

This study, led by Arman Eshaghi and colleagues, applies unsupervised machine learning to MRI scans from thousands of MS patients to identify distinct subtypes of multiple sclerosis. The findings show how these subtypes relate to disability progression and treatment response, with potential implications for patient care and clinical trial design.

Introduction

The study addresses the need to define MS phenotypes based on underlying mechanisms rather than clinical course, to improve stratified medicine and clinical trial design. Traditional clinical phenotypes (CIS, RRMS, PPMS, SPMS) are limited by overlapping features and by subjective assessment of when a patient transitions from one phenotype to another. MRI captures pathobiological mechanisms more directly than clinical descriptors, making it a strong candidate for data-driven classification. A key challenge is disentangling phenotypic and temporal heterogeneity using only cross-sectional and short longitudinal data. The Subtype and Staging Inference (SuStaIn) algorithm can uncover data-driven subtypes with distinct temporal progression patterns from MRI, and can assign subtype membership and stage to unseen individuals. The primary aim was to redefine MS subtypes based on MRI-visible pathological changes using SuStaIn, training on data from previously published clinical trials and observational studies and validating on independent datasets. A secondary aim was to test whether the MRI-derived subtypes, assigned at study entry, differ in disability progression, disease activity, and treatment response.

Literature Review

The paper situates its work within prior knowledge that MS is heterogeneous and that current clinical phenotypes (RRMS, PPMS, SPMS) often share imaging, immunologic, and pathologic features, with transitions between them that are hard to define. Prior studies show that SPMS and PPMS share MRI and pathogenic similarities, and that MRI findings align with underlying pathology. SuStaIn has previously been used to model heterogeneity and temporal complexity in neurodegenerative diseases by inferring subtypes and stages from cross-sectional data. Evidence indicates that early focal inflammatory demyelination can precede deep grey matter atrophy, while cortical neurodegeneration and chronic compartmentalized inflammation may underpin insidious progression. These works motivate a biologically grounded, MRI-based subtyping approach to better capture MS mechanisms than clinical labels alone.

Methodology

Study design: Aggregated MRI and clinical data from 16 MS randomized controlled trials (five PPMS, seven SPMS, four RRMS) and three observational cohorts with mixed MS subtypes; included two healthy control datasets (Human Connectome Project; UK Biobank). Ethical approvals were obtained; data controllers permitted pooled anonymized analyses. The training set comprised 14 datasets selected a priori; validation used five independent datasets.

Participants: Training N=6322 (46% RRMS, 29% SPMS, 25% PPMS). Validation N=3068 (49% RRMS, 28% SPMS, 23% PPMS). Baseline characteristics were similar between sets.

MRI acquisition and processing: Used T1-weighted (2D/3D), T2-weighted, and T2-FLAIR sequences. A uniform cross-sectional pipeline processed each visit independently. Derived 18 variables: volumes of bilateral frontal, parietal, temporal, and occipital grey matter; limbic cortex; cerebellar GM and WM; brainstem; deep grey matter; cerebral WM; total T2 lesion volume; and regional NAWM T1/T2 ratios in the corpus callosum, frontal, temporal, parietal, and occipital lobes, cingulate bundle, and cerebellum. Lesions were segmented with the Lesion Segmentation Toolbox and DeepMedic; tissue segmentations were obtained via GIF; T1/T2 ratio maps were normalized using ventricular CSF. Regions were based on the Neuromorphometrics atlas.

Normalization and variable selection: In healthy controls, Bayesian linear regression adjusted MRI variables for total intracranial volume, age, and age squared. Patient values were adjusted and Z-scored against the healthy distributions. Of the 18 features, the 13 with significant differences and moderate-to-large effect sizes (Cohen's d > 0.5) at baseline were retained and entered into SuStaIn. Signs were flipped so that higher Z-scores represent worse disease (lower volumes and lower T1/T2 ratios were considered worse).

SuStaIn training and internal validation: Leave-one-dataset-out cross-validation across the 14 training datasets determined the optimal number of subtypes by maximizing held-out log-likelihood. Uncertainty in subtype trajectories was assessed via MCMC (100,000 iterations), with consistency across folds quantified using Bhattacharyya coefficients. The final model was refit on all 14 training datasets.

External validation: Applied the trained SuStaIn model to five independent datasets to assign baseline subtype membership and stage. Differences in hazard of 24-week confirmed disability progression (CDP) across subtypes were tested with Cox models; disease activity differences (annualized relapse rate and baseline contrast-enhancing lesion counts) were assessed with general linear models.

Prognostic modeling: Using merged datasets, constructed mixed-effects models (trial as random effect) to assess associations of MRI-based subtypes and stages (and standard phenotypes) with time to 24-week CDP, adjusting for age, sex, and baseline EDSS. Evaluated the survival model concordance index (C-index) for models with SuStaIn subtype/stage alone and combined with clinical metrics (EDSS, Timed-Walk Test, 9-Hole Peg Test).

Treatment response analyses: In three phase 3 progressive MS RCTs (ORATORIO, ASCEND, OLYMPUS; n=2099 pooled) and three RRMS phase 3 RCTs (DEFINE/CONFIRM/ENDORSE, OPERA1, OPERA2; n=2696 pooled), compared EDSS worsening slopes between each subtype on active treatment and the same subtype on placebo using linear mixed-effects models with group, time, and their interaction; the percentage difference in EDSS worsening was reported as treatment response. Only placebo/comparator arms were used to characterize natural history. Two-tailed tests were applied throughout.
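
The normalization and variable-selection step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: ordinary least squares stands in for the Bayesian linear regression named in the text, the data are synthetic, and all function names (`design_matrix`, `fit_control_model`, `z_score`, `cohens_d`) are hypothetical. It shows one MRI variable being adjusted for total intracranial volume (TIV), age, and age squared in controls, Z-scored in patients with the sign flipped so that higher means worse, and then screened with Cohen's d > 0.5.

```python
import numpy as np

def design_matrix(tiv, age):
    # Intercept, TIV, age and age^2, matching the covariates named in the text.
    return np.column_stack([np.ones_like(age), tiv, age, age ** 2])

def fit_control_model(tiv, age, values):
    # Assumption: plain least squares replaces the Bayesian regression of the study.
    X = design_matrix(tiv, age)
    coef, *_ = np.linalg.lstsq(X, values, rcond=None)
    resid_sd = np.std(values - X @ coef, ddof=X.shape[1])
    return coef, resid_sd

def z_score(tiv, age, values, coef, resid_sd, lower_is_worse=True):
    # Deviation from the value expected for a healthy person with the same covariates.
    z = (values - design_matrix(tiv, age) @ coef) / resid_sd
    return -z if lower_is_worse else z  # flip so that higher Z means worse disease

def cohens_d(a, b):
    # Standardized mean difference using the pooled standard deviation.
    pooled_var = ((len(a) - 1) * np.var(a, ddof=1) + (len(b) - 1) * np.var(b, ddof=1)) / (len(a) + len(b) - 2)
    return (np.mean(a) - np.mean(b)) / np.sqrt(pooled_var)

# Synthetic example: one regional volume, with patients shifted downwards.
rng = np.random.default_rng(0)

def simulate(n, shift):
    age = rng.uniform(20, 70, n)
    tiv = rng.normal(1.5, 0.1, n)  # litres (made-up scale)
    vol = 20 + 10 * tiv - 0.05 * age + shift + rng.normal(0, 1, n)
    return tiv, age, vol

hc_tiv, hc_age, hc_vol = simulate(500, shift=0.0)
ms_tiv, ms_age, ms_vol = simulate(500, shift=-1.5)

coef, sd = fit_control_model(hc_tiv, hc_age, hc_vol)
hc_z = z_score(hc_tiv, hc_age, hc_vol, coef, sd)
ms_z = z_score(ms_tiv, ms_age, ms_vol, coef, sd)

d = cohens_d(ms_z, hc_z)
keep = abs(d) > 0.5  # retention threshold quoted in the text
print(f"Cohen's d = {d:.2f}, keep variable: {keep}")
```

In this sketch the controls end up centred on zero by construction, so a patient Z-score reads directly as "standard deviations worse than expected for a healthy person with the same age and head size".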

Key Findings
  • Identified and validated three MRI-based MS subtypes with distinct temporal trajectories: cortex-led, NAWM-led, and lesion-led. Subtypes were defined by the earliest abnormalities and stageable sequences of changes in regional GM atrophy, NAWM T1/T2 ratios, and lesion accrual.
  • Lesion-led subtype characteristics: highest baseline EDSS, longest disease duration, highest lesion load and accrual, smallest cortical and deep grey matter volumes, highest baseline SuStaIn stage, and fastest annual stage increase (all p<0.001). It was the second most frequent subtype in validation.
  • Disability progression: Significant differences in EDSS progression rates across subtypes in training and validation (log-rank p=0.05 and p=0.006). Lesion-led had higher risk of 24-week CDP vs cortex-led: training +30% (95% CI 5–62%, p=0.01); validation +32% (95% CI 9–59%, p=0.004). No other pairwise differences observed.
  • Disease activity: Lesion-led had the most active disease. Baseline contrast-enhancing lesion counts highest in lesion-led (training mean 2.28, SE 0.33; validation 2.35, SE 0.23; p<0.001 vs others). Annual relapse rate highest in lesion-led (training 0.56, SE 0.07; validation 0.41, SE 0.03). No significant difference between cortex-led and NAWM-led.
  • Staging predicts progression: Highest tertile of baseline stage (17–39) had shortest time to 24-week CDP (log-rank p<0.0001), with 37% higher risk vs lowest tertile (1–9) and 30% vs middle tertile (10–17) (both p<0.001).
  • Associations with CDP: Baseline MRI-based subtypes (β=0.04, SE 0.01, p=0.02) and stages (β=−0.06, SE 0.02, p<0.001) associated with time to CDP; standard clinical phenotypes (β=0.18, SE 0.15, p=0.22) and baseline EDSS (β=0.02, SE 0.03, p=0.26) were not.
  • Prognostic performance: Survival model C-index 0.55±0.01 using SuStaIn stage+subtype improved to 0.63±0.01 when adding clinical variables (EDSS, Timed-Walk, 9-Hole Peg Test) (p<0.01).
  • Treatment response: Lesion-led subtype showed significant treatment response. Progressive MS pooled trials (n=2099): −66% average EDSS worsening vs placebo for lesion-led (SE 25.6%, p=0.009). RRMS pooled trials (n=2696): −89% EDSS worsening vs placebo for lesion-led (SE 44%, p=0.04). No significant treatment effect observed for cortex-led or NAWM-led subtypes.
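
Two of the summary statistics in the findings above, the C-index and the percentage difference in EDSS worsening, can be made concrete with a short sketch. It is illustrative only: `concordance_index` is a plain implementation of Harrell's C that ignores tied event times, `percent_slope_difference` is an assumed reading of how a relative treatment effect could be derived from two slopes, and all numbers are made up rather than taken from the study.

```python
import numpy as np

def concordance_index(time, event, risk):
    """Harrell's C: share of comparable pairs in which the subject with the
    higher predicted risk is the one who progresses first."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    risk = np.asarray(risk, dtype=float)
    concordant, comparable = 0.0, 0
    for i in range(len(time)):
        if not event[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(len(time)):
            if time[j] > time[i]:  # j was still progression-free when i progressed
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    return concordant / comparable

def percent_slope_difference(slope_treated, slope_placebo):
    # Assumed definition: relative change in EDSS worsening rate versus placebo.
    return 100.0 * (slope_treated - slope_placebo) / slope_placebo

# Toy data: weeks to confirmed progression, event flag (0 = censored), predicted risk.
weeks = [2, 4, 6, 8]
progressed = [1, 1, 0, 1]
perfect = concordance_index(weeks, progressed, [0.9, 0.7, 0.4, 0.1])
reversed_ = concordance_index(weeks, progressed, [0.1, 0.4, 0.7, 0.9])
uninformative = concordance_index(weeks, progressed, [0.5, 0.5, 0.5, 0.5])
print(perfect, reversed_, uninformative)

# Hypothetical slopes in EDSS points per year.
effect = percent_slope_difference(slope_treated=0.034, slope_placebo=0.10)
print(round(effect, 1))
```

A C-index of 0.5 corresponds to a model that ranks patients no better than chance and 1.0 to perfect ranking, which gives a scale for the reported 0.55 and 0.63. Under the assumed definition, a negative percentage means slower EDSS worsening on treatment than on placebo.
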
Discussion

MRI-based subtyping via SuStaIn uncovered three biologically plausible MS subtypes with distinct temporal MRI trajectories that were reproducible across independent datasets. These subtypes and their stages were more predictive of disability progression than traditional clinical phenotypes and baseline EDSS, indicating that MRI-derived patterns better capture underlying disease mechanisms relevant to outcomes. The lesion-led subtype, characterized by early extensive lesion accrual followed by deep grey matter atrophy, had the most aggressive course with highest relapse activity and disability progression risk, consistent with early focal inflammatory demyelination leading to secondary neurodegeneration. It was also the only subtype with clear treatment responsiveness in both RRMS and progressive MS trials, suggesting utility for enriching trials targeting inflammatory activity. The cortex-led subtype exhibited early cortical atrophy with later NAWM abnormalities and less treatment responsiveness, pointing to a more insidious neurodegenerative component and potentially different therapeutic needs. Combining MRI-based subtypes with simple clinical measures improved individual-level prognostic accuracy. Robustness analyses indicated that subtype effects outweighed center/trial effects despite multi-center heterogeneity, and the trained model generalized to unseen datasets.

Conclusion

The study identifies three MRI-based MS subtypes—cortex-led, NAWM-led, and lesion-led—using the SuStaIn algorithm applied to large multi-trial MRI datasets. These subtypes, with stageable temporal trajectories, predict disability progression, disease activity, and treatment response better than conventional clinical phenotypes. Integrating MRI-based subtypes with clinical measures enhances prognostication. Because subtyping can be performed from a single time-point using routine trial MRI sequences, this approach can prospectively enrich clinical trials by selecting patients more likely to benefit from specific therapies. Future work should adapt and validate the model in real-world clinical imaging, incorporate spinal cord measures, and explore additional MRI modalities to further refine subtyping and staging.

Limitations
  • MRI modality limitations: Advanced microstructural measures (e.g., diffusion tensor imaging, magnetization transfer) were not available across trials, limiting sensitivity to NAWM changes.
  • Spinal cord imaging was not available, despite its known relevance to disability in MS.
  • Multi-center heterogeneity: Data from 772 centers and varied protocols may introduce noise; although controlled analytically and shown robust, classifications/staging may be noisier than single-scanner models.
  • Clinical variables were not included in SuStaIn due to algorithmic assumptions (monotonicity, normative distributions), potentially omitting predictive information during subtyping (added later for prognosis).
  • Data access constraints prevented re-analysis of treatment effects within individual RCTs beyond pooled analyses.
  • Potential sampling bias: Validation set composition (e.g., ORATORIO enriched for active PPMS) influenced subtype prevalence compared to training.