
Arrhythmic sudden death survival prediction using deep learning analysis of scarring in the heart
D. M. Popescu, J. K. Shade, et al.
This study, by a team including Dan M. Popescu and Julie K. Shade from Johns Hopkins University, introduces a deep learning approach that predicts patient-specific survival curves for arrhythmic sudden death in ischemic heart disease. Combining raw contrast-enhanced cardiac MRI with clinical data, the model forecasts survival up to 10 years with good discrimination on internal and external cohorts, and could inform clinical decisions about arrhythmic death risk.
Introduction
Sudden cardiac death (SCD) remains a leading cause of mortality, with high incidence in Europe and North America, and patients with coronary artery disease face particularly elevated risk of arrhythmic sudden cardiac death (SCDA). Current ICD implantation criteria based primarily on reduced left ventricular ejection fraction capture only a small fraction of patients who will experience SCDA, underscoring the need for improved, personalized risk stratification. Prior approaches typically provide subgroup-based risk at fixed time points, lack individualized temporal risk trajectories and prediction uncertainty, and often have limited external validation. Given that ventricular arrhythmias arise mechanistically from heterogeneous myocardial scar distribution visible on late gadolinium enhancement (LGE) cardiac MRI, the authors hypothesize that a deep learning framework leveraging raw LGE-CMR together with clinical covariates within a survival modeling paradigm can accurately predict individualized survival curves for SCDA over time, including calibrated uncertainty, and generalize across cohorts.
Literature Review
Previous risk stratification efforts have attempted to move beyond LVEF thresholds by incorporating additional clinical and imaging-derived predictors, but have been hindered by reliance on manually engineered features, arbitrary intensity thresholds, coarse or mathematically opaque descriptors, and limited external validation. Mechanistic image-based computational electrophysiology models using scar distribution can predict arrhythmic risk but are computationally intensive and impractical for broad screening. Deep learning advances in cardiology have largely focused on arrhythmia detection from ECG, with little progress using contrast-enhanced CMR to assess arrhythmia risk. The study positions a DL survival framework to directly learn prognostic features from raw LGE-CMR and non-linear relationships among clinical covariates, overcoming limitations of prior linear Cox models and manual feature pipelines.
Methodology
Study design and cohorts: Retrospective analysis using two prospective cohorts. Internal development/validation cohort: LVSPSCD (NCT01076660), patients meeting ICD therapy criteria (LVEF ≤35%) from three centers, focusing on ischemic cardiomyopathy with adequate LGE-CMR (n=156). External test cohort: PRE-DETERMINE (NCT01114269) and DETERMINE Registry (NCT00487279), multi-center patients with coronary disease, mild-moderate LV dysfunction; among 809 with LGE-CMR, 23 SCD cases were risk-set matched to four controls on age, sex, race, LVEF, and follow-up (final analyzed n=113 after excluding 2 for poor image quality). Baseline covariates included demographics, risk factors, CMR and ECG measurements, and medications (22 harmonized variables). Primary endpoint: SCDA for internal (ICD therapies for ventricular arrhythmias or ventricular arrhythmia not corrected by ICD); sudden and/or arrhythmic death for external (per Hinkle and Thaler criteria; aborted arrhythmic deaths included).
Imaging acquisition and preprocessing: LGE-CMR acquired at 1.5T (GE, Siemens) with heterogeneous protocols across sites; typical spatial resolution 1.5–2.4×1.5–2.4×6–8 mm, 2–4 mm gaps, images captured 10–30 minutes post-contrast. Automated LV myocardium segmentation via a cascaded CNN approach (two U-nets with residual connections plus an encoder–decoder correction module enforcing anatomical rules). 2D slices stacked apex-to-base, voxels outside myocardium zeroed, volumes resampled to 64×64×12 grid (voxel size 2.5×2.5×10 mm). Input tensor had two channels: (1) one-hot mask of myocardium and blood pool; (2) myocardium-only raw LGE intensities scaled by half the inverse of median blood pool intensity per slice. Data augmentation: 3D in-plane 90° rotations and panning. Covariates were standardized (demeaned, unit variance).
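The per-slice intensity scaling described above can be sketched in a few lines. This is a minimal illustration rather than the authors' pipeline: the function name `normalize_slice`, the 2D-list representation of a slice, and the toy pixel values are all assumptions made for the example.

```python
from statistics import median

def normalize_slice(intensities, myo_mask, blood_mask):
    """Scale myocardial LGE intensities by 0.5 / median(blood pool) for one slice.

    All arguments are 2D lists of equal shape; the masks hold 0/1 flags.
    Pixels outside the myocardium are set to zero.
    """
    blood = [v for row_v, row_b in zip(intensities, blood_mask)
             for v, b in zip(row_v, row_b) if b]
    if not blood:
        raise ValueError("slice has no blood-pool pixels")
    blood_median = median(blood)
    if blood_median == 0:
        raise ValueError("median blood-pool intensity is zero")
    scale = 0.5 / blood_median  # "half the inverse of the median"
    return [[v * scale if m else 0.0 for v, m in zip(row_v, row_m)]
            for row_v, row_m in zip(intensities, myo_mask)]

# Hypothetical 3x3 slice: two myocardium pixels, four blood-pool pixels.
slice_px   = [[100, 200, 400], [120, 800, 380], [90, 210, 410]]
myo_mask   = [[0, 1, 0], [0, 0, 0], [0, 1, 0]]
blood_mask = [[0, 0, 1], [0, 1, 1], [0, 0, 1]]

normalized = normalize_slice(slice_px, myo_mask, blood_mask)
for row in normalized:
    print([round(v, 4) for v in row])
```

Dividing by the blood-pool median makes intensities comparable across scanners and protocols, which matters here because acquisition was heterogeneous across sites.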
Model architecture (SSCAR): A supervised deep survival model with two sub-networks: (1) CMR branch: 3D convolutional encoder–decoder. Encoder: 3D conv + pooling + non-linear activations and dropout; a dense layer produced a compact encoding. Two heads: a survival head that stratified encodings into learned risk categories, then mapped to two outputs—location (μ) and scale (σ) parameters of a log-logistic distribution modeling log-time to SCDA—via a dense layer with bespoke survival activation (clipping ln μ to [−3, 3]; σ clipped from below to ensure at least 1-month 5th–95th percentile span), and a reconstruction head using transposed convolutions to reconstruct the input, serving as regularization. (2) Covariate branch: multi-layer dense network with dropout, outputting μ and σ similarly. Ensembling: After training branches separately, their survival outputs were frozen and combined via a learned linear combination to yield final μ, σ per patient.
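One way the survival activation could work is sketched below. It rests on several assumptions not stated in this summary: that the clipped quantity is the location on the log-time scale, that time is measured in years (so one month is 1/12), and that positivity of σ is enforced with a softplus. The floor on σ follows from the log-logistic quantile function t_p = exp(μ + σ·logit(p)); the names `survival_activation` and `percentile_span` are hypothetical.

```python
import math

LOGIT_95 = math.log(0.95 / 0.05)  # ln 19; logit(0.05) is the negative of this

def percentile_span(mu, sigma):
    """Width between the 5th and 95th percentiles of the log-logistic time."""
    return math.exp(mu + sigma * LOGIT_95) - math.exp(mu - sigma * LOGIT_95)

def softplus(x):
    """Numerically stable log(1 + exp(x)); always positive."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def survival_activation(raw_mu, raw_sigma, mu_bounds=(-3.0, 3.0), min_span=1.0 / 12.0):
    """Map two unconstrained dense-layer outputs to a valid (mu, sigma) pair."""
    mu = min(max(raw_mu, mu_bounds[0]), mu_bounds[1])
    # The span equals 2 * exp(mu) * sinh(sigma * ln 19); solve span = min_span for sigma.
    sigma_floor = math.asinh(min_span / (2.0 * math.exp(mu))) / LOGIT_95
    sigma = max(softplus(raw_sigma), sigma_floor)
    return mu, sigma

mu, sigma = survival_activation(raw_mu=5.0, raw_sigma=-10.0)
print(mu, sigma, percentile_span(mu, sigma))
```

Bounding both parameters keeps the likelihood finite early in training, when raw outputs can be extreme, and prevents the model from claiming near-zero uncertainty.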
Statistical survival model and loss: Time-to-event T modeled with a cause-specific log-logistic hazard with patient-specific μi, σi. Survival function S(t; μ, σ) = 1/(1 + exp((log t − μ)/σ)). Training minimized the negative log-likelihood accounting for right censoring and class imbalance: −log L = −δ log f(t; μ, σ) − (1−δ) log S(t; μ, σ), where t is the observed time, δ = 1 for an observed event and δ = 0 for a censored patient, and f = −dS/dt is the corresponding density. The CMR reconstruction branch minimized MSE to input; its weight in the total loss was learned.
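The survival function and censored likelihood above translate directly into code. The sketch below is a per-patient illustration under the stated log-logistic form; it omits the class-imbalance weighting, whose exact form is not given in this summary, and the function names are hypothetical.

```python
import math

def _log1pexp(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def survival(t, mu, sigma):
    """S(t; mu, sigma) = 1 / (1 + exp((log t - mu) / sigma))."""
    z = (math.log(t) - mu) / sigma
    return math.exp(-_log1pexp(z))

def neg_log_likelihood(t, delta, mu, sigma):
    """Negative log-likelihood of one patient; delta=1 is an event, delta=0 is censored."""
    z = (math.log(t) - mu) / sigma
    if delta == 1:
        # log f(t) = z - log(sigma) - log(t) - 2 * log(1 + exp(z)), from f = -dS/dt
        log_f = z - math.log(sigma) - math.log(t) - 2.0 * _log1pexp(z)
        return -log_f
    return _log1pexp(z)  # equals -log S(t)

# At t = exp(mu) the survival probability is one half, whatever sigma is.
print(survival(math.exp(1.0), mu=1.0, sigma=0.5))
print(neg_log_likelihood(math.exp(1.0), delta=0, mu=1.0, sigma=0.5))
print(neg_log_likelihood(math.exp(1.0), delta=1, mu=1.0, sigma=0.5))
```

A censored patient contributes only through S(t), so the model is rewarded for predicting survival past the last follow-up without being told when the event would have occurred.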
Training and hyperparameters: Internal validation used ten-times-repeated stratified ten-fold cross-validation (100 splits). Early stopping based on c-index on a 10% validation subset per fold, with up to 2000 epochs (20 updates/epoch). Optimizers: Adam for CMR (lr 1e−3), covariate branch (5e−3), and ensemble (1e−2); SGD used during hyperparameter tuning (lr 0.01). Hyperparameter search with hyperopt (Parzen estimator), up to 300 iterations for covariate branch and 100 for CMR branch, selecting architecture maximizing Harrell’s c-index. Batch size set by GPU memory (Nvidia Titan RTX). Final model trained on all internal data; external testing performed once on the fixed model. Confidence intervals for external metrics estimated via cross-validation on the external dataset supplemented with internal data in training folds, ensuring external-only test folds.
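The ten-times-repeated stratified ten-fold scheme yields 100 train/test splits. A minimal standard-library sketch of such a splitter follows; it is an illustration only (libraries such as scikit-learn provide an equivalent), and the event count used in the example is hypothetical, since this summary does not report it.

```python
import random
from collections import defaultdict

def repeated_stratified_kfold(labels, k=10, repeats=10, seed=0):
    """Yield (train_idx, test_idx) pairs; every repeat reshuffles and re-partitions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    for _ in range(repeats):
        folds = [[] for _ in range(k)]
        offset = 0
        for members in by_class.values():
            shuffled = members[:]
            rng.shuffle(shuffled)
            # Deal each class round-robin so every fold keeps the class ratio.
            for j, idx in enumerate(shuffled):
                folds[(offset + j) % k].append(idx)
            offset += len(shuffled)
        for i in range(k):
            test = sorted(folds[i])
            train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
            yield train, test

# 156 patients as in the internal cohort; the 40/116 event split is made up.
labels = [1] * 40 + [0] * 116
splits = list(repeated_stratified_kfold(labels))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Stratifying on the event label matters with a small cohort: an unstratified fold of about 16 patients could easily contain too few events for the c-index to be meaningful.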
Evaluation metrics: Discrimination by Harrell’s c-index (risk score based on μ; μ is the mode of the log-time distribution, so exp(μ) is the median of the log-logistic), and calibration/discrimination by integrated Brier score (Bs), both IPCW-adjusted for censoring and computed up to 10 years unless specified. ROC and PR curves computed at multiple time points (years 2–9) with thresholds chosen via training-set F-score maximization or Youden’s J. External metrics were not covariate-adjusted, possibly underestimating performance.
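For readers unfamiliar with Harrell's c-index, the sketch below computes the plain, unadjusted version on a toy cohort. It does not include the IPCW adjustment the study applied, and using −μ as the risk score (a smaller predicted location means higher risk) is an assumption about how "risk score based on μ" is defined.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the higher-risk patient fails first."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair must be anchored on a patient with an observed event
        for j in range(n):
            if times[i] < times[j]:  # patient j was still event-free at times[i]
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy cohort: follow-up in years, event flags, and hypothetical predicted mu values.
times  = [2.0, 5.0, 7.0, 9.0]
events = [1, 0, 1, 0]
mu     = [0.5, 1.8, 2.5, 2.2]
risk   = [-m for m in mu]

c_index = harrell_c_index(times, events, risk)
print(c_index)  # 3 of 4 comparable pairs are ordered correctly -> 0.75
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives context for the reported ranges.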
Interpretability: Gradient-based sensitivity analyses. For images, a Grad-CAM-like approach adapted to regression: weights are averages of gradients of the location output with respect to last conv-layer feature maps, producing per-pixel gradient maps indicating contributions (sign and magnitude) to predicted μ. For covariates, gradients of μ with respect to each input covariate were averaged over all patients and stratified by SCDA vs non-SCDA.
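The covariate sensitivity analysis can be illustrated on a toy model. The sketch below swaps in central finite differences for the backpropagated gradients a neural network would provide, and `toy_mu` with its coefficients is a hypothetical stand-in for the covariate branch, not the trained model.

```python
import math

def toy_mu(x):
    """Hypothetical model: 3 standardized covariates (LVEF, QRS, infarct size) -> mu."""
    lvef, qrs, infarct = x
    return 1.0 + 0.8 * math.tanh(lvef) - 0.3 * qrs - 0.5 * infarct

def gradient(f, x, eps=1e-5):
    """Central finite-difference estimate of df/dx_k for each input k."""
    grads = []
    for k in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[k] += eps
        lo[k] -= eps
        grads.append((f(hi) - f(lo)) / (2.0 * eps))
    return grads

def mean_gradients(f, patients):
    """Average the per-patient gradients, covariate by covariate."""
    per_patient = [gradient(f, p) for p in patients]
    return [sum(col) / len(col) for col in zip(*per_patient)]

# Made-up standardized covariates, stratified by outcome as in the analysis above.
scda_patients     = [(-1.2, 0.9, 1.1), (-0.8, 1.4, 0.7)]
non_scda_patients = [(0.6, -0.3, -0.5), (1.0, -0.8, -0.9)]

scda_grads = mean_gradients(toy_mu, scda_patients)
non_scda_grads = mean_gradients(toy_mu, non_scda_patients)
print("SCDA:    ", [round(g, 3) for g in scda_grads])
print("non-SCDA:", [round(g, 3) for g in non_scda_grads])
```

A positive mean gradient means that raising the covariate lengthens the predicted time to event; a negative one means it shortens it. Because the model is non-linear, the averages can differ between the two strata.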
Key Findings
- Internal validation (LVSPSCD, n=156): c-index 0.82–0.89 over times up to 10 years; integrated Brier score (Bs) 0.04–0.12. AUROC at 10 years: 0.87 (95% CI 0.84–0.90). AUPR at 10 years: 0.93 (95% CI 0.91–0.95). High AUROC maintained at years 2–9.
- External test (PRE-DETERMINE/DETERMINE, n=113): c-index 0.71–0.77; Bs 0.03–0.14. AUROC at 10 years: 0.72 (95% CI 0.67–0.77). AUPR at 10 years: 0.73 (95% CI 0.68–0.78).
- CMR-only sub-network: Internal c-index 0.70 (95% CI 0.67–0.72), Bs 0.17 (95% CI 0.167–0.178); External c-index 0.63 (95% CI 0.59–0.66), Bs 0.19 (95% CI 0.186–0.200).
- Covariate modeling comparison (clinical covariates only): Neural-network feature extraction with Cox survival (covariate only, Cox) outperformed linear Cox PH: c-index 0.73 vs 0.58; balanced accuracy 0.65 vs 0.45; F-score 0.78 vs 0.69; Bs 0.14 vs 0.30 (all at t=10 years, internal cross-validation averages).
- Ensemble benefit: Combining CMR and covariate sub-networks significantly improved overall performance over covariate-only models, indicating complementary information in raw CMR beyond manual imaging features such as infarct size.
- Uncertainty modeling: Predicted scale parameter (σ) positively correlated with absolute prediction error across the internal cohort (Pearson r = 0.42, P < 0.001), demonstrating patient-specific uncertainty estimates that reflect prediction difficulty.
- Interpretability: Gradient maps showed regions where LGE intensities either increased or decreased predicted T_SCDA, indicating nuanced learned relationships beyond simple enhancement masks. Covariate gradients: top positive contributors to longer T_SCDA included higher CMR LVEF, β-blocker use, higher ECG heart rate, and digoxin use; top negative contributors included higher LV mass (ED), diuretic use, longer QRS duration, and larger infarct size (%).
Discussion
The SSCAR framework addresses the central challenge of individualized SCDA risk prediction by integrating deep learning feature extraction from raw LGE-CMR with non-linear modeling of clinical covariates within a principled survival analysis. The model estimates full patient-specific survival curves up to 10 years, providing both a most probable time-to-event (location) and a calibrated, patient-specific uncertainty (scale). This directly remedies limitations of prior models that provide fixed-horizon risk scores without uncertainty quantification. Results show strong internal discrimination and calibration that generalize with modest degradation to a heterogeneous, multi-center external cohort with different case mix and imaging protocols. Crucially, SSCAR using CMR alone surpasses a standard Cox model built on clinical covariates (including manual CMR features), and ensembling with covariates further boosts performance, confirming that learned imaging features capture prognostic information beyond manual descriptors. The uncertainty estimates correlate with prediction error, providing a self-assessment that could inform clinical decision-making by flagging low-confidence predictions. Interpretability analyses offer insight into how both image regions and clinical factors influence risk estimates, enhancing trust and potential clinical adoption.
Conclusion
SSCAR is a deep learning survival framework that predicts individualized survival curves for arrhythmic sudden death in ischemic heart disease by combining raw LGE-CMR images and clinical covariates. It achieves high discrimination and calibration internally and maintains good performance on an independent, multi-center external cohort. The approach advances beyond traditional models by providing patient-specific time-to-event estimates with explicit uncertainty and interpretable feature attributions. Future work could incorporate explicit competing risks with all-cause and non-arrhythmic death data, expand and refine clinical covariates, include right ventricular imaging and additional cardiomyopathies, and validate prospectively across broader populations and imaging platforms to facilitate clinical translation.
Limitations
Competing risks were not explicitly modeled due to lack of all-cause mortality and other competing event data (for example, revascularization), limiting direct estimation of cause-specific cumulative incidence. The covariate set was harmonized between cohorts and not comprehensive (for instance, diuretic subclasses merged; no ARNI data), which may affect covariate-only comparisons though standard LV imaging covariates were retained. Dataset size was relatively small for deep learning, raising overfitting concerns; mitigations included regularization, encoder–decoder reconstruction, augmentation, early stopping, and extensive cross-validation. Internal and external cohorts differ in distributions (e.g., LVEF severity, case-control design, number of imaging sites and protocols), which may contribute to performance differences. External metrics were not covariate-adjusted, potentially underestimating performance. Heterogeneity in multi-center CMR acquisition without artifact correction may introduce noise.