Medicine and Health

The predictive performance of artificial intelligence on the outcome of stroke: a systematic review and meta-analysis

Y. Yang, L. Tang, et al.

This systematic review and meta-analysis by Yujia Yang and colleagues finds that artificial intelligence models predict stroke outcomes with a pooled AUC of 0.872. Among the algorithms compared, SVM and XGBoost had the highest pooled AUCs, suggesting that AI tools could support physicians' decision-making in stroke prognosis.

Introduction

The study investigates how accurately artificial intelligence (AI) models predict outcomes after acute stroke, addressing the variability and uncertainty of clinician-based prognostication, especially for less experienced physicians. Stroke, particularly ischemic stroke, remains a leading cause of morbidity and mortality. While AI, including machine learning (ML) and deep learning (DL), has advanced diagnostic tasks, prognostic prediction is challenging due to the interplay of numerous clinical and patient-specific factors. Prior studies report mixed accuracy across algorithms, motivating a systematic review and meta-analysis to quantify overall predictive performance and compare algorithms using AUC.

Literature Review

Previous work shows heterogeneous performance across AI methods for stroke outcome prediction. Tree-based methods such as random forests achieved high AUCs (e.g., 0.936 in acute ischemic stroke; 0.917 in intracerebral hemorrhage) but on relatively small datasets with limited representativeness. SVM-based approaches using neuroimaging features reported AUCs from approximately 0.788 to 0.92. Deep neural networks demonstrated high accuracy in certain cohorts (e.g., AUC ≈ 0.904 in minor stroke; ≈0.88 in acute ischemic stroke) but suffer from limited interpretability. Traditional regression-based scores, while interpretable, often fail to capture complex nonlinearities; examples include AUCs of 0.808 for functional outcomes and 0.706 for survival from the Virtual International Stroke Trials Archive, and a biomarker-based CoRisk score AUC of 0.819 requiring copeptin measurement. These mixed findings underscore the need to synthesize evidence on AI models’ prognostic accuracy and compare performance by algorithm.

Methodology

- Design: Systematic review and meta-analysis conducted per PRISMA guidelines.
- Databases and search: PubMed, Embase, and Web of Science were searched from inception to February 2023 using terms including “acute stroke,” “artificial intelligence,” “deep learning,” “machine learning,” “prognosis,” and “outcome.”
- Eligibility: Cohort studies (retrospective or prospective) of patients diagnosed with acute ischemic or hemorrhagic stroke that reported prognostic outcomes (functional, radiologic, morbidity, or mortality).
- Index test: AI-based prognostic predictions.
- Reference standard: Recognized prognostic outcomes documented in each study.
- Primary outcome: AUC with 95% CI or standard error for AI models predicting stroke prognosis.
- Data extraction: Two independent reviewers extracted study characteristics (author, year, country), population details (age, sex), outcomes, and AI algorithms. Disagreements were resolved by a third reviewer.
- Quality assessment: Risk of bias and applicability were evaluated independently by two reviewers using the QUADAS tool.
- Statistical analysis: Pooled AUCs were calculated, primarily using a fixed-effects model. Heterogeneity was assessed with the Q statistic and I², with significant heterogeneity defined as p<0.05 or I²>50%. A leave-one-out sensitivity analysis tested the robustness of the pooled estimate.
- Software: MedCalc was used for statistical analyses.
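The statistical analysis described above can be illustrated with a minimal sketch of inverse-variance fixed-effects pooling, Cochran's Q, I², and a leave-one-out check. This is not the authors' MedCalc procedure, which may differ in detail; the function names `pool_fixed` and `leave_one_out` and all input values are hypothetical and chosen only for illustration.

```python
import math


def pool_fixed(aucs, ses):
    """Inverse-variance fixed-effects pooling of AUCs with Cochran's Q and I^2."""
    weights = [1.0 / se ** 2 for se in ses]          # weight = 1 / variance
    total = sum(weights)
    pooled = sum(w * a for w, a in zip(weights, aucs)) / total
    se_pooled = math.sqrt(1.0 / total)
    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(w * (a - pooled) ** 2 for w, a in zip(weights, aucs))
    df = len(aucs) - 1
    # I^2: share of variation attributed to heterogeneity, floored at 0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return {"auc": pooled, "se": se_pooled, "ci": ci, "q": q, "df": df, "i2": i2}


def leave_one_out(aucs, ses):
    """Re-pool after dropping each model in turn; returns the pooled AUCs."""
    return [
        pool_fixed(aucs[:i] + aucs[i + 1:], ses[:i] + ses[i + 1:])["auc"]
        for i in range(len(aucs))
    ]


# Hypothetical per-model AUCs and standard errors (NOT data from the review)
aucs = [0.85, 0.88, 0.90, 0.86]
ses = [0.020, 0.015, 0.030, 0.025]

result = pool_fixed(aucs, ses)
print(f"pooled AUC = {result['auc']:.3f}, I^2 = {result['i2']:.1f}%")
print("leave-one-out pooled AUCs:", [round(a, 3) for a in leave_one_out(aucs, ses)])
```

If the leave-one-out estimates stay close to the full pooled AUC, no single model is driving the result, which is the sense in which such an analysis tests robustness.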

Key Findings

- Included studies: 7 studies with 4,379 ischemic stroke participants and 17 AI models across five algorithm families (SVM, RF, LR, DL, XGBoost).
- Risk of bias: Low to moderate risk; overall low heterogeneity (I² = 27.67%).
- Overall predictive performance: Pooled AUC (fixed-effects) = 0.872 (95% CI 0.862–0.881).
- Subgroup AUCs:
  - Deep learning: 0.888 (95% CI 0.872–0.904)
  - Logistic regression: 0.852 (95% CI 0.835–0.869)
  - Random forest: 0.863 (95% CI 0.845–0.882)
  - Support vector machine: 0.905 (95% CI 0.857–0.952)
  - XGBoost: 0.905 (95% CI 0.805–1.000)
- Outcomes used in primary studies included modified Rankin Scale (mRS≤2 as good outcome), radiological biomarkers, follow-up lesion volume, and neurological deterioration.
- The findings indicate AI models perform well for ischemic stroke outcome prediction, with SVM and XGBoost showing the highest pooled AUCs among subgroups.
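When reading the subgroup results, it helps to look at interval width as well as the point estimate. The sketch below takes the AUCs and 95% CIs reported above, recovers an approximate standard error for each, and checks which intervals overlap the SVM interval. It assumes symmetric normal-approximation intervals, which is an assumption of this example rather than something the summary states; the XGBoost upper bound of 1.000 looks truncated, so its standard error is especially rough. The helper names are hypothetical.

```python
Z95 = 1.959964  # two-sided 95% normal quantile

# (AUC, lower, upper) as reported in the findings above
subgroups = {
    "Deep learning": (0.888, 0.872, 0.904),
    "Logistic regression": (0.852, 0.835, 0.869),
    "Random forest": (0.863, 0.845, 0.882),
    "Support vector machine": (0.905, 0.857, 0.952),
    "XGBoost": (0.905, 0.805, 1.000),
}


def se_from_ci(lower, upper, z=Z95):
    """Approximate SE, assuming a symmetric normal-approximation interval."""
    return (upper - lower) / (2 * z)


def intervals_overlap(a, b):
    """True if closed intervals a = (lo, hi) and b = (lo, hi) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


svm_ci = subgroups["Support vector machine"][1:]
for name, (auc, lo, hi) in subgroups.items():
    overlap = intervals_overlap((lo, hi), svm_ci)
    print(f"{name:24s} AUC={auc:.3f} SE~{se_from_ci(lo, hi):.4f} overlaps SVM CI: {overlap}")
```

Under these assumptions, the SVM and XGBoost intervals are several times wider than the others and overlap all of them, so the ranking of algorithms by point estimate should be read with caution.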

Discussion

This meta-analysis demonstrates that AI models provide good discriminatory ability for predicting outcomes after ischemic stroke, addressing the need for objective tools to support prognosis beyond clinician experience. Compared with traditional linear regression-based scores, AI approaches capture nonlinearities and interactions, achieving higher AUCs in many cases. Among algorithms, SVM and XGBoost showed the best pooled performance; DL also performed strongly but may face barriers in clinical adoption due to complexity and interpretability issues. The included models commonly used routinely available demographic, laboratory, and imaging features, enhancing clinical applicability and potential integration into electronic health records. Nonetheless, algorithm selection rationale was often unreported, and while accuracy is promising, interpretability and feasibility remain crucial for real-world use. The findings support further development and validation of AI-based prognostic tools in ischemic stroke, ideally with transparent modeling and external validation across diverse populations.

Conclusion

AI predictive models achieve high accuracy in forecasting outcomes after ischemic stroke and can aid clinicians in prognostication and treatment planning. SVM, XGBoost, and DL methods exhibit strong performance, surpassing many traditional models. Future research should focus on larger, multicenter datasets; comprehensive reporting of sensitivity/specificity alongside AUC; improved interpretability; external validation; and seamless implementation within clinical workflows and electronic health records.

Limitations

- Metric limitation: Meta-analysis focused on AUC due to incomplete reporting of sensitivity, specificity, and accuracy in primary studies.
- Scope: Systematic rather than exhaustive search may have missed relevant studies.
- Sample sizes: Included datasets were relatively small; only one study exceeded 1,000 participants, limiting generalizability and the ability to exploit high-dimensional predictors.
- Population: Findings primarily reflect ischemic stroke; many hemorrhagic stroke studies did not meet inclusion criteria.
- Model interpretability: Some high-performing models (e.g., DL) have limited transparency, affecting clinical adoption.
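To make the metric limitation concrete, the following sketch shows how AUC differs from sensitivity and specificity: AUC summarizes ranking quality across all thresholds, while sensitivity and specificity depend on one chosen cutoff. The labels, predicted probabilities, and the 0.5 threshold are hypothetical (1 stands for a good outcome such as mRS≤2); none of this is data from the included studies.

```python
def auc_rank(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one
    (ties count as half), i.e. the Mann-Whitney formulation."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sens_spec(y_true, scores, threshold):
    """Sensitivity and specificity when predicting positive if score >= threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical outcomes (1 = good outcome) and model probabilities
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.92, 0.81, 0.45, 0.40, 0.30, 0.65, 0.70, 0.10]

sensitivity, specificity = sens_spec(y_true, scores, threshold=0.5)
print(f"AUC = {auc_rank(y_true, scores):.3f}")
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f} at threshold 0.5")
```

Two models with the same AUC can yield different sensitivity and specificity at a clinically chosen cutoff, which is why reporting these alongside AUC matters.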

