Introduction
The COVID-19 pandemic caused significant morbidity and mortality. Early identification of high-risk individuals is crucial for resource allocation and treatment optimization. Machine learning offers a potential solution for predicting ICU admission and mortality in COVID-19 patients. While some studies have explored machine learning in predicting COVID-19 outcomes, a comprehensive comparison of various algorithms is lacking. This study aimed to evaluate the performance of 18 different machine learning algorithms in predicting ICU admission and mortality among COVID-19 patients using data from the Mass General Brigham (MGB) Healthcare database, with the goal of identifying the best prognostication algorithm and key predictive variables.
Literature Review
Machine learning models have shown promise in predicting various clinical outcomes, including acute kidney injury and septic shock. These tools have also been applied to predict outcomes in outpatients. However, few studies have specifically compared various machine learning algorithms for predicting ICU admission and mortality in COVID-19 patients. This study addresses this knowledge gap by systematically evaluating a wide range of algorithms and identifying key predictors.
Methodology
Data were obtained from the MGB Healthcare database and comprised 3597 patients from March to April 2020 (training dataset) and 1711 patients from May to August 2020 (temporal validation dataset). Eighteen machine learning algorithms were evaluated, spanning ensemble, Gaussian process, linear, naïve Bayes, nearest neighbor, support vector machine, tree-based, discriminant analysis, and neural network models. Missing data were imputed using the k-nearest neighbor algorithm. Model performance was assessed using F1 score, PR AUC, ROC AUC, balanced accuracy, and Brier score. Hyperparameters were tuned with stratified 5-fold cross-validation (StratifiedKFold). SHAP analysis was used to interpret the models and identify key predictive variables. The training dataset was balanced by randomly undersampling the majority class.
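The class-balancing step can be illustrated with a minimal sketch. This is not the study's code: the function name `random_undersample`, the toy data, and the choice to shrink every class to the size of the smallest one are assumptions made for illustration, and only the Python standard library is used.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Randomly drop samples from larger classes until all classes
    have as many samples as the smallest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    indices_by_class = {}
    for idx, label in enumerate(labels):
        indices_by_class.setdefault(label, []).append(idx)

    # Every class is reduced to the size of the smallest class.
    target = min(len(ix) for ix in indices_by_class.values())

    kept = []
    for ix in indices_by_class.values():
        kept.extend(rng.sample(ix, target))
    kept.sort()  # preserve the original sample order

    return [rows[i] for i in kept], [labels[i] for i in kept]


# Toy data: 8 patients without the outcome (0), 2 with it (1).
labels = [0] * 8 + [1] * 2
rows = [[i] for i in range(10)]

balanced_rows, balanced_labels = random_undersample(rows, labels, seed=42)
print(Counter(balanced_labels))
```

In this sketch the minority class is kept in full and only the majority class loses samples. Undersampling would normally be applied to the training data only, so that validation data keep their natural outcome rate.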
Key Findings
Ensemble-based models consistently outperformed other model types in predicting both 5-day ICU admission and 28-day mortality. For ICU admission, C-reactive protein (CRP), lactate dehydrogenase (LDH), and oxygen saturation were key predictors. For mortality, an eGFR below 60 ml/min/1.73 m² and the neutrophil and lymphocyte percentages were the key predictors. Although the performance of all models decreased in the temporal validation dataset, ensemble models remained the best performers. SHAP analysis revealed consistent key predictors for ICU admission across both datasets, whereas the predictors of mortality shifted somewhat: D-dimer and initial oxygen saturation became more important in temporal validation.
Discussion
This study provides a comprehensive comparison of various machine learning algorithms for predicting ICU admission and mortality in COVID-19. The superior performance of ensemble methods highlights their potential for improving clinical decision-making. The identified key predictive variables align with existing clinical knowledge and suggest additional factors to consider in risk stratification. The decline in model performance in the temporal validation dataset might be attributable to changes in clinical practices, evolution of the virus, or dataset imbalance. Future research should investigate these factors further. The study's findings could be used to develop clinical prediction tools to aid in the early identification of high-risk patients.
Conclusion
This study demonstrates the superior performance of ensemble-based machine learning models for predicting ICU admission and mortality in COVID-19 patients. Key predictive variables were identified, providing insights for risk stratification. While limitations exist, the findings highlight the potential of these models to augment clinical decision-making. Future research should focus on larger, more diverse datasets and longitudinal studies to enhance the generalizability and robustness of these predictive models.
Limitations
The study's limitations include potential data distortion from missing-value imputation, the use of the F1 score as a primary metric (which does not always have a clear intuitive explanation), and the use of SHAP analysis, which needs further adaptation for ensemble methods. Clinically, the study is limited by the unavailability of data on the disease course prior to ED presentation and by potential delays in the availability of laboratory results. The generalizability of the models may be limited by the study's focus on a single geographical region and hospital system.
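A small worked example may make the F1 score easier to interpret. It is the harmonic mean of precision (the share of predicted positives that are true positives) and recall (the share of actual positives that are detected). The helper `f1_from_counts` and the counts below are illustrative assumptions, not values from the study.

```python
def f1_from_counts(tp, fp, fn):
    """Return (precision, recall, F1) from confusion-matrix counts.

    tp: true positives, fp: false positives, fn: false negatives.
    Undefined ratios (zero denominators) are reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical model A: 3 of 4 true cases found, with 1 false alarm.
p_a, r_a, f1_a = f1_from_counts(tp=3, fp=1, fn=1)

# Hypothetical model B: no false alarms, but only 1 of 4 true cases found.
p_b, r_b, f1_b = f1_from_counts(tp=1, fp=0, fn=3)

print(p_a, r_a, f1_a)
print(p_b, r_b, f1_b)
```

Model B has perfect precision but low recall, and its F1 score (0.4) sits well below the arithmetic mean of the two (0.625). The harmonic mean penalizes a model that does well on only one of the two quantities, which is part of why a single F1 value can be hard to read without precision and recall reported alongside it.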