Introduction
The COVID-19 pandemic disproportionately affected patients with chronic comorbidities, increasing mortality and hospitalization duration. Early and accurate assessment of disease severity is crucial for resource allocation and improved patient outcomes. Machine learning (ML) algorithms offer a potential solution for rapid clinical evaluation. This study aimed to predict mortality risk and length of stay (LoS) in COVID-19 patients with chronic comorbidities using various ML algorithms. Previous studies have used ML to predict survival and LoS in various chronic diseases, demonstrating its potential for improved healthcare resource management and clinical decision support. However, many studies have employed standard biostatistical methods or focused on specific patient populations (e.g., ICU patients), leaving room for more comprehensive ML approaches incorporating broader patient demographics and clinical features. This research seeks to address these gaps by comparing the performance of several ML algorithms in predicting both mortality risk and LoS in a cohort of COVID-19 patients with a history of any chronic comorbidity, identifying the most impactful clinical variables in the process.
Literature Review
Numerous studies have explored the use of machine learning algorithms to predict mortality risk and length of stay (LoS) in COVID-19 patients. Some built predictive models from clinical and laboratory features using algorithms such as Lasso, linear support vector machine (SVM), and random forest, and reported varying levels of accuracy and sensitivity in predicting mortality and LoS. Others focused on specific populations, such as ICU patients, or explored how chronic comorbidities influence outcomes. While these studies provide valuable insights, there is a need for more comprehensive research that covers diverse chronic comorbidities and applies advanced ML techniques to improve prediction accuracy across different outcomes. This research builds on previous work by using a wider range of ML algorithms and a broader patient population to improve the predictive accuracy and clinical usefulness of these models.
Methodology
This retrospective study analyzed data from 1291 COVID-19 patients (900 alive, 391 dead) with at least one chronic comorbidity who were admitted to Afzalipour Hospital in Kerman, Iran, between March 2020 and January 2021. Patients under 18 years old and pregnant women were excluded. Data were extracted from medical records and included demographic information, chronic comorbidities, symptoms upon admission, discharge status (alive/dead), and LoS. Preprocessing involved handling missing values and standardizing features with a standard scaler. Feature selection used filtering methods to identify 26 important features. Several classification algorithms (Random Forest, Multilayer Perceptron (MLP), K-Nearest Neighbor (KNN), AdaBoost, Naïve Bayes, SVM) were used to predict mortality risk, and several regression algorithms (MLP, ElasticNet, support vector regression (SVR), Lasso, Ridge) were used to predict LoS. Ensemble learning methods were also applied to enhance model robustness. Classification performance was evaluated with accuracy, precision, recall, F1-score, AUC, and the ROC curve; regression performance was evaluated with MSE, RMSE, and MAE. Seventy percent of the data were used for training and the remaining 30% for testing. The study was approved by the ethics committee of Kerman University of Medical Sciences.
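The preprocessing and evaluation steps described above can be illustrated with a minimal, standard-library sketch. It is not the study's code: the summary does not say which software was used, and the function names here (`train_test_split`, `fit_standard_scaler`, `transform`, `classification_metrics`) are hypothetical. It also assumes a random 70/30 split and that the scaler is fitted on the training rows only, which is common practice but is not stated in the source.

```python
import math
import random


def train_test_split(rows, test_fraction=0.30, seed=0):
    """Shuffle row indices and split rows into (train, test) lists."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(rows) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return [rows[i] for i in train_idx], [rows[i] for i in test_idx]


def fit_standard_scaler(train_rows):
    """Return per-feature means and standard deviations of the training rows."""
    n = len(train_rows)
    n_features = len(train_rows[0])
    means = [sum(r[j] for r in train_rows) / n for j in range(n_features)]
    stds = []
    for j in range(n_features):
        variance = sum((r[j] - means[j]) ** 2 for r in train_rows) / n
        # A constant feature has zero spread; use 1.0 to avoid dividing by zero.
        stds.append(math.sqrt(variance) or 1.0)
    return means, stds


def transform(rows, means, stds):
    """Apply z = (x - mean) / std to every feature of every row."""
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = died)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Applied to a cohort of 1291 rows, this split yields 904 training and 387 test rows. Fitting the scaler on the training rows and reusing its means and standard deviations for the test rows keeps test-set statistics out of the model inputs.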
Key Findings
The study included 1291 patients. The most common symptoms were shortness of breath (53.6%), fever (30.1%), and cough (25.3%), and the most frequent comorbidities were diabetes mellitus (31.3%), hypertension (27.3%), and ischemic heart disease (14.2%). Twenty-six features were selected for analysis. For mortality risk prediction, the gradient boosting model achieved the best performance with 84.15% accuracy, outperforming other algorithms such as SVM and MLP. The most important factors for predicting mortality were hyperlipidemia, diabetes, asthma, and cancer. For length of stay prediction, the MLP model with a rectified linear unit (ReLU) activation function showed the best performance (MSE = 38.96), outperforming the other regression models. Shortness of breath was identified as the most significant predictor of LoS. Ensemble learning methods generally improved model performance. The ROC curves reported for the best-performing models demonstrated good discriminatory power. LIME was used to explain individual predictions and to illustrate feature importance for individual patients.
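Two quick calculations help put the headline numbers in context. The sketch below is illustrative arithmetic on figures quoted in this summary, not part of the study. It assumes the test split has roughly the same alive/dead proportions as the full cohort (the summary does not say), and that LoS is measured in days, so the RMSE would be in days as well. The helper names are hypothetical.

```python
import math


def majority_baseline_accuracy(n_alive, n_dead):
    """Accuracy of a trivial model that always predicts the more common class."""
    return max(n_alive, n_dead) / (n_alive + n_dead)


def rmse_from_mse(mse):
    """RMSE is the square root of MSE, expressed in the outcome's own units."""
    return math.sqrt(mse)


# Cohort counts and best LoS error quoted in the summary.
baseline = majority_baseline_accuracy(n_alive=900, n_dead=391)
rmse = rmse_from_mse(38.96)

print(f"majority-class baseline accuracy: {baseline:.2%}")
print(f"RMSE implied by MSE = 38.96: {rmse:.2f}")
```

Under these assumptions, always predicting "alive" would already score about 69.7% accuracy, so the reported 84.15% is a gain of roughly 14 percentage points over that trivial baseline. An MSE of 38.96 corresponds to an RMSE of about 6.24.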
Discussion
The findings demonstrate the potential of ML algorithms to predict mortality risk and LoS in COVID-19 patients with chronic comorbidities. The superior performance of gradient boosting for mortality risk and MLP for LoS prediction aligns with findings from other studies that highlight the effectiveness of these algorithms in similar contexts. The identified key predictors (hyperlipidemia, diabetes, asthma, and cancer for mortality; shortness of breath for LoS) underscore the critical role of chronic conditions and specific symptoms in shaping COVID-19 outcomes. The results support the use of ML for early identification of high-risk patients, enabling timely interventions and optimized resource allocation within healthcare systems. The accuracy achieved by the best mortality model (84.15%) points to potential clinical utility in improving patient care and reducing healthcare burdens during pandemics.
Conclusion
This study demonstrates the effectiveness of machine learning algorithms, specifically gradient boosting for mortality risk and MLP for length of stay, in predicting outcomes for COVID-19 patients with chronic comorbidities. The identified key predictors highlight the importance of both chronic conditions and clinical symptoms in shaping disease severity. These findings have important implications for resource allocation and clinical decision-making. Future research could incorporate additional data, such as laboratory and radiological biomarkers, to further improve prediction accuracy, and could investigate the long-term effects of COVID-19 in patients with chronic comorbidities.
Limitations
This study has two main limitations. First, the retrospective nature and single-center design may limit the generalizability of findings. While Afzalipour Hospital was a major COVID-19 treatment center, the patient population might not fully represent the diversity of patients across different geographical regions or healthcare settings. Second, the study did not include crucial prognostic factors like laboratory and radiological biomarkers. Although clinically relevant features were considered, incorporating these additional variables in future studies could enhance the predictive capabilities of the models and improve their clinical relevance.