Introduction
The COVID-19 pandemic significantly strained healthcare systems globally. Early identification of high-risk patients is crucial for resource allocation and intervention. Existing analyses of COVID-19 mortality risk factors often fail to account for changes in a patient's condition over time and for patient heterogeneity. This paper addresses these limitations by developing CovEWS, a real-time mortality risk prediction system that applies machine learning to a large, diverse EHR dataset from multiple healthcare institutions. Unlike existing scores, CovEWS provides early warnings with clinically meaningful predictive performance and updates its prediction as soon as new patient data arrive. This allows for timely intervention and improved resource allocation. The study also aimed to demonstrate CovEWS's superiority over existing risk scores and its robustness to changing treatment regimes.
Literature Review
Several approaches have been proposed to identify risk factors for COVID-19 mortality, often focusing on demographics and inflammatory markers. However, these approaches frequently lack longitudinal data and do not consider how risk factors change over time. Existing analyses also often rely on a single data source, which limits their generalizability. This study aimed to overcome these limitations by developing a model that incorporates multiple data sources, longitudinal risk factors, and real-time updates to accurately predict COVID-19 mortality risk.
Methodology
CovEWS was developed using de-identified EHR data from over 66,000 COVID-19 patients across 69 healthcare institutions, drawn from two federated networks: Optum and TriNetX. The data included demographics, clinical measurements, vital signs, lab tests, and diagnoses. Missing data were handled using multiple imputation by chained equations (MICE). The model was trained on a subset of the Optum data and evaluated on held-out Optum data and an external TriNetX cohort.
The core of CovEWS is a modified Cox proportional hazards model adapted to handle non-linear and time-varying effects of covariates. The model is a time-varying neural network, designed to handle changing risk factors and their nonlinear interactions. The baseline hazard is modeled using a neural network with a hidden layer, allowing for the capture of non-linear interactions between risk factors. The model uses automatic differentiation for gradient computation and an Adam optimizer for training. Hyperparameter optimization was performed using a systematic approach, and post-processing and calibration steps were implemented to ensure consistency with historical data.
The study compared CovEWS's performance to several baselines, including SOFA, CoVeR, and existing machine learning models. The model's interpretability was assessed using the integrated gradients (IG) method, which quantifies the relative influence of each risk factor on the predicted risk. Stratified survival analysis was conducted to evaluate the system's ability to stratify patients into risk groups over time. Robustness to changes in treatment policies was assessed using a future Optum cohort.
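To make the Cox-style training objective more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a one-hidden-layer network produces a scalar log-risk per patient and that the network is scored with the standard Cox negative log partial likelihood. For brevity it uses static covariates rather than the time-varying covariates described above, ignores tie corrections, and omits the Adam training loop. The function names, network sizes, and toy data are all hypothetical.

```python
import math
import random


def log_risk(x, W1, b1, w2):
    """One-hidden-layer network mapping a covariate vector to a scalar log-risk."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sum(w * h for w, h in zip(w2, hidden))


def neg_log_partial_likelihood(scores, times, events):
    """Cox negative log partial likelihood, averaged over observed events.

    For each patient i with an observed event, the contribution is
    score_i - log(sum of exp(score_j) over patients j still at risk at t_i).
    Censored patients (event == 0) only appear in the risk sets.
    """
    total = 0.0
    n_events = 0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue
        at_risk = [s for s, t in zip(scores, times) if t >= t_i]
        m = max(at_risk)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(s - m) for s in at_risk))
        total += scores[i] - log_denom
        n_events += 1
    return -total / max(n_events, 1)


# Toy usage with made-up data: 3 patients, 3 covariates, 4 hidden units.
random.seed(0)
n_features, n_hidden = 3, 4
W1 = [[random.gauss(0, 0.5) for _ in range(n_features)] for _ in range(n_hidden)]
b1 = [0.0] * n_hidden
w2 = [random.gauss(0, 0.5) for _ in range(n_hidden)]

patients = [[0.2, 1.0, -0.5], [1.5, 0.3, 0.8], [-0.7, 0.1, 0.4]]
times = [5.0, 2.0, 9.0]   # follow-up time per patient (hypothetical units)
events = [1, 1, 0]        # 1 = death observed, 0 = censored

scores = [log_risk(x, W1, b1, w2) for x in patients]
loss = neg_log_partial_likelihood(scores, times, events)
print(f"negative log partial likelihood: {loss:.4f}")
```

The loss decreases when patients who die earlier receive higher log-risk scores than those still at risk, which is the ranking behaviour a gradient-based optimizer such as Adam would push the network toward.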
Key Findings
CovEWS significantly outperformed existing risk scores in predicting COVID-19 mortality across prediction horizons of 1 to 192 hours on both the held-out Optum and the external TriNetX cohorts. At a sensitivity greater than 95%, CovEWS achieved a specificity ranging from 78.8% to 69.4% across different prediction horizons. The performance was consistent across various ethnic subgroups, though slightly lower in the Asian subgroup. Even for non-hospitalized patients, CovEWS demonstrated respectable predictive performance, although lower than for the overall cohort due to higher missingness of certain risk factors. Stratified survival analysis confirmed CovEWS's effectiveness in stratifying patients into risk groups with distinct mortality profiles, and this stratification remained consistent between the Optum and TriNetX cohorts. The model showed robustness to changes in treatment policies, evidenced by its performance on a future Optum cohort. Feature importance analysis, using integrated gradients, highlighted key clinical risk factors that influence the risk score.
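The "specificity at a sensitivity greater than 95%" figures can be read as follows: choose the risk-score threshold that still flags the required share of patients who died, then measure how many survivors fall below it. The sketch below illustrates that calculation for a single prediction horizon. It is an assumption about how such an operating point is typically computed, not the study's evaluation code, and the function name and toy scores are hypothetical.

```python
import math


def specificity_at_sensitivity(scores, labels, min_sensitivity=0.95):
    """Return (threshold, sensitivity, specificity) at the highest threshold
    whose sensitivity is at least min_sensitivity.

    A patient is flagged as high risk when score >= threshold.
    labels: 1 = died within the prediction horizon, 0 = survived.
    """
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    # Number of positives that must be flagged to reach the target sensitivity.
    needed = max(1, math.ceil(min_sensitivity * len(positives)))
    threshold = positives[needed - 1]

    sensitivity = sum(s >= threshold for s in positives) / len(positives)
    specificity = sum(s < threshold for s in negatives) / len(negatives)
    return threshold, sensitivity, specificity


# Toy usage with made-up risk scores for eight patients.
toy_scores = [0.9, 0.8, 0.3, 0.2, 0.7, 0.1, 0.4, 0.05]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
thr, sens, spec = specificity_at_sensitivity(toy_scores, toy_labels, min_sensitivity=1.0)
print(f"threshold={thr}, sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Fixing sensitivity first reflects the priority of an early warning system: missing a patient who goes on to die is treated as costlier than a false alarm, so specificity is reported as the price paid for that guarantee.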
Discussion
CovEWS addresses the need for a real-time, accurate, and interpretable system for predicting COVID-19 mortality risk. Its superior performance compared to existing methods highlights the advantages of using a time-varying neural network approach and incorporating a comprehensive set of risk factors. The ability to provide early warnings enables timely interventions, potentially preventing or mitigating mortality and optimizing resource allocation. The model's robustness across different datasets and changing treatment protocols demonstrates its generalizability and clinical utility. The interpretability of the model allows clinicians to understand the factors contributing to each patient's risk score, facilitating better clinical decision-making. Future research should focus on integrating CovEWS into clinical workflows and investigating its impact on patient outcomes and resource utilization.
Conclusion
CovEWS is a novel, real-time early warning system for COVID-19 mortality prediction, showing superior performance compared to existing methods. Its accuracy, interpretability, and robustness make it a valuable tool for clinical decision-making and resource allocation. Future work should explore its implementation in real-world clinical settings and its impact on patient outcomes.
Limitations
The study's limitations include potential biases inherent in EHR data, such as missing data and variations in data quality across healthcare institutions. Patient data on do-not-resuscitate (DNR) status were not available. Furthermore, the model's performance might be influenced by variations in data collection methodologies and treatment protocols across different healthcare systems. The impact of social determinants of health on mortality prediction is not fully addressed.