Artificial intelligence sepsis prediction algorithm learns to say “I don’t know”

S. P. Shashikumar, G. Wardi, et al.

COMPOSER is a deep learning model for early sepsis prediction developed by Supreeth P. Shashikumar, Gabriel Wardi, Atul Malhotra, and Shamim Nemati. It reduces false alarms by flagging cases it cannot assess reliably, while still giving early warnings for patients at risk of sepsis.

Introduction
Sepsis, a life-threatening organ dysfunction caused by a dysregulated host response to infection, accounts for a significant number of hospital deaths. While effective treatment protocols exist, early and reliable detection remains a challenge. The increasing use of electronic health records (EHRs) has spurred the development of machine learning-based sepsis prediction tools. However, existing models often suffer from limitations such as poor generalizability across institutions, high false alarm rates, and the risk of automation bias (over-reliance on system output). These limitations are exacerbated by data distribution shifts (encountering unfamiliar patients), variations in data missingness, and differences in hospital workflows. This study addresses these challenges by proposing COMPOSER (COnformal Multidimensional Prediction Of SEpsis Risk), a deep learning model designed to improve generalizability and reduce false alarms by incorporating a conformal prediction network. This network statistically determines conformity with a predefined collection of representations (conformal sets), essentially establishing the algorithm's 'conditions for use' under various scenarios. COMPOSER aims to predict sepsis onset 4–48 hours prior to clinical suspicion.
Literature Review
The paper's introduction reviews existing work on sepsis prediction models and highlights three recurring problems: poor generalizability, high false alarm rates, and automation bias. It cites several studies that use machine learning for sepsis detection and prediction, noting that most rely on data from a single hospital or healthcare system. Existing algorithms lack a built-in mechanism for detecting outliers and for establishing their 'conditions for use' across settings. This gap motivates COMPOSER, which targets better generalizability and fewer false alarms through a novel prediction scheme.
Methodology
COMPOSER consists of three modules: a weighted input layer, a conformal prediction network, and a sepsis predictor. The weighted input layer scales clinical variables based on the time since last measurement, mimicking how clinicians prioritize recent data. An encoder network then reduces the dimensionality of the data, creating representations that are robust to missingness and institutional variations. The conformal prediction network, the core of the approach, statistically assesses whether a new patient's features conform to the distributions of the training data (the conformal sets). This determines whether the model is applicable to the new data. If conformity is established, the features are passed to the sepsis predictor (a feed-forward neural network followed by logistic regression) to generate a risk score. If not, the prediction is labelled 'indeterminate'. The model was trained and validated on six patient cohorts (515,720 patients) from two academic medical centers, with internal and external validation sets and a temporal validation to assess performance over time. Performance was evaluated using AUC, PPV, NPV, specificity, and false alarms per patient hour (FAPH), over a clinically relevant prediction window (4–48 hours before sepsis onset) and with a 6-hour silencing period after each alarm.
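The gating step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the nonconformity measure (Euclidean distance to the nearest member of the conformal set), the significance level `epsilon`, the toy risk function, and all function names are assumptions made for illustration. The real model also operates on learned encoder representations rather than raw two-dimensional points.

```python
import math
from typing import Callable, List, Optional, Tuple

Vector = Tuple[float, ...]


def nearest_distance(x: Vector, reference: List[Vector]) -> float:
    """Euclidean distance from x to its nearest neighbour in reference."""
    return min(math.dist(x, r) for r in reference)


def calibration_scores(conformal_set: List[Vector]) -> List[float]:
    """Leave-one-out nonconformity score for every member of the set."""
    scores = []
    for i, x in enumerate(conformal_set):
        others = conformal_set[:i] + conformal_set[i + 1:]
        scores.append(nearest_distance(x, others))
    return scores


def conformity_p_value(x: Vector, conformal_set: List[Vector],
                       scores: List[float]) -> float:
    """Share of calibration scores at least as large as the new score."""
    new_score = nearest_distance(x, conformal_set)
    at_least = sum(1 for s in scores if s >= new_score)
    return (at_least + 1) / (len(scores) + 1)


def gated_prediction(x: Vector, conformal_set: List[Vector],
                     scores: List[float],
                     risk_model: Callable[[Vector], float],
                     epsilon: float = 0.05) -> Optional[float]:
    """Return a risk score, or None ('indeterminate') if x does not conform."""
    if conformity_p_value(x, conformal_set, scores) < epsilon:
        return None  # outside the assumed 'conditions for use'
    return risk_model(x)


def toy_risk_model(x: Vector) -> float:
    """Hypothetical stand-in for the sepsis predictor (logistic output)."""
    return 1.0 / (1.0 + math.exp(-(sum(x) - 4.0)))


# Hypothetical conformal set: a 5 x 5 grid of 2-D representations.
grid = [(float(i), float(j)) for i in range(5) for j in range(5)]
grid_scores = calibration_scores(grid)

print(gated_prediction((2.5, 2.5), grid, grid_scores, toy_risk_model))
print(gated_prediction((100.0, 100.0), grid, grid_scores, toy_risk_model))  # None
```

In this sketch the first query lies inside the grid and receives a risk score, while the second lies far from every stored representation and is returned as indeterminate instead of being scored.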
Key Findings
COMPOSER consistently achieved high AUCs across all six cohorts (ICU: 0.925–0.953; ED: 0.938–0.945). Internal testing showed AUCs of 0.953 and 0.945 for the ICU and ED cohorts, respectively, with low false alarm rates. External validation on Hospital-B data demonstrated a significant reduction in false alarms (85.5% in ICU and 77.9% in ED) compared to a baseline feedforward neural network, while maintaining superior AUC and PPV. The conformal prediction network flagged indeterminate cases, with a higher percentage of indeterminate predictions among non-septic patients; 75–86% of prediction windows satisfied the conditions for use. Patient-wise analysis showed that conformal prediction had only a minimal deleterious effect on sensitivity. Temporal validation showed consistent performance over time, with only a small reduction in AUC and PPV. Compared to a commercially available sepsis prediction model (ESPM), COMPOSER produced significantly fewer false alarms and significantly better lead times before clinical suspicion of sepsis. The model exhibited high negative predictive values across all cohorts (98.8–99.1% in ICU, 99.5–99.6% in ED), indicating low missed-detection rates. Analysis of false alarms revealed that many were associated with patients presenting with other critical conditions.
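For readers less familiar with the reported metrics, the sketch below shows how PPV, NPV, and false alarms per patient hour can be computed. The counts and alarm times are made-up illustration values, not study data. The treatment of the silencing period (an alarm is kept only if at least 6 hours have passed since the last kept alarm) is an assumption about how such a rule could be applied.

```python
from typing import Dict, List


def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard metrics derived from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),  # share of alarms that were true sepsis
        "npv": tn / (tn + fn),  # share of non-alarms that were truly non-septic
    }


def apply_silencing(alarm_hours: List[float],
                    silence_hours: float = 6.0) -> List[float]:
    """Drop alarms that fire within silence_hours of the last kept alarm."""
    kept: List[float] = []
    for t in sorted(alarm_hours):
        if not kept or t - kept[-1] >= silence_hours:
            kept.append(t)
    return kept


def false_alarms_per_patient_hour(false_alarm_count: int,
                                  total_patient_hours: float) -> float:
    """FAPH: false alarms divided by total monitored patient hours."""
    return false_alarm_count / total_patient_hours


# Made-up counts for illustration only.
metrics = confusion_metrics(tp=40, fp=60, tn=880, fn=20)
print(metrics["ppv"], metrics["npv"])

# Raw alarm times (hours since admission) for one hypothetical non-septic patient.
kept_alarms = apply_silencing([1, 3, 6.5, 7, 8, 14])
print(kept_alarms)  # [1, 7, 14]
print(false_alarms_per_patient_hour(len(kept_alarms), 48.0))
```

The example also shows why a high NPV and a modest PPV can coexist: when sepsis is rare, true negatives dominate the counts, so NPV stays high even when a majority of alarms are false.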
Discussion
COMPOSER's strong performance across diverse cohorts demonstrates its generalizability and robustness to data distribution shifts. The integration of conformal prediction addresses the issue of false alarms by explicitly defining the model's 'conditions for use'. The high NPV mitigates concerns about missed detections and automation bias. The results highlight the importance of considering both false positives and false negatives when developing and evaluating clinical AI models. The ability to flag indeterminate cases is a significant improvement over existing systems, reducing the risk of misinterpreting model outputs. Future work will focus on improving actionability by further categorizing indeterminate cases and developing a refined algorithm verification process. Additional research could extend the model to care settings beyond the ICU and ED, incorporate higher-resolution data from various sources, and conduct prospective clinical trials to further validate COMPOSER's real-time performance.
Conclusion
COMPOSER represents a significant advancement in sepsis prediction, offering high accuracy, reduced false alarms, and improved generalizability. The incorporation of conformal prediction provides a robust mechanism for handling out-of-distribution data and managing the risk of automation bias. Future research should explore refinements to enhance actionability and expand the model's applicability to different clinical settings.
Limitations
The study focused on ICU and ED settings, limiting generalizability to other care units. The gold standard for sepsis definition has limited temporal resolution, and competing risk factors may influence diagnostic certainty. The reliance on retrospective data and the sample-and-hold approach with mean imputation for missing data might affect the generalizability of findings. Future prospective clinical trials are needed for further validation.
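As a brief illustration of the imputation strategy mentioned above, the sketch below carries the last observed value forward (sample-and-hold) and falls back to a mean value before the first measurement. The function name, the use of `None` for missing entries, and the example values are assumptions for illustration; the study's actual preprocessing may differ.

```python
from typing import List, Optional


def sample_and_hold(series: List[Optional[float]],
                    fallback_mean: float) -> List[float]:
    """Forward-fill missing values; use fallback_mean before the first one."""
    filled: List[float] = []
    last: Optional[float] = None
    for value in series:
        if value is not None:
            last = value  # a new measurement replaces the held value
        filled.append(last if last is not None else fallback_mean)
    return filled


# Hypothetical hourly heart-rate series with gaps (None = not measured).
heart_rate = [None, None, 88.0, None, 92.0, None]
print(sample_and_hold(heart_rate, fallback_mean=80.0))
# [80.0, 80.0, 88.0, 88.0, 92.0, 92.0]
```

Held values can become stale when measurements are sparse, which is one reason the imputation choice is listed as a limitation.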