Medicine and Health

Impact of a deep learning sepsis prediction model on quality of care and survival

A. Boussina, S. P. Shashikumar, et al.

This study examined the effect of the deep-learning model COMPOSER on sepsis outcomes in emergency departments. Conducted by Aaron Boussina and colleagues at the University of California San Diego, it reports significant improvements in sepsis mortality and care compliance.

Introduction

The study addresses the challenge of early recognition of sepsis, a condition affecting tens of millions globally with high mortality. Early interventions such as fluids, antibiotics, and source control are more effective when initiated sooner, yet heterogeneity in presentation makes timely detection difficult. The authors previously developed COMPOSER, a deep-learning model that predicts sepsis onset using real-time EHR data and reduces false alarms via conformal prediction. The research question/hypothesis is whether real-time deployment of COMPOSER, operationalized as a nurse-facing Best Practice Advisory (BPA), is feasible and associated with improved patient-centered outcomes (mortality, organ dysfunction) and care quality (bundle compliance) in emergency departments. A quasi-experimental before-and-after design with historical controls was used to assess impact while accounting for acuity, comorbidities, seasonal effects, and secular trends.

Literature Review

Prior sepsis detection systems based on clinical criteria (e.g., SIRS, hypotension) have shown variable improvements in process measures but limited impact on patient-centered outcomes and often suffer from poor positive predictive value (PPV), contributing to alarm fatigue and provider mistrust. Proprietary EHR-embedded models like the Epic Sepsis Score have exhibited inconsistent performance across institutions with high false positive rates. Recent machine learning approaches (e.g., TREWS) have demonstrated associations with reduced mortality and organ failure when alerts are acknowledged by clinicians, though many studies are non-randomized. A small ICU/ward RCT reported mortality and LOS benefits but was limited in scope. Few deep-learning models have been prospectively evaluated in ED settings, and patient-centered outcomes following deployment have rarely been reported. COMPOSER is positioned to address these gaps through deep learning with conformal prediction to reject out-of-distribution inputs and reduce false alarms, potentially improving clinician trust and adherence.

Methodology

Design: Prospective before-and-after quasi-experimental study at two UC San Diego Health emergency departments (one quaternary academic center, one urban safety-net hospital). IRB approval with waiver of consent (#805726) and ACQUIRE approval (#609); STROBE-compliant.
Cohort: Adult patients (≥18 years) meeting Sepsis-3 criteria within 12 h of ED stay between 01/01/2021 and 04/30/2023. Exclusions: transitioned to comfort measures before sepsis onset; sepsis onset after 12 h of hospital admission. Final N=6217 (pre-intervention: 5065; post-intervention: 1152). Baseline characteristics were similar between periods.
Sepsis identification: Sepsis onset defined per Sepsis-3 using suspicion of infection (blood culture plus ≥4 days of non-prophylactic IV antibiotics within specified windows) and organ dysfunction (SOFA increase ≥2) occurring from 48 h before to 24 h after suspected infection; onset time taken as time of clinical suspicion of infection. Data sourced from Epic Clarity via SQL.
Intervention (Exposure): Nurse-facing BPA triggered by COMPOSER sepsis risk score written to EHR flowsheet; alert shown on chart open to ED nurses for adult patients above a fixed threshold; exclusions: discharged/deceased, comfort care, no longer under ED nursing care, prior sepsis bundle instituted. Acknowledgement options: (i) No infection suspected (8 h lockout), (ii) Sepsis treatment/workup in progress (12 h), (iii) Will notify MD immediately (12 h). Secure chat enabled for provider notification. Pre-intervention: 01/01/2021–12/06/2022; Go-live: 12/07/2022; Post: 12/07/2022–04/30/2023.
COMPOSER algorithm: Feed-forward neural network using demographics, comorbidities, medications, vital signs, and labs to predict sepsis within 4 h. Conformal prediction used to flag out-of-distribution cases as indeterminate to reduce false alarms. ED AUROC reported previously at 0.938–0.945. Threshold fixed to 80% sensitivity; prior PPV 20.1% at that sensitivity.
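The alert gating described under Intervention (Exposure) can be sketched roughly as follows. This is an illustrative sketch, not the study's implementation: the names `PatientState`, `should_fire_bpa`, and `LOCKOUT_HOURS`, the field names, and the example threshold value are all assumptions. Only the exclusion criteria, the indeterminate flag from conformal prediction, and the 8 h / 12 h / 12 h lockout windows come from the summary above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Lockout hours per acknowledgement option, as listed in the summary.
# The dictionary keys are hypothetical labels chosen for this sketch.
LOCKOUT_HOURS = {
    "no_infection_suspected": 8,
    "treatment_in_progress": 12,
    "will_notify_md": 12,
}


@dataclass
class PatientState:
    """Hypothetical snapshot of what the alert logic needs to know."""
    age: int
    under_ed_nursing_care: bool = True
    discharged_or_deceased: bool = False
    comfort_care: bool = False
    sepsis_bundle_started: bool = False
    last_ack: Optional[str] = None            # key into LOCKOUT_HOURS
    last_ack_time: Optional[datetime] = None


def should_fire_bpa(state: PatientState, risk_score: float,
                    indeterminate: bool, now: datetime,
                    threshold: float) -> bool:
    """Return True if the nurse-facing alert would be shown on chart open."""
    # Cases flagged as out-of-distribution are treated as indeterminate
    # and never alert (assumed handling of the conformal-prediction flag).
    if indeterminate:
        return False
    # Adults only.
    if state.age < 18:
        return False
    # Exclusions listed in the summary.
    if (state.discharged_or_deceased or state.comfort_care
            or not state.under_ed_nursing_care
            or state.sepsis_bundle_started):
        return False
    # The score must be above the fixed threshold.
    if risk_score <= threshold:
        return False
    # Suppress the alert while a previous acknowledgement's lockout is active.
    if state.last_ack is not None and state.last_ack_time is not None:
        lockout = timedelta(hours=LOCKOUT_HOURS[state.last_ack])
        if now < state.last_ack_time + lockout:
            return False
    return True


# Example with a placeholder threshold of 0.6 (not a value from the study).
t0 = datetime(2023, 1, 1, 8, 0)
patient = PatientState(age=54)
print(should_fire_bpa(patient, 0.9, False, t0, threshold=0.6))
```

Keeping the exclusions and lockout checks in one pure function makes the behavior easy to test against a silent-mode log before any alert reaches a nurse.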
Deployment: COMPOSER is hosted on an AWS-based platform that uses HL7v2 ADT streams and FHIR APIs for hourly data extraction; it writes the risk score and top features to the EHR flowsheet to trigger the BPA.
Implementation framework: EPIS (exploration, preparation, implementation, sustainment). Activities included nursing surveys, education, iterative BPA design, and a silent-mode trial with physician review for tuning. A data quality dashboard monitored feature medians against control limits, with biweekly checks of sensitivity and PPV; a predetermined change control plan (PCCP) defined retraining if performance degraded.
Outcomes: Primary: in-hospital mortality. Secondary: sepsis bundle compliance (initial and repeat lactate if initial >2 mmol/L, blood cultures prior to antibiotics, IV antibiotics within 3 h of sepsis time, 30 mL/kg crystalloid within 3 h for septic shock/hypotension), 72-h change in SOFA after sepsis onset, ICU admission, and ICU-free days (30 minus ICU days; 0 for deaths or ICU >29 days).
Statistical analysis: Descriptive statistics; Kruskal–Wallis for continuous and chi-squared for categorical variables (alpha 0.05). Causal inference via Bayesian structural time-series (state-space with local linear trend and static spike-and-slab regression on covariates) to estimate counterfactual post-intervention outcomes. Covariates: ED volume, sex, baseline SOFA, Elixhauser comorbidity score, age, COVID-19 status, ED site, season, local trends. Monthly resolution; 1000 MCMC samples; residual diagnostics (QQ and autocorrelation). An additional adjusted linear regression assessed the association between BPA acknowledgement reason and time-to-antibiotics.
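The ICU-free days outcome defined above (30 minus ICU days; 0 for deaths or ICU stays longer than 29 days) can be written as a small function. This is a minimal sketch: the function name `icu_free_days` and its parameters are hypothetical, and it assumes "deaths" means in-hospital deaths.

```python
def icu_free_days(icu_days: float, died_in_hospital: bool) -> float:
    """ICU-free days over a 30-day window, per the definition in the summary.

    Assumption: `died_in_hospital` stands in for the "deaths" condition;
    the summary does not specify the exact death window used.
    """
    if died_in_hospital or icu_days > 29:
        return 0
    return 30 - icu_days


# A survivor with 4 ICU days versus a patient who died after 4 ICU days.
print(icu_free_days(4, died_in_hospital=False))
print(icu_free_days(4, died_in_hospital=True))
```

Assigning 0 to deaths keeps a short ICU stay that ends in death from counting as a favorable outcome.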

Key Findings

Population: 6217 septic ED patients (pre 5065; post 1152). Baseline demographics and severity similar between periods (e.g., median SOFA at sepsis 2; median Elixhauser 5).
Alerts: The post-intervention period generated an average of 235 alerts/month (1.65 alerts per nurse per [text truncated in source]).
Primary outcome (mortality): Post-intervention in-hospital sepsis mortality 9.49% vs expected counterfactual 11.39% (95% CI 9.79–13.00), absolute decrease 1.9% (17% relative), Bayesian one-sided p=0.014; estimated 22 additional survivors over 5 months. Site-level analysis: significant mortality decrease at the safety-net hospital; no significant change at the quaternary center.
72-h SOFA change: Actual 3.56 vs expected 3.71 (95% CI 3.58–3.83), ~4% reduction; p=0.013.
Sepsis bundle compliance: Post 53.42% vs expected 48.38% (95% CI 45.46–51.01), absolute +5.0% (relative +10% [95% CI 5–16%]); significant increases in specific elements including antibiotics timing, repeat lactate, and 30 mL/kg fluids.
Additional process/outcome measures (Table 2):
• Blood cultures before antibiotics: actual 73.9% vs expected 72.0% (69.9–73.9%).
• Antibiotics within 24 h prior to and 3 h after severe sepsis onset: actual 84.6% vs expected 82.8% (81.3–84.4%).
• Lactate within 6 h prior/3 h after: actual 85.6% vs expected 83.4% (81.3–85.8%).
• Repeat lactate if initial elevated: actual 98.6% vs expected 97.3% (96.2–98.4%).
• Vasoactive meds within 6 h of septic shock: actual 55.5% vs expected 57.5% (46.7–68.2%).
• 30 mL/kg fluids within 3 h for shock/hypotension: actual 59.3% vs expected 53.9% (48.9–58.8%).
• ICU transfer rate: actual 31.8% vs expected 32.5% (30.7–34.2%).
• ICU-free days: actual 25.6 vs expected 25.1 (24.6–25.6).
Trends in ICU transfers (down) and ICU-free days (up) did not reach statistical significance.
Mechanism signal: In adjusted analysis of time-to-antibiotics, nurse acknowledgement “Will Notify MD Immediately” was associated with a significant reduction in time from ED triage to antibiotics (coefficient −19.95 h; p=0.002). “Sepsis Treatment/Workup in Progress” was also associated with reduced time (−19.16 h; p=0.010).
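The headline effect sizes above follow from the reported actual and expected (counterfactual) values. The short sketch below reproduces that arithmetic; the helper name `effect` is hypothetical, and the script only re-derives the absolute and relative differences and the survivor estimate from the figures in this summary. It does not reproduce the Bayesian structural time-series model.

```python
def effect(actual: float, expected: float) -> tuple[float, float]:
    """Return (absolute difference, relative difference vs. expected)."""
    absolute = actual - expected
    relative = absolute / expected
    return absolute, relative


# Figures taken from the Key Findings above (percentages or SOFA points).
mortality_abs, mortality_rel = effect(9.49, 11.39)   # in-hospital mortality, %
sofa_abs, sofa_rel = effect(3.56, 3.71)              # 72-h SOFA change
bundle_abs, bundle_rel = effect(53.42, 48.38)        # bundle compliance, %

# Additional survivors: absolute mortality reduction applied to the
# 1152 post-intervention patients.
post_n = 1152
survivors = -mortality_abs / 100 * post_n

print(f"Mortality: {mortality_abs:+.2f} points ({mortality_rel:+.1%} relative)")
print(f"72-h SOFA change: {sofa_abs:+.2f} ({sofa_rel:+.1%} relative)")
print(f"Bundle compliance: {bundle_abs:+.2f} points ({bundle_rel:+.1%} relative)")
print(f"Estimated additional survivors: {survivors:.1f}")
```

The relative figures are computed against the counterfactual, not against the pre-intervention rate, which is why the 1.9-point mortality decrease corresponds to roughly 17%.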

Discussion

The deployment of the COMPOSER-driven nurse-facing BPA was associated with improved patient-centered outcomes (lower in-hospital mortality, reduced 72-h SOFA increase) and process measures (higher sepsis bundle compliance). These findings support the hypothesis that real-time deep-learning sepsis prediction integrated into ED workflows can facilitate earlier recognition and treatment, particularly antibiotics, which is a plausible mechanism for reduced mortality and organ dysfunction. The strategy of routing alerts to nurses, enabling rapid provider notification via secure chat, may have enhanced situational awareness and mitigated alert fatigue, aided by COMPOSER’s conformal prediction that rejects out-of-distribution cases to reduce false positives. Comparisons with prior literature (e.g., TREWS) indicate that successful implementation and clinician engagement are critical for translating predictive analytics into outcome improvements. Site-level differences suggest local context influences effect size, emphasizing the role of setting and implementation fidelity.

Conclusion

In a two-ED, before-and-after quasi-experimental study, real-time deployment of the deep-learning COMPOSER model, operationalized via a nurse-facing BPA, was associated with a significant absolute reduction in in-hospital mortality (−1.9%), a significant absolute increase in sepsis bundle compliance (+5.0%), and a reduction in 72-h SOFA change. These results represent, to the authors’ knowledge, the first report of improved patient-centered outcomes following ED deployment of a deep-learning sepsis prediction model. Future work should include multicenter randomized trials, external validation across diverse hospital settings and populations, assessment of long-term sustainability, and evaluation of impacts on non-septic patients and resource utilization.

Limitations

Non-randomized, before-and-after design limits causal inference despite causal impact analysis with confounder adjustment.
Conducted at two EDs within a large academic health system with strong sepsis and informatics programs; generalizability to community hospitals or settings without robust IT infrastructure may be limited.
Potential immediate awareness effects from an abrupt intervention; durability and sustainability over longer periods remain uncertain and may be affected by fatigue/complacency; ongoing education likely necessary.
No concurrent control sites during the same period; all EDs used the model; however, no other QI initiatives occurred during the intervention window.
Did not assess effects on patients who were ultimately non-septic (e.g., potential antibiotic overuse, adverse effects, costs).
