Introduction
Antimicrobial resistance (AMR) is a global public health crisis. As AMR erodes the effectiveness of antibiotics, it becomes harder to choose empiric treatments for hospitalized patients that are both effective and minimize collateral resistance. Inappropriate empiric treatment contributes significantly to AMR prevalence. Despite guidelines and stewardship initiatives, bug-drug mismatch remains high. Ciprofloxacin, a widely used fluoroquinolone antibiotic, exemplifies this issue: its widespread use has led to increased resistance, impeding effective therapy. However, sensitivity to quinolones can recover quickly when consumption decreases, which highlights the importance of minimizing unnecessary use. Machine learning (ML) offers a potential solution, because it can analyze electronic medical records (EMRs) to predict resistance and so support clinicians' decisions on empiric therapy. Existing ciprofloxacin resistance prediction models are limited in scope, often focusing on specific patient subsets, infections, or healthcare settings. This study aimed to develop an ensemble ML model to predict ciprofloxacin resistance in hospitalized patients using EMRs, incorporating hospital-wide resistance frequencies as variables. The models were applied in two settings: with and without knowledge of the infecting bacterial species.
Literature Review
The existing literature highlights the growing problem of ciprofloxacin resistance and the need for improved methods to predict it. Studies have explored various approaches, including statistical models and machine learning techniques, but many have limitations in terms of data scope, generalizability, and inclusion of relevant contextual factors such as local resistance patterns within a hospital. Some research has focused on specific infections or patient populations, limiting the broad applicability of the findings. This study builds upon previous work by addressing these limitations through a more comprehensive dataset and a novel approach to incorporate relevant hospital-level resistance data.
Methodology
Data were collected from the EMRs of patients at Meir Medical Center in Israel (serving ~600,000 residents) who had positive bacterial cultures tested for ciprofloxacin susceptibility between 2016 and 2019. The data included demographics, functional status, antibiotic usage, hospitalization history, bacterial pathogen, and susceptibility results. VITEK 2 and disk diffusion were used for susceptibility testing. Intermediate resistance was considered resistant. Additional features related to previous resistant infections, antibiotic usage, and hospitalizations were engineered. The final dataset contained 10,053 susceptibility test results from 5,540 patients and 73 variables. Two datasets were created: bacteria-gnostic (with bacterial species information) and bacteria-agnostic (without bacterial species information). A time-based train-test split (75%/25%) was used to minimize data leakage. An ensemble model was developed using four base learners: LASSO-penalized logistic regression, random forest, gradient-boosted trees, and neural networks. Hyperparameter optimization was performed using 200 random searches with five-fold time series cross-validation. A stacking technique (super learner) combined the base learners' predictions to improve overall prediction. Model performance was evaluated using ROC-AUC, with confidence intervals calculated using bootstrapping. SHAP values were used for model interpretability. A decision curve analysis evaluated the net benefit of using the model for clinical decision-making.
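The time-based split and the bootstrapped ROC-AUC confidence interval described above can be illustrated with a minimal sketch. This is not the study's code: the record layout (`date`, `resistant`, `score` keys), the function names, the percentile-bootstrap variant, and the toy data are all assumptions made for illustration, and only the Python standard library is used.

```python
import random
from datetime import date

def time_based_split(records, train_fraction=0.75):
    """Sort by date and cut once, so no test record predates a training record."""
    ordered = sorted(records, key=lambda r: r["date"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]

def roc_auc(labels, scores):
    """ROC-AUC as P(random positive outranks random negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample cases with replacement, recompute AUC."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:  # skip resamples that contain only one class
            continue
        aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    k = int(alpha / 2 * n_boot)  # number of resamples trimmed from each tail
    return aucs[k], aucs[n_boot - 1 - k]

# Hypothetical toy records: one culture per row with a model score attached.
toy = [
    {"date": date(2016, 3, 1), "resistant": 0, "score": 0.10},
    {"date": date(2019, 6, 1), "resistant": 1, "score": 0.80},
    {"date": date(2017, 1, 1), "resistant": 1, "score": 0.35},
    {"date": date(2018, 9, 1), "resistant": 0, "score": 0.40},
]
train, test = time_based_split(toy)
labels = [r["resistant"] for r in toy]
scores = [r["score"] for r in toy]
print(len(train), len(test), roc_auc(labels, scores))
print(bootstrap_auc_ci(labels, scores, n_boot=200))
```

In practice the AUC and its interval would be computed only on the held-out (later) 25% of records; the toy example scores all four rows simply to keep the data small.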
Key Findings
The ensemble model consistently outperformed all base learners. For the bacteria-agnostic dataset, the ensemble achieved an ROC-AUC of 0.737 (95% CI 0.715–0.758) on the independent test set. The bacteria-gnostic dataset yielded a higher ROC-AUC of 0.837 (95% CI 0.821–0.854). Both models were well-calibrated. SHAP analysis identified key influential variables. In the agnostic dataset, these included previous ciprofloxacin resistance, patient origin (hospital, nursing home, etc.), and recent hospital-wide antibiotic resistance in similar units. In the gnostic dataset, key variables were average resistance of the same bacterial species, prior fluoroquinolone-resistant infections, the bacterial species (*P. aeruginosa* being influential due to the chosen reference species), and prior resistance to non-ciprofloxacin antibiotics. Decision curve analysis showed that, across a range of cost-benefit ratios, using the model predictions offered a net benefit over assuming that all infections were resistant or that all were susceptible.
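To make the decision curve comparison concrete, the sketch below uses the standard net-benefit formula from decision curve analysis: true positives per case minus false positives per case, with false positives weighted by the odds of the threshold probability. The function names and toy numbers are hypothetical, and treating "predicted resistant" as the positive action is an assumption of this example, not a detail taken from the study.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of acting on cases whose predicted probability >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    # The threshold odds encode the cost-benefit ratio of a false positive
    # relative to a true positive.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)

def net_benefit_assume_all_resistant(labels, threshold):
    """Reference strategy: treat every infection as resistant."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Assuming all infections are susceptible flags no one, so its net benefit is 0.
NET_BENEFIT_ASSUME_ALL_SUSCEPTIBLE = 0.0

# Hypothetical toy data: 1 = resistant, with made-up predicted probabilities.
labels = [1, 1, 0, 0, 0]
probs = [0.9, 0.6, 0.7, 0.2, 0.1]
for t in (0.25, 0.5, 0.75):
    print(t, net_benefit(labels, probs, t),
          net_benefit_assume_all_resistant(labels, t))
```

Plotting these quantities over a grid of thresholds gives the decision curve; the model is useful at a given threshold when its net benefit exceeds both reference strategies.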
Discussion
The high predictive performance and good calibration of the ensemble ML models offer a significant advancement in predicting ciprofloxacin resistance in hospitalized patients. The inclusion of hospital-wide resistance patterns, in addition to individual patient data, proved crucial for improved accuracy. The models' ability to perform effectively in both bacteria-agnostic and -gnostic settings is valuable, as often the infecting species is initially unknown. The identified influential variables align with existing clinical knowledge and literature, enhancing the models' credibility and interpretability. The decision curve analysis demonstrates the clinical utility of these models across various cost-benefit scenarios, suggesting potential for practical application in guiding antibiotic stewardship and reducing unnecessary ciprofloxacin use.
Conclusion
This study demonstrates the feasibility and clinical relevance of using machine learning to predict ciprofloxacin resistance in hospitalized patients. The high predictive accuracy, well-calibrated probabilities, and clinically relevant influential variables highlight the potential of incorporating these models into clinical decision-support systems. Future research could focus on expanding the dataset to include additional patient characteristics, community-level antibiotic resistance data, and longitudinal data to further refine model performance and generalizability. Further research is also needed to explore the optimal integration of these models into clinical workflows and evaluate their impact on patient outcomes and antibiotic resistance rates.
Limitations
The study's limitations include the potential lack of generalizability to other hospitals or time periods due to variations in patient demographics, antibiotic use, and AMR patterns. The data lacked community-level information on antibiotic consumption, which may influence resistance. The model's performance might vary across different age groups, especially younger patients. The study relies on retrospective data, and prospective validation is needed. Finally, while SHAP values provide valuable insights into variable importance, they do not definitively establish causal relationships.