Introduction
Antimicrobial resistance (AMR) is a significant global health threat: it undermines the effectiveness of antibiotic therapies and leads to treatment failures. The development of new antibiotics is hampered by high costs and limited support, and continued reliance on existing broad-spectrum antibiotics further exacerbates AMR. Accurate alignment of antibiotic therapy with pathogen susceptibility is therefore crucial, ideally at the initiation of empirical treatment.
This study proposes interpretable machine learning (ML) methods to predict AMR for urinary tract infections (UTIs), one of the most common bacterial infections worldwide. UTIs are caused by various pathogens, many of which can be carried asymptomatically and are frequently exposed to antibiotics, which contributes to high recurrence rates and multidrug-resistant strains. Current empirical treatment often lacks insight into pathogen susceptibility, increasing the risk of ineffective treatment.
Prior ML studies have demonstrated improved prediction of resistance using electronic health record (EHR) data, including demographic information, prior antibiotic exposures, microbiology data, lab values, and comorbidities. Logistic regression and gradient boosting decision trees have proven effective, but neural network-based architectures remain underexplored despite their potential benefits for larger datasets. Deep neural networks, though powerful, often lack interpretability, which is crucial for clinical decision-making. Recent advances in attention-based models, such as the TabNet architecture, offer improved interpretability for tabular data.
This study aims to expand on previous work by evaluating ML-based prediction of AMR for potentially complicated UTIs and by comparing the performance and interpretability of three interpretable ML architectures, including TabNet. The focus is on potentially complicated UTIs because of their higher risk of treatment failure and adverse outcomes. Interpretable ML is chosen to support clinical utility and integration into healthcare practice.
The primary objective is to predict antibiotic resistance in order to assist clinicians, not to determine whether antibiotic therapy is needed or which type to use. Clinicians must therefore still assess treatment suitability independently for each patient.
Literature Review
Several studies have explored the use of machine learning to predict antibiotic resistance in UTIs. Yelin et al. (2019) demonstrated that logistic regression and gradient boosting decision trees could effectively predict resistance to six different antibiotics using demographic data, microbiology sample history, and antibiotic purchase history. Their algorithm-suggested drug recommendations reduced the rate of mismatched treatments. Kanjilal et al. (2020) used EHR data to predict antibiotic resistance for uncomplicated UTIs, achieving AUROCs between 0.56 and 0.64 across four antibiotics, outperforming clinicians. Because these studies primarily used logistic regression and gradient boosting trees, the effectiveness of neural network-based architectures remained largely unexplored. Deep neural networks have shown promise on image and text data but are less explored for tabular data because of issues with overparameterization and interpretability. Ensemble-based decision trees typically outperform deep learning on tabular data and are also easier to interpret. However, deep neural networks offer benefits such as improved performance on large datasets and the ability to use transfer learning and self-/semi-supervised learning. Attention-based models, such as TabNet, are specifically designed for interpretable learning from tabular data, dynamically selecting relevant features at each prediction step.
Methodology
The study used the AMR-UTI dataset, which contains records for over 80,000 patients with UTIs from Massachusetts General Hospital (MGH) and Brigham & Women's Hospital (BWH), collected between 2007 and 2016. The analysis focused on patients with potentially complicated UTIs (101,096 samples), including specimens tested for nitrofurantoin (NIT), co-trimoxazole (SXT), ciprofloxacin (CIP), or levofloxacin (LVX). The feature set included antimicrobial susceptibility profiles, previous specimen features, and basic patient characteristics. Categorical variables were one-hot encoded, giving 787 features in total. Missing values were not imputed separately: the dataset's existing binary representation was kept, in which 1 marks the presence of a feature and 0 marks its absence, with missing data also encoded as 0.
Temporal evaluation was used: models were trained on data from 2007-2013 and tested on data from 2014-2016. Of the training data, 90% was used for model development and 10% for validation and threshold adjustment. The models were also evaluated on an independent uncomplicated UTI cohort (15,806 specimens) from Kanjilal et al. (2020) to assess generalizability. Logistic regression (LR), XGBoost, and TabNet models (with and without self-supervised pre-training) were trained for each antibiotic. Hyperparameter optimization was performed using five-fold cross-validation and grid search. Threshold adjustment was applied to address class imbalance, with the decision threshold chosen to balance sensitivity and specificity.
Evaluation metrics included sensitivity, specificity, AUROC, AUPRC, PPV, and F1-score, with 95% confidence intervals calculated from 1000 bootstrapped samples. Feature importance was assessed using coefficients (LR) and importance scores (XGBoost, TabNet). The impact of removing feature sets (prior antibiotic resistance, prior antibiotic exposure, prior infecting organisms) on model performance was also evaluated. An additional experiment assessed the effect of excluding ethnicity/race as a feature in the best-performing model (XGBoost).
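The temporal evaluation and threshold adjustment described above can be illustrated with a minimal sketch. It is not the study's code: the `Specimen` record, the function names, and the toy scores are all hypothetical, and choosing the threshold that minimises the gap between sensitivity and specificity is only one plausible reading of "optimizing for balanced sensitivity and specificity". For brevity the toy training years stand in for the 10% validation split.

```python
from dataclasses import dataclass


@dataclass
class Specimen:
    year: int        # collection year of the specimen
    score: float     # hypothetical model-predicted probability of resistance
    resistant: int   # 1 = resistant, 0 = susceptible


def temporal_split(specimens, last_train_year=2013):
    """Train on specimens up to last_train_year, test on later ones."""
    train = [s for s in specimens if s.year <= last_train_year]
    test = [s for s in specimens if s.year > last_train_year]
    return train, test


def sensitivity_specificity(specimens, threshold):
    """Sensitivity and specificity when score >= threshold means 'resistant'."""
    tp = sum(1 for s in specimens if s.resistant == 1 and s.score >= threshold)
    fn = sum(1 for s in specimens if s.resistant == 1 and s.score < threshold)
    tn = sum(1 for s in specimens if s.resistant == 0 and s.score < threshold)
    fp = sum(1 for s in specimens if s.resistant == 0 and s.score >= threshold)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, spec


def balanced_threshold(validation):
    """Pick the observed score that minimises |sensitivity - specificity|."""
    candidates = sorted({s.score for s in validation})

    def gap(threshold):
        sens, spec = sensitivity_specificity(validation, threshold)
        return abs(sens - spec)

    return min(candidates, key=gap)


# Toy data (invented for illustration only).
specimens = [
    Specimen(2010, 0.10, 0), Specimen(2011, 0.20, 0), Specimen(2012, 0.30, 0),
    Specimen(2012, 0.35, 1), Specimen(2013, 0.40, 0), Specimen(2013, 0.50, 1),
    Specimen(2013, 0.60, 1), Specimen(2015, 0.25, 0), Specimen(2016, 0.70, 1),
]
train, test = temporal_split(specimens)
threshold = balanced_threshold(train)
print("chosen threshold:", threshold)
print("held-out sensitivity/specificity:", sensitivity_specificity(test, threshold))
```

Fixing the threshold on earlier data and then applying it unchanged to the later years mirrors the temporal setup: the test period never influences the decision rule.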
Key Findings
The study found that XGBoost and TabNet models generally outperformed logistic regression, indicating the presence of non-linear trends and interactions. XGBoost achieved the best performance across all antibiotics in terms of AUROC and AUPRC, followed by TabNet with self-supervised learning (TabNetself). Predictive performance was higher for second-line antibiotics (CIP and LVX) than for first-line antibiotics (NIT and SXT). Temporal validation showed consistent performance across the four antibiotics. On the uncomplicated UTI cohort, AUROC and AUPRC scores were lower than those for the complicated UTI cohort but comparable to those reported by Kanjilal et al. (2020). Excluding ethnicity/race as a feature did not significantly affect the performance of the XGBoost model. Feature importance analysis revealed that prior antibiotic resistance and exposure were the most important predictors of resistance, followed by other antibiotic exposures (including fluoroquinolones, cephalosporins, penicillins), previous UTI history, and comorbidities (paralysis, renal). Removing prior antibiotic resistance significantly decreased AUPRC scores, indicating its crucial role in prediction. Removing prior antibiotic exposure also significantly decreased AUPRC, but less so than removing prior resistance. Removing prior infecting organism features slightly decreased AUPRC, but the decrease was not statistically significant in most cases.
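To make the reported AUROC scores and their 95% confidence intervals from 1000 bootstrapped samples more concrete, here is a minimal stdlib-only sketch. The function names, the pairwise AUROC formulation, the percentile interval, and the choice to redraw single-class resamples are assumptions made for illustration; the summary does not state which bootstrap variant the study used.

```python
import random


def auroc(labels, scores):
    """AUROC as the chance that a random positive outranks a random negative.

    Ties count as one half. Runs in O(n_pos * n_neg), fine for small examples.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC (resampling with replacement)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # redraw resamples that contain only one class
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[round((alpha / 2) * n_boot)]
    upper = stats[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy labels and scores (invented for illustration only).
labels = [0, 0, 1, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.2, 0.8, 0.7]
print("AUROC:", auroc(labels, scores))
print("95% CI:", bootstrap_ci(labels, scores))
```

With so few toy specimens the interval is very wide; on cohorts of the size described in this summary, the same procedure yields far tighter bounds.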
Discussion
The study's findings demonstrate the potential of interpretable machine learning models for predicting antibiotic resistance in complicated UTIs. The superior performance of XGBoost and TabNet over logistic regression points to non-linear relationships between features and resistance. XGBoost's advantage may stem from its ensemble architecture, which can improve generalization. Although XGBoost outperformed TabNet in this specific context, TabNet's capacity for transfer learning offers advantages for updating models over time. The better performance on the complicated UTI cohort than on the uncomplicated cohort might be attributed to increased hospital exposure and associated factors, which make susceptibility easier to predict. Despite being trained on the complicated cohort, the models performed comparably to previous studies that used only uncomplicated UTI data, potentially because of the larger training dataset. The study acknowledges limitations such as the use of threshold adjustment to address class imbalance and the dataset's incomplete representation of EHR data (missing symptoms, treatment details, lifestyle factors). Threshold adjustment may affect model generalizability across different datasets. The study also highlights the importance of considering how ethnicity/race is captured in data and integrated into machine learning algorithms to prevent bias. Future studies could explore multilabel classification, incorporate additional features, and address the challenge of applying ML predictions in stable patient cases where waiting for culture results may be sufficient.
Conclusion
This study demonstrates the feasibility and potential benefits of using interpretable machine learning models to predict antibiotic resistance in complicated UTIs. The findings highlight the importance of prior antibiotic resistance and exposure in predicting resistance. Future research should focus on refining model development and validation, incorporating more comprehensive EHR data, and investigating the clinical utility of these models in different settings. Investigating multilabel classification and addressing the issue of class imbalance are also important avenues for future work.
Limitations
The study acknowledges several limitations. The binary classification approach might oversimplify the clinical reality of intermediate resistance. The AMR-UTI dataset lacks patient symptoms, treatment details, and lifestyle factors, which could potentially improve predictive accuracy. The binary categorization of ethnicity/race might oversimplify a complex issue and potentially introduce bias. The study focused on a specific patient cohort and hospital system, which limits the generalizability of the results to other settings. The threshold adjustment used to address class imbalance may also affect the generalizability of the model. Finally, the study did not explicitly address when rapid prediction might be clinically relevant in stable versus unstable patient cases.