Introduction
Type 2 diabetes mellitus (T2D) is a prevalent and costly disease, affecting approximately 10% of the US population and 415 million people worldwide in 2015. A significant portion of these cases remain undiagnosed, increasing the risk of serious complications. Current screening methods, primarily relying on fasting blood glucose (FBG) and/or hemoglobin A1c (HbA1c) levels, are recommended for adults aged 35 to 70 years who are overweight or obese. However, the prevalence of undiagnosed diabetes remains unchanged, particularly among older, obese adults, racial/ethnic minorities, and those with limited healthcare access. These underserved populations present a significant opportunity for improved detection methods. Chest radiographs (CXRs) are a ubiquitous and readily available diagnostic tool, with over 26 million radiographs reimbursed by Medicare in 2017. Body Mass Index (BMI), while commonly used, is a flawed predictor of T2D, failing to account for the crucial role of fat distribution, specifically visceral fat. The "thin-fat phenotype" observed in some populations further complicates accurate T2D prediction using BMI alone. Therefore, there is a need for additional, easily accessible indicators or predictors of T2D. Deep learning (DL) methodologies offer a powerful approach to disease detection, enabling the extraction of advanced biomarkers from existing data. Prior research has successfully used DL with abdominal computed tomography (CT) and CXRs to predict various health outcomes, including metabolic syndrome, healthcare expenses, and comorbidities. This study aims to leverage the widespread availability of CXRs and the power of DL to develop a model that can opportunistically detect T2D from readily accessible ambulatory frontal CXRs, thereby enhancing screening efforts and potentially improving early diagnosis and treatment in high-risk populations.
Literature Review
Existing literature highlights the substantial burden of T2D globally and the limitations of current screening strategies. Studies such as Xu et al. (2018) have quantified the prevalence of T2D in the US, while the International Diabetes Federation has provided estimates for global prevalence. The economic consequences of T2D are considerable, as detailed in the American Diabetes Association's report on the economic costs of diabetes in the US (2018). The limitations of BMI as a sole predictor of T2D have been discussed by Ahima and Lazar (2013), emphasizing the importance of considering fat distribution and other factors. Research by Gastaldelli et al. (2002) has underscored the metabolic effects of visceral fat accumulation in T2D. The challenges posed by the thin-fat phenotype in certain populations have been addressed in studies like Kurpad et al. (2011). Furthermore, the potential of DL in medical imaging for disease detection and prediction has been explored in numerous studies, including those utilizing abdominal CT (Pickhardt et al., 2021) and CXRs (Sohn et al., 2022; Pyrros et al., 2022a, 2022b). These studies demonstrate the feasibility and potential of DL for extracting relevant biomarkers from medical images to predict various health outcomes.
Methodology
This study employed a deep learning (DL) model to detect type 2 diabetes (T2D) from ambulatory frontal chest radiographs (CXRs) combined with electronic health record (EHR) data. The dataset comprised 271,065 CXRs from 160,244 unique patients (2010-2021) for model development, with a prospective test cohort of 9,943 CXRs (2022) and an external validation cohort of 5,026 CXRs from Emory University (2019-2020). The flowchart (Fig. 1) details patient selection and exclusion criteria. Patients with type 1 diabetes and gestational diabetes were excluded. A multitask DL model was developed using a ResNet34 convolutional neural network (CNN). The model was trained on a subset of the development dataset (90% for training, 10% for validation). The training data included CXRs, age, sex, BMI, HbA1c, race/ethnicity, language preference, and a social deprivation index (SDI) derived from zip codes. Data augmentation techniques (random flips, rotations, perspective distortion, brightness/contrast adjustments) were applied to enhance model robustness. The model used binary cross-entropy loss for disease classifications and mean squared error for continuous variables (age, HbA1c, BMI, and risk adjustment factor). The model was trained for 23 epochs using the AdamW optimizer. Model performance was evaluated using area under the receiver operating characteristic curve (ROC AUC), sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1 score. Youden's index was used to determine optimal classification thresholds. Internal validation was conducted using 5-fold cross-validation on the development dataset. Explainable AI techniques, including occlusion maps and "gifsplanation" (using an autoencoder), were employed to identify image features predictive of T2D. Logistic regression (LR) models were also developed to compare the performance of the DL model with and without CXR data.
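To make the multitask objective concrete, the sketch below combines binary cross-entropy for a classification head with mean squared error for regression heads, using NumPy only. It is an illustration under assumptions, not the study's implementation: the summary does not say how the per-task losses were weighted or summed, so the plain sum and the `reg_weight` parameter are assumed, and the task names (`t2d`, `hba1c`, `age`) and all values are hypothetical toy data.

```python
import numpy as np


def binary_cross_entropy(y_true, p_pred, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    y = np.asarray(y_true, dtype=float)
    # Clip probabilities away from 0 and 1 so the logarithms stay finite.
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1.0 - eps)
    return float(np.mean(-(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))))


def mean_squared_error(y_true, y_pred):
    """Mean squared error between continuous targets and predictions."""
    y = np.asarray(y_true, dtype=float)
    y_hat = np.asarray(y_pred, dtype=float)
    return float(np.mean((y - y_hat) ** 2))


def multitask_loss(cls_targets, cls_probs, reg_targets, reg_preds, reg_weight=1.0):
    """Sum BCE over classification tasks and weighted MSE over regression tasks.

    Each argument is a dict keyed by task name. The unweighted sum and
    reg_weight are assumptions; the source does not describe the weighting.
    """
    cls_term = sum(binary_cross_entropy(cls_targets[k], cls_probs[k]) for k in cls_targets)
    reg_term = sum(mean_squared_error(reg_targets[k], reg_preds[k]) for k in reg_targets)
    return cls_term + reg_weight * reg_term


# Hypothetical toy batch of four radiographs (synthetic values, not study data).
cls_targets = {"t2d": [1, 0, 1, 0]}
cls_probs = {"t2d": [0.9, 0.2, 0.7, 0.1]}
reg_targets = {"hba1c": [7.1, 5.4, 6.8, 5.2], "age": [61.0, 44.0, 58.0, 39.0]}
reg_preds = {"hba1c": [6.8, 5.6, 6.5, 5.3], "age": [59.0, 47.0, 60.0, 41.0]}

print(multitask_loss(cls_targets, cls_probs, reg_targets, reg_preds, reg_weight=0.1))
```

In practice the regression targets are often standardized before computing the squared error so that a variable with a large numeric range, such as age, does not dominate the total; whether the study did this is not stated in the summary.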
A subgroup analysis was conducted to evaluate model equity across different demographic groups. Time-dependent ROC curves were generated to assess performance over time in the retrospective cohort. The incidence rate of T2D was also calculated. Finally, the linear relationship between predicted HbA1c values from the DL model and actual HbA1c values was analyzed using linear regression.
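The threshold selection and evaluation metrics named above can be sketched as follows. Youden's index is J = sensitivity + specificity - 1, and the chosen threshold is the score that maximizes J. The code is a minimal NumPy illustration: the function names are hypothetical, the labels and scores are synthetic, and breaking ties in favor of the lowest threshold is a choice made here rather than something the source specifies.

```python
import numpy as np


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def youden_threshold(y_true, scores):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A sample is predicted positive when its score is >= threshold.
    Ties in J are resolved in favor of the lowest threshold.
    """
    y = np.asarray(y_true).astype(bool)
    s = np.asarray(scores, dtype=float)
    best_t, best_j = None, -np.inf
    for t in np.unique(s):  # candidate thresholds, in ascending order
        pred = s >= t
        sens = _ratio(np.sum(pred & y), np.sum(y))
        spec = _ratio(np.sum(~pred & ~y), np.sum(~y))
        j = sens + spec - 1.0
        if j > best_j:
            best_t, best_j = float(t), float(j)
    return best_t, best_j


def confusion_metrics(y_true, scores, threshold):
    """Sensitivity, specificity, PPV, NPV, and F1 at a fixed threshold."""
    y = np.asarray(y_true).astype(bool)
    pred = np.asarray(scores, dtype=float) >= threshold
    tp = int(np.sum(pred & y))
    fp = int(np.sum(pred & ~y))
    fn = int(np.sum(~pred & y))
    tn = int(np.sum(~pred & ~y))
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "f1": _ratio(2 * tp, 2 * tp + fp + fn),
    }


# Synthetic labels (1 = T2D) and model scores, for illustration only.
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.35, 0.4, 0.6, 0.7, 0.9]

threshold, j_stat = youden_threshold(y_true, scores)
print(threshold, j_stat)
print(confusion_metrics(y_true, scores, threshold))
```

Youden's index weights sensitivity and specificity equally. A screening deployment might instead prefer a threshold that favors sensitivity, which is a design choice outside what the summary reports.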
Key Findings
The DL model demonstrated excellent performance in detecting T2D from CXRs, achieving a ROC AUC of 0.84 (95% CI: 0.83, 0.85) in the prospective test cohort and 0.77 in the external validation cohort. In the prospective cohort, the model flagged 14% (1381/9943) of patients without a prior T2D diagnosis or HbA1c measurement as high risk for T2D, representing potential screening opportunities. The model's performance was consistent across subgroups and significantly outperformed a clinical LR model without CXR data in multiple scenarios (Table 1). The AUC was 0.84 for all cases of T2D vs. all controls, 0.85 for poorly controlled T2D vs. all others, 0.77 for T2D with BMI < 25 and age 35-70 years vs. no T2D, and 0.80 for T2D with BMI ≥ 25 and age 35-70 years vs. no T2D. Subgroup analysis showed no significant differences in performance across race/ethnicity, suggesting a lack of bias; however, there was a statistically significant difference between male and female patients (P=0.045). Explainable AI techniques revealed that the model's predictions were primarily driven by features related to central adiposity (mediastinal lipomatosis) and attenuation of the ribs and clavicles (Figures 4, 5, 6, Supplementary Movie 1). The retrospective cohort showed a T2D incidence rate of 5.1 cases per 1000 person-years, with the model correctly identifying 71% of those patients. Time-dependent ROC curves demonstrated consistent performance over time. The model also showed a moderate ability to predict HbA1c from CXR data (R²=0.15, p<0.001, Figure 7).
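As an aid to reading the AUC values above, ROC AUC can be interpreted as the probability that a randomly chosen T2D case receives a higher model score than a randomly chosen control, with ties counted as one half. The sketch below computes it that way by direct pairwise comparison. It uses synthetic labels and scores, the function name is hypothetical, and the pairwise approach is only practical for small samples. The final lines recompute the flagged fraction from the counts reported in the text.

```python
import numpy as np


def roc_auc(y_true, scores):
    """ROC AUC as the fraction of (case, control) pairs ranked correctly.

    Tied pairs count as one half. Builds a full pairwise matrix, so this is
    meant for small illustrative samples only.
    """
    y = np.asarray(y_true).astype(bool)
    s = np.asarray(scores, dtype=float)
    pos, neg = s[y], s[~y]
    diff = pos[:, None] - neg[None, :]  # every case score minus every control score
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(wins / (pos.size * neg.size))


# Synthetic labels (1 = T2D) and model scores, for illustration only.
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.35, 0.4, 0.6, 0.7, 0.9]
print(roc_auc(y_true, scores))

# Flagged fraction in the prospective cohort, from the counts given in the text.
flagged_fraction = 1381 / 9943
print(f"{100 * flagged_fraction:.1f}% flagged as high risk")
```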
Discussion
This study demonstrates the potential of using DL models to detect T2D from readily available CXRs, augmenting traditional screening methods. The high AUC values achieved across multiple cohorts indicate robust model performance. The identification of a significant number of previously unscreened high-risk individuals in the prospective cohort highlights the model's potential for opportunistic screening. The explainability analysis provides valuable insights into the model's decision-making process, linking predictions to clinically relevant features like central adiposity. This finding is particularly important as it suggests the model's capacity to identify high-risk individuals, including those with normal BMI. The model's ability to predict HbA1c, while moderate, further suggests its value as a screening tool. The consistent performance across different time periods and in an external validation cohort suggests generalizability. These findings, coupled with the widespread availability of CXRs, position this DL-based approach as a valuable tool for enhancing population-level screening efforts and facilitating earlier intervention for T2D.
Conclusion
This study presents a novel deep learning model for the opportunistic detection of T2D using readily available CXRs. The model demonstrates robust performance and identifies a substantial number of high-risk individuals previously unscreened, highlighting its potential for population-level T2D screening. Future research should focus on larger-scale validation, integration into clinical workflows, and investigation of the model's long-term predictive capability.
Limitations
The study has several limitations. The absence of FBG and other glucometry data may have affected the model's accuracy. The retrospective nature of the study and the unavailability of HbA1c data for many patients could limit the generalizability of the findings. Only ambulatory CXRs were used, excluding portable CXRs and those with support devices. The relatively small size and short duration of the external validation cohort also limit the extent to which the generalizability of the model can be assessed. Finally, multi-year follow-up data from the prospective cohort are not yet available. Future work should address these limitations to further validate and refine the model before widespread clinical implementation.