Introduction
Primary malignant central nervous system (CNS) tumors, predominantly diffuse gliomas, represent a significant health concern. Gliomas, encompassing WHO grades II-IV, exhibit substantial heterogeneity in biological behavior and survival outcomes. This heterogeneity poses challenges for accurate prognosis and personalized treatment strategies. Traditional methods such as nomograms and regression models often struggle with the complexity and high dimensionality of glioma data. Machine learning (ML) offers advantages by handling large, complex datasets, incorporating diverse variables (clinical, molecular, imaging), and identifying non-linear relationships. This study aims to leverage ML algorithms to develop a user-friendly web application for predicting survival outcomes in patients with WHO grade II and III gliomas, ultimately improving patient care and informing clinical decision-making. The goal is to address the complexities of glioma prognosis and enhance the precision of survival predictions, moving beyond generalized risk assessments towards personalized care.
Literature Review
Existing literature on glioma survival prediction uses a range of methods. Some studies employ algorithms such as Cox proportional hazards models, support vector machines, and random forests, achieving varying degrees of accuracy but often lacking accessibility for clinicians. Other studies have developed nomograms from clinicopathologic variables and offer online calculators, but are limited by small sample sizes and restricted generalizability. Attempts have also been made to incorporate imaging (radiomics) and genomic data into predictive models. Radiomics-based models, while promising, may not integrate seamlessly into clinical practice, and integrated models combining radiomics and clinical variables require specific inputs that may not be readily available in all settings. Genomic models face a similar barrier: the authors note that although numerous studies propose gene signatures as predictive markers of survival in glioma patients, genomic profiling is infrequently incorporated into standard glioma care, so these signatures remain mostly feasibility studies and are seldom adopted in real-world clinical settings. This study aims to overcome these limitations by developing a readily accessible web application based on a large dataset and interpretable models.
Methodology
This retrospective study used data from the 2020 version of the National Cancer Database (NCDB), a large, prospectively maintained database that collects data from more than 1,500 Commission on Cancer (CoC)-accredited institutions. The study population consisted of adult patients (≥18 years) diagnosed with histologically confirmed cranial WHO grade II and III gliomas between 2010 and 2017. Predictor variables included sociodemographics, clinical presentation, diagnostic information, molecular markers (where available), and treatment modalities. Five supervised machine learning algorithms were employed: TabPFN, TabNet, XGBoost, LightGBM, and Random Forest. Hyperparameter optimization was performed using Optuna. Data were split into training (60%), validation (20%), and testing (20%) sets. The Synthetic Minority Over-sampling Technique (SMOTE) was used to address class imbalance in the training sets. Model performance was evaluated using AUROC, AUPRC, sensitivity, specificity, accuracy, and the Brier score. Receiver operating characteristic (ROC) and precision-recall (PRC) curves were generated for visual assessment. SHAP values were used for model interpretability, providing both global and local explanations of feature importance. The top-performing models (based on AUROC) were integrated into a user-friendly web application accessible at https://huggingface.co/spaces/MSHS-Neurosurgery-Research/G2G3-Glioma. The study followed TRIPOD and JMIR guidelines for reporting machine learning predictive models.
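The 60/20/20 split and training-only rebalancing described above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names `stratified_split` and `oversample_minority` are hypothetical, the labels are toy data, stratification by outcome is an assumption (the summary does not say whether the split was stratified), and plain random duplication stands in for SMOTE, which instead synthesizes new minority rows by interpolating between neighbors.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=0):
    """Return (train, val, test) index lists, keeping class ratios in each split."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def oversample_minority(indices, labels, seed=0):
    """Duplicate minority-class rows at random until both classes are equal.

    Assumption: a simple stand-in for SMOTE, used only to show that
    rebalancing touches the training indices and nothing else.
    """
    rng = random.Random(seed)
    pos = [i for i in indices if labels[i] == 1]
    neg = [i for i in indices if labels[i] == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return indices + extra


# Toy labels (not study data): 1 = died within the horizon, 0 = alive.
labels = [1] * 200 + [0] * 800
train_idx, val_idx, test_idx = stratified_split(labels)
balanced_train_idx = oversample_minority(train_idx, labels)
```

Rebalancing only the training indices leaves the validation and test sets at their original class ratio, so the reported metrics reflect the real outcome prevalence.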
Key Findings
The analysis of 21,457 patients (10,001 with grade II and 11,456 with grade III gliomas) showed that LightGBM and Random Forest models consistently outperformed the other algorithms in predicting 12-, 24-, 36-, and 60-month mortality. For grade II gliomas, AUROC values ranged from 0.813 to 0.896, while for grade III gliomas, AUROC values ranged from 0.855 to 0.878. These results indicate good discriminatory ability (AUROC > 0.8) in distinguishing patients who died within the specified intervals from those who survived. Table 2 presents detailed performance metrics for the top-performing models, including sensitivity, specificity, accuracy, AUPRC, and Brier score. SHAP analysis identified age as the most important predictor variable across most outcomes, followed by histology and extent of resection. The web application accepts patient characteristics as input and generates individualized survival probability predictions at different time points. Figures 1-4 display ROC curves, precision-recall curves, radar plots comparing model performance across multiple metrics, and SHAP bar plots illustrating feature importance, respectively. Supplementary materials provide additional model performance metrics, confusion matrices, and partial dependence plots.
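To make the reported metrics concrete, the sketch below computes AUROC, the Brier score, and sensitivity/specificity from first principles on a toy set of predictions. The data are invented for illustration and are not study results; the function names and the 0.5 decision threshold are assumptions, since the summary does not state which threshold was used.

```python
def auroc(y_true, y_prob):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared error between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def sensitivity_specificity(y_true, y_prob, threshold=0.5):
    """Sensitivity and specificity when predicting 'died' for probabilities >= threshold."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < threshold)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy predictions (not study data): 1 = died within the horizon.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

toy_auroc = auroc(y_true, y_prob)            # 14 of 15 positive/negative pairs ranked correctly
toy_brier = brier_score(y_true, y_prob)
toy_sens, toy_spec = sensitivity_specificity(y_true, y_prob)
```

AUROC depends only on how patients are ranked, whereas the Brier score also penalizes poorly calibrated probabilities, which is why the two are reported together.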
Discussion
This study demonstrates that ML models can provide clinically useful, individualized survival predictions for patients with WHO grade II and III gliomas. The high AUROC values achieved across multiple time points reflect the models' discriminatory performance and their potential to inform clinical decision-making. The readily accessible web application represents a step toward integrating predictive analytics into neuro-oncology practice. These predictions can improve shared decision-making with patients, facilitate risk stratification, and guide personalized management strategies. The models compare favorably with previous approaches, particularly those limited by small sample sizes or by required inputs that are hard to obtain, such as deep radiomic signatures or specific genomic data. The use of SHAP explanations enhances transparency and trust in the models' predictions, allowing clinicians to integrate the models' output with their clinical expertise.
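The idea behind a local SHAP explanation can be illustrated by computing exact Shapley values for a very small model. This is a sketch under stated assumptions, not the study's method: the summary does not say which SHAP explainer was used, `toy_risk` and its coefficients are arbitrary made-up numbers with no clinical meaning, and a single reference patient stands in for the background dataset that SHAP would normally average over.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley attribution of model(instance) - model(baseline) to each feature."""
    names = list(instance)
    n = len(names)

    def value(subset):
        # Features in `subset` take the patient's value; the rest stay at baseline.
        mixed = {k: (instance[k] if k in subset else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                chosen = set(subset)
                total += weight * (value(chosen | {name}) - value(chosen))
        phi[name] = total
    return phi


def toy_risk(x):
    """Hypothetical risk score with arbitrary coefficients (illustration only)."""
    return (0.01 * x["age"] + 0.2 * x["grade_iii"]
            + 0.005 * x["age"] * x["grade_iii"]
            - 0.1 * x["gross_total_resection"])


patient = {"age": 60, "grade_iii": 1, "gross_total_resection": 0}
reference = {"age": 40, "grade_iii": 0, "gross_total_resection": 1}
phi = shapley_values(toy_risk, patient, reference)
```

The attributions sum to the gap between the patient's score and the reference score, which is the property that lets a clinician read each value as that feature's share of the individual prediction. Exact enumeration grows exponentially with the number of features, so practical explainers rely on approximations or model-specific algorithms.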
Conclusion
This research developed and internally validated machine learning models for predicting survival in WHO grade II and III gliomas and integrated them into a user-friendly web application. The models are intended to provide more precise, individualized survival predictions than previously available methods. Future work should focus on external validation of these models, assessment of their clinical impact, and incorporation of additional data such as IDH mutation status and more detailed treatment information to further refine predictive accuracy and clinical utility.
Limitations
This study's limitations stem primarily from the inherent nature of retrospective database analysis. The limited availability of molecular markers (e.g., IDH status), detailed clinical data (e.g., precise extent of resection, imaging data), and information on subsequent treatments may affect model performance and generalizability. The NCDB, while extensive, captures only approximately 70% of all cancer diagnoses in the US and is limited to CoC-accredited hospitals, potentially introducing selection bias. The reliance on all-cause mortality rather than disease-specific mortality is another consideration. External validation is needed to strengthen the study's conclusions.