Discriminating pseudoprogression and true progression in diffuse infiltrating glioma using multi-parametric MRI data through deep learning

Medicine and Health

J. Lee, N. Wang, et al.

This groundbreaking study reveals a new approach to distinguish between pseudoprogression and true tumor progression in diffuse infiltrating gliomas using a CNN-LSTM deep learning model and multiparametric MRI data. Conducted by leading experts at the University of Michigan, these findings pave the way for improved diagnostic performance and timely treatment decisions.

Introduction
Diffuse infiltrating gliomas (astrocytic or oligodendroglial; WHO grades 2–4) commonly progress, and distinguishing true progressive disease (PD) from pseudoprogression (PsP) or treatment-related changes is crucial for management. PsP, defined by RANO as early post-radiotherapy new/enlarged enhancement without true PD, can mimic progression on MRI, typically occurring within 3 months after therapy. Standard diagnosis via biopsy is invasive and limited by sampling, while conventional MRI often requires weeks of follow-up to differentiate PsP from PD. The study hypothesizes that a CNN-LSTM deep learning model using multiparametric MRI as a spatial sequence input will outperform a conventional single-sequence CNN (VGG16) in discriminating PsP from true PD in diffuse infiltrating gliomas.
Literature Review
Prior work to differentiate PsP from PD includes advanced imaging (e.g., 18F-FET PET, DSC perfusion rCBV thresholds, diffusion/perfusion MRI), MR spectroscopy, IVIM, and voxelwise ADC analyses, which show variable performance and practical limitations in routine use. Machine learning and radiomics with multiparametric MRI have also been explored, including GLCM texture analysis, radiomic models incorporating diffusion/perfusion features, and SVM-based multiparametric approaches. Deep learning applications remain fewer because of limited data; Jang et al. applied a CNN-LSTM to temporal post-contrast T1 slices combined with clinical features, achieving AUC ≈ 0.83 and demonstrating the added value of clinical data. CNNs excel at static image feature extraction, while RNN/LSTM models capture sequential dependencies. This study instead treats multiparametric MRI modalities acquired at a single time point as a spatial sequence for a CNN-LSTM, differing from temporal-sequence and fusion strategies and aiming to leverage cross-modality correlations.
Methodology
Study design: Retrospective single-center study of 43 biopsy-proven diffuse infiltrating glioma patients (WHO grade 3 or 4 at progression) who underwent gross total resection followed by chemoradiation and multiple follow-up MRIs (2010–2018). IRB approval was obtained with consent waived.

Cohort and labels: Histology confirmed outcomes: 7 PsP and 36 true PD. PsP was defined histopathologically as >90% treatment effect in the resected tissue (<10% viable, minimally proliferative tumor).

Imaging data: For each patient, the baseline MRI closest to the follow-up operation confirming PsP or PD was used. Five native sequences were included: pre-contrast T1-weighted, post-contrast T1-weighted, T2-weighted FSE, FLAIR, and the ADC map. Two engineered sequences were created: (1) T1post − T1pre (enhancement map) and (2) T2 − FLAIR (highlighting fluid versus solid tissue). Images were intensity-normalized using White-Stripe normalization in R.

Preprocessing and VGG16 baseline: Grayscale images were resized to 224×224 and stacked to 3 channels (224×224×3) for compatibility with ImageNet-pretrained VGG16. Transfer learning froze all layers except the last four, which were fine-tuned. Data augmentation (rescaling, rotation, shift, shear, horizontal flip) was applied during training, and each modality was trained independently as a single-sequence input model.

CNN-LSTM model: In the proposed multiparametric spatial-sequence approach, three input sets were prepared per patient: (a) 3 modalities (T1 pre, T1 post, T2); (b) 5 modalities (adding FLAIR and ADC); (c) 7 modalities (further adding T1post−T1pre and T2−FLAIR). For each sequence element, a CNN feature extractor applied 2D convolutional layers (2×2 kernels; 64, 128, and 256 filters), each followed by 2×2 max pooling and batch normalization; outputs were flattened.
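The preprocessing steps described above can be sketched in a few lines of NumPy. This is an illustrative approximation, not the authors' pipeline: the paper used White-Stripe normalization in R (approximated here by a simple z-score), and the nearest-neighbour resize stands in for a proper image resampler.

```python
import numpy as np

def normalize(img):
    """Simplified intensity normalization (z-score). The study used
    White-Stripe normalization in R; this is only a stand-in."""
    return (img - img.mean()) / (img.std() + 1e-8)

def resize_nn(img, size=224):
    """Nearest-neighbour resize to size x size (stand-in for a proper
    image-resampling routine)."""
    h, w = img.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[np.ix_(rows, cols)]

def to_vgg_input(img):
    """Resize a grayscale slice and stack it to 3 channels (224x224x3)
    for compatibility with an ImageNet-pretrained VGG16."""
    r = resize_nn(normalize(img))
    return np.stack([r, r, r], axis=-1)

# Engineered difference sequences derived from the native modalities
rng = np.random.default_rng(0)
t1_pre, t1_post = rng.random((256, 256)), rng.random((256, 256))
t2, flair = rng.random((256, 256)), rng.random((256, 256))
enhancement = t1_post - t1_pre   # T1post - T1pre (enhancement map)
fluid = t2 - flair               # T2 - FLAIR (fluid vs solid tissue)

x = to_vgg_input(enhancement)
print(x.shape)  # (224, 224, 3)
```

The three identical channels simply satisfy VGG16's RGB input contract; the informative variation remains per-slice.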
The flattened vectors for the ordered set of modalities were fed sequentially into an LSTM layer (24 units), followed by a dense layer for binary classification (PsP vs true PD). Binary cross-entropy loss and a stochastic gradient descent optimizer were used, implemented in Python (Keras 2.2.4, TensorFlow 1.12.0).

Training and evaluation: Threefold cross-validation was performed. For VGG16, each modality was trained and evaluated separately, reporting accuracy and AUC with 95% CI on the test folds. For the CNN-LSTM, models were trained with the 3-, 5-, and 7-modality spatial sequences; accuracy, ROC curves, and AUC with 95% CI were reported per test fold and summarized. The number of epochs to best performance grew with sequence length: ~50 (3 modalities), ~200 (5 modalities), ~500 (7 modalities).

Hardware: Intel i9 3.50 GHz 12-core CPU, 128 GB RAM, NVIDIA GeForce RTX 2080 Ti GPU.
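The core idea, treating modalities acquired at one time point as an ordered spatial sequence rather than a temporal one, comes down to how the input tensor is shaped before it reaches the CNN feature extractor and LSTM. A minimal NumPy sketch of that input shaping (the actual model was built in Keras 2.2.4 / TensorFlow 1.12.0; the names below are illustrative):

```python
import numpy as np

# Order matters: the LSTM consumes the modalities as an ordered sequence,
# one "time step" per modality, all from a single imaging time point.
MODALITIES_7 = ["t1_pre", "t1_post", "t2", "flair", "adc",
                "t1post_minus_t1pre", "t2_minus_flair"]

def make_spatial_sequence(scans, order):
    """Stack per-modality 2D slices into one (steps, H, W, 1) array,
    the shape a per-step CNN extractor followed by an LSTM would consume.
    `scans` maps modality name -> 2D slice."""
    seq = np.stack([scans[m] for m in order], axis=0)
    return seq[..., np.newaxis]  # add a single grayscale channel

rng = np.random.default_rng(1)
scans = {m: rng.random((224, 224)) for m in MODALITIES_7}

seq7 = make_spatial_sequence(scans, MODALITIES_7)        # 7-modality input
seq3 = make_spatial_sequence(scans, MODALITIES_7[:3])    # 3-modality input
print(seq7.shape, seq3.shape)  # (7, 224, 224, 1) (3, 224, 224, 1)
```

Each step of the sequence would then pass through the shared CNN extractor (2×2 kernels; 64, 128, 256 filters with pooling and batch normalization), and the flattened features feed the 24-unit LSTM in the order given.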
Key Findings
- VGG16 single-sequence models (mean over 3-fold CV): accuracy 0.44–0.60; AUC 0.47–0.59. By modality: T1 pre (Acc 0.51, AUC 0.53 [0.47–0.64]); T1 post (0.48, 0.49 [0.39–0.59]); T2 (0.44, 0.51 [0.41–0.62]); FLAIR (0.55, 0.55 [0.44–0.67]); ADC (0.48, 0.47 [0.38–0.57]); T1post−T1pre (0.60, 0.59 [0.49–0.70]); T2−FLAIR (0.58, 0.54 [0.42–0.66]). Overall mean accuracy 0.52 and mean AUC 0.53.
- CNN-LSTM multiparametric spatial sequence models (mean over 3-fold CV):
  • 3 modalities (T1 pre, T1 post, T2): accuracy 0.62; AUC 0.64 [0.51–0.77]
  • 5 modalities (add FLAIR, ADC): accuracy 0.70; AUC 0.69 [0.59–0.79]
  • 7 modalities (add T1post−T1pre, T2−FLAIR): accuracy 0.75; AUC 0.81 [0.72–0.88]
- The multiparametric CNN-LSTM outperformed single-sequence VGG16 across metrics; performance improved with more modalities, peaking at AUC 0.81 for the 7-modality input.
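The AUCs above are standard areas under the ROC curve. As a self-contained illustration (not the authors' evaluation code), AUC can be computed directly from classifier scores via the Mann–Whitney U statistic:

```python
import numpy as np

def auc_mann_whitney(y_true, scores):
    """AUC via the Mann-Whitney U statistic: the probability that a
    randomly chosen positive case scores higher than a randomly chosen
    negative case, with ties counting half. This equals the area under
    the ROC curve."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

y = [0, 0, 1, 1]            # ground-truth labels (1 = true PD, say)
s = [0.1, 0.4, 0.35, 0.8]   # model scores
print(auc_mann_whitney(y, s))  # 0.75
```

In a k-fold setup like the paper's, this would be applied to each test fold's scores and the fold AUCs summarized (e.g., mean with a 95% CI).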
Discussion
The study addresses the clinical challenge of distinguishing PsP from true PD in diffuse gliomas, where conventional imaging and invasive biopsy have limitations. By treating multiparametric MRI acquired at a single time point as a spatial sequence, the CNN-LSTM can model inter-sequence dependencies, leveraging complementary tissue contrasts and engineered difference maps to enhance discrimination. Results demonstrate superior performance of the CNN-LSTM approach compared with a widely used transfer learning CNN (VGG16) on individual sequences, particularly benefiting from inclusion of engineered modalities (T1post−T1pre and T2−FLAIR). This aligns with prior evidence that multi-modal information and sequential modeling (e.g., Jang et al.) can improve classification, while avoiding reliance on temporal imaging or clinical features. The method may be advantageous in small medical datasets by aggregating signal across modalities, thus improving robustness. Increasing the number of modalities improved AUC but at the cost of longer training times, suggesting a trade-off that future architecture and training optimizations could mitigate.
Conclusion
This work introduces a CNN-LSTM framework that ingests multiparametric MRI at a single time point as a spatial sequence, including engineered difference sequences, to discriminate PsP from true PD in diffuse infiltrating gliomas. The approach clearly outperforms single-sequence CNN baselines, reaching accuracy 0.75 and AUC 0.81 with seven modalities, and highlights the value of engineered difference images and cross-modality correlation learning. Future work should validate the method on larger, multi-institutional cohorts, explore incorporating clinical and molecular features, optimize the architecture for efficiency, and assess generalizability and prospective clinical utility.
Limitations
- Small and imbalanced dataset (n=43; PsP=7, true PD=36) limits generalizability and contributed to the weak VGG16 single-sequence performance.
- Retrospective single-institution design; potential selection bias.
- Training cost grew with the number of modalities (~50 epochs for 3 modalities; ~200 for 5; ~500 for 7), indicating an efficiency trade-off and a need for model optimization.
- Evaluation relied on threefold cross-validation; no external validation was performed.