Introduction
Aphasia, a language-processing disorder most often caused by stroke affecting the language-dominant hemisphere, affects approximately 30% of stroke survivors. Although substantial spontaneous recovery occurs, chronic language impairments persist in up to 60% of patients, substantially reducing quality of life. Current predictive models, which incorporate lesion characteristics, acute aphasia severity, and age, account for only about 50% of the variance in chronic aphasia severity, suggesting that unidentified neurobiological factors are involved. Existing research indicates that post-stroke aphasia severity depends not only on lesion location and size but also on the integrity of brain regions beyond the lesion. These spared regions, including language-specific areas and contralateral homotopic regions, play a crucial role in language recovery. However, assessing the spatial distribution of brain atrophy, which often follows distinct patterns across neurological conditions, has been challenging. This study addresses that gap by asking whether deep learning methods can better predict chronic aphasia severity by leveraging spatially dependent multivariate features in three-dimensional brain images. It tests two hypotheses: 1) a convolutional neural network (CNN) applied to brain lesions and whole-brain tissue will outperform standard multivariate machine learning methods in predicting aphasia severity; and 2) the 3D CNN will exploit spatially dependent neuroanatomical information extending beyond the lesion, identifying subtle patterns of atrophy that are crucial to understanding stroke recovery and aphasia progression.
Literature Review
The literature highlights the importance of spared brain regions beyond the lesion in predicting aphasia severity. Studies have associated post-stroke aphasia with atrophy in specific regions such as the inferior frontal gyrus, thalamus, and cingulate gyrus, as well as in regions distributed across several lobes. Preservation of right-hemisphere volumes, particularly in the temporal gyrus and supplementary motor areas, has been linked to better language outcomes. However, existing methods for assessing the spatial distribution of brain atrophy have limitations. Deep learning methods, particularly CNNs, offer a potential solution by extracting spatially dependent multivariate features from 3D images. Although CNNs have been used in stroke research primarily for lesion segmentation, their potential for outcome prediction, particularly in chronic aphasia, has been less explored. Previous CNN studies in stroke have shown that these models can outperform classical machine learning methods in predicting outcomes such as disability and motor impairment, suggesting that accounting for spatial dependencies in whole-brain neuroimaging data could likewise improve prediction of aphasia severity.
Methodology
This cross-sectional study retrospectively analyzed data from 213 individuals with chronic left-hemisphere strokes (mean years post-stroke = 3.2 ± 3.7). Aphasia severity was assessed using the Western Aphasia Battery-Revised (WAB-R) aphasia quotient (AQ). MRI data (T1- and T2-weighted scans) were preprocessed as follows: lesions were manually segmented from T2-weighted images; lesion masks were resampled to the T1-weighted images and refined; anatomical deformation during normalization was minimized with enantiomorphic healing, which replaces lesioned tissue with its contralateral homologue; tissue segmentations (gray matter, white matter, CSF) were generated using FAST; and images were normalized to MNI152 template space using FNIRT. Data were downsampled to 8 mm voxel size, cropped, and scaled. WAB-R AQ scores were categorized into severe (AQ < 50) and non-severe (AQ ≥ 50) groups for binary classification. A nested cross-validation scheme with 20 repeats, stratified by WAB-R aphasia categories, was used for model evaluation. The primary models were a 3D CNN (VGG-style architecture) and a support vector machine (SVM). The CNN was tuned for network complexity, dropout frequency, learning rate, and L2-norm regularization; the SVM was tuned for kernel type, gamma, cost, and dimensionality reduction (PCA, ICA). Model fusion strategies included averaging predictions, stacking with linear discriminant analysis (LDA), and feeding CNN-learned features to the SVM. Model performance was assessed using precision, F1 score, balanced accuracy, and per-class accuracies. Feature saliency maps (Grad-CAM++, deep SHAP) were used to investigate the patterns driving model predictions, and consensus clustering was used to explore heterogeneity in the patterns learned by the CNN.
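To make the modeling pipeline concrete, the sketch below shows a minimal VGG-style 3D CNN of the kind described above, written in PyTorch. The block depth, filter counts, and the assumed 24×28×24 input grid are illustrative choices, not the study's exact configuration; the dropout rate, learning rate, network depth, and L2 penalty (weight_decay) correspond to the hyperparameters tuned in the inner cross-validation loop.

import torch
import torch.nn as nn

class Severity3DCNN(nn.Module):
    # Minimal VGG-style 3D CNN for binary severity classification.
    # Block depth, filter counts, and the assumed 24x28x24 input grid
    # (8 mm voxels, cropped MNI space) are illustrative, not the paper's
    # exact configuration.
    def __init__(self, in_channels=1, dropout=0.5):
        super().__init__()

        def block(c_in, c_out):
            return nn.Sequential(
                nn.Conv3d(c_in, c_out, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv3d(c_out, c_out, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.MaxPool3d(2),
            )

        self.features = nn.Sequential(
            block(in_channels, 8),
            block(8, 16),
            block(16, 32),
        )
        # A 24x28x24 input shrinks to 3x3x3 feature maps after three pooling steps.
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(dropout),
            nn.Linear(32 * 3 * 3 * 3, 64),
            nn.ReLU(inplace=True),
            nn.Dropout(dropout),
            nn.Linear(64, 2),  # severe (AQ < 50) vs. non-severe (AQ >= 50)
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# Learning rate, dropout, depth, and the L2 penalty (weight_decay) are the
# kinds of hyperparameters tuned in the inner cross-validation loop.
model = Severity3DCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)

A companion sketch below shows a nested, stratified cross-validation loop for the SVM baseline using scikit-learn. The fold counts, parameter grid, and file names are placeholder assumptions; the study itself used 20 repeats stratified by WAB-R aphasia categories and additionally tuned PCA/ICA dimensionality reduction.

import numpy as np
from sklearn.model_selection import StratifiedKFold, GridSearchCV
from sklearn.svm import SVC
from sklearn.metrics import balanced_accuracy_score, f1_score

# X: flattened image features (n_subjects x n_features); y: 1 = severe, 0 = non-severe.
# File names are hypothetical placeholders.
X = np.load("features.npy")
y = np.load("labels.npy")

param_grid = {"kernel": ["linear", "rbf"], "C": [0.1, 1, 10], "gamma": ["scale", 0.01]}
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

scores = []
for train_idx, test_idx in outer.split(X, y):
    # Inner loop selects kernel, cost (C), and gamma on the training folds only.
    inner = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    search = GridSearchCV(SVC(), param_grid, cv=inner, scoring="balanced_accuracy")
    search.fit(X[train_idx], y[train_idx])
    pred = search.predict(X[test_idx])
    scores.append((balanced_accuracy_score(y[test_idx], pred),
                   f1_score(y[test_idx], pred)))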
Key Findings
The CNN significantly outperformed the SVM in predicting severe aphasia, achieving a median balanced accuracy of 0.77 and a median F1 score of 0.7. The SVM's performance did not improve significantly with dimensionality reduction (PCA, ICA), and model fusion techniques (averaging predictions, stacking) did not substantially improve on the CNN's performance. Feature saliency maps revealed that the CNN attended to both ipsilateral and contralateral regions outside the lesion, capturing complex morphometry patterns not detected by the SVM; the SVM, in contrast, focused primarily on the lesion itself. Consensus clustering of the CNN saliency maps identified distinct morphometry patterns that were unrelated to lesion size, consistent across individuals, and implicated networks associated with both language-related and domain-general cognitive processes. Analysis of these clusters showed that different language subsystems and domain-general regions contributed to the prediction of aphasia severity in different subgroups, with the right hemisphere being particularly important for predicting severe aphasia. ROI analysis showed that Grad-CAM++ saliency was significantly higher in all left-hemisphere regions for non-severe predictions and in all right-hemisphere regions for severe predictions. The SVM showed higher feature importance within the lesion for severe predictions but higher importance in perilesional and extralesional ROIs for non-severe predictions.
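For readers who want a concrete picture of the ROI-level saliency comparison described above, the following is a minimal sketch assuming nibabel/NumPy and a labelled parcellation resampled to the same grid as the saliency maps. The file names, atlas, and label range are hypothetical placeholders, not the study's actual data or parcellation.

import numpy as np
import nibabel as nib

def mean_roi_saliency(saliency_path, atlas_path, roi_labels):
    # Average a saliency map (e.g., Grad-CAM++) within each atlas ROI.
    # Both images must share the same voxel grid; the atlas and label list
    # used here are placeholders, not the study's actual parcellation.
    sal = nib.load(saliency_path).get_fdata()
    atlas = nib.load(atlas_path).get_fdata()
    out = {}
    for label in roi_labels:
        mask = atlas == label
        out[label] = float(sal[mask].mean()) if mask.any() else np.nan
    return out

# Hypothetical usage: compare group-mean saliency for predicted-severe vs.
# predicted-non-severe cases across a labelled parcellation.
severe = mean_roi_saliency("gradcam_severe_mean.nii.gz",
                           "parcellation_mni_8mm.nii.gz", range(1, 91))
nonsevere = mean_roi_saliency("gradcam_nonsevere_mean.nii.gz",
                              "parcellation_mni_8mm.nii.gz", range(1, 91))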
Discussion
The superior performance of the CNN in predicting aphasia severity highlights the importance of considering three-dimensional morphometry patterns beyond the lesion. The fact that the CNN identifies features outside the lesion, and that SVM performance does not improve with dimensionality reduction, underscores the need for models sensitive to complex spatial relationships in brain structure. The finding that SVM performance improves significantly when the model is trained on CNN-derived saliency maps strongly supports the hypothesis that the CNN identifies uniquely informative patterns of brain morphology that classical models cannot access. The identified morphometry patterns implicate not only language-specific regions but also domain-general networks involved in attention, working memory, and higher-level processes such as decision making, suggesting that aphasia severity reflects widespread effects of stroke injury on global brain function. The heterogeneity of these patterns suggests that individual differences in how these effects interact substantially influence aphasia recovery.
Conclusion
This study demonstrates that CNNs can effectively predict post-stroke aphasia severity by leveraging three-dimensional morphometry patterns, exceeding the capabilities of classical machine learning methods. The identified morphometry patterns extend beyond canonical language areas, highlighting the importance of considering broader brain networks in understanding aphasia. Future research should focus on longitudinal studies with larger, more diverse datasets to further refine predictive models and explore the clinical implications of these findings, potentially leading to improved patient care and targeted interventions.
Limitations
The study's limitations include its cross-sectional design, which precludes causal inferences about the relationship between morphometry patterns and aphasia severity. The relatively small sample size, particularly for the severe aphasia group, may limit the generalizability of the findings. Manual lesion segmentation could introduce inter-rater variability. Finally, downsampling the data to keep the CNN analysis computationally tractable may have discarded information, limiting conclusions about the relative importance of spatial properties for model performance.