Introduction
Alzheimer's disease (AD) is a progressive neurodegenerative disorder, and mild cognitive impairment (MCI) is an early stage with a high likelihood of progression to AD. Established diagnostic tools such as PET, MRI, and CSF biomarkers are expensive, invasive, or both, which limits their widespread clinical use, so there is a need for cost-effective and readily available diagnostic biomarkers. Machine learning (ML), particularly deep learning (DL), has shown promise in AD diagnosis, and fully supervised DL methods are the most common. However, these methods rely on large amounts of labeled data, which are expensive and difficult to obtain. Neuropsychological tests are inexpensive, non-invasive, and widely available, and so offer a potential alternative. Previous studies have applied machine learning methods such as SVM to neuropsychological data, but advanced neural networks have rarely been applied to AD diagnosis with these tests. This paper addresses that gap by proposing a semi-supervised learning approach that reduces the dependence on labeled data and uses easily accessible neuropsychological test data to improve the efficiency and accuracy of AD and MCI diagnosis. The aim is to help identify non-invasive, reliable, and widely available diagnostic biomarkers for AD and MCI.
Literature Review
Existing research on AD diagnosis using machine learning has primarily focused on supervised deep learning methods utilizing medical images (e.g., fMRI, MRI) and other complex biomarkers. Convolutional Neural Networks (CNNs) and Graph Convolutional Networks (GCNs) have demonstrated excellent performance. However, these methods require large, labeled datasets, which are costly to create due to the need for expert labeling. The authors review several existing studies using deep learning for AD diagnosis and highlight the limitations of fully supervised approaches due to data scarcity. Studies using neuropsychological tests and machine learning for AD classification are also reviewed, showcasing the potential of these tests while noting the absence of widespread neural network application in this area. The paper argues that semi-supervised learning techniques can help address the data scarcity issue, and it explores relevant semi-supervised learning techniques such as consistency regularization, pseudo-labeling, label propagation, and contrastive learning to set the stage for their proposed method.
Methodology
The study uses data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database, specifically baseline neuropsychological data from 819 subjects (188 AD, 402 MCI, 229 NC). First, feature selection with Pearson's correlation coefficient (PCC) identifies the 15 test items most strongly correlated with the diagnostic outcome, out of 64 items across seven neuropsychological tests (ADAS-Cog, MMSE, CDR, RAVLT, FAQ, NPIQ, GDS). The core of the methodology is the proposed Dual Semi-Supervised Learning (DSSL) framework. DSSL employs two encoders to learn distinct feature representations from the same input sample, and a difference regularization term maximizes the divergence between the two representations. The output of each encoder is fed into its own Multilayer Perceptron (MLP) to generate predictions. The predictions of each branch serve as pseudo-labels for the other, which forms the consistency regularization component. A confidence threshold (τ) ensures that only high-confidence pseudo-labels contribute to the loss. The overall loss function combines a supervised loss (on labeled data), the consistency regularization loss (between the two branches' predictions), and the difference regularization loss. The model is trained using the Adam optimizer, with an exponential moving average of the model parameters used to update the teacher model. The optimal hyperparameters (the confidence threshold τ and the difference regularization weight β) were determined via 5-fold cross-validation, together with additional experiments that varied the number of labeled samples and the difference regularization coefficient. Performance is evaluated using accuracy, sensitivity (equivalently, recall), specificity, and F1-score. Stability was assessed by repeating the experiments 100 times with different random selections of labeled samples.
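The three-part loss described above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the summary does not give the exact form of the difference regularization term, so a squared cosine similarity penalty is assumed here; the consistency term is averaged over the whole unlabeled batch as in FixMatch-style methods; and all function names, along with the default `beta=1.0`, are hypothetical. Encoders, MLP heads, the optimizer, and the exponential moving average update are omitted, so the sketch only shows how the terms would be combined once logits and features are available.

```python
import numpy as np

EPS = 1e-12  # numerical guard for log and division


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def cross_entropy(probs, labels):
    """Mean negative log-likelihood of integer class labels."""
    rows = np.arange(len(labels))
    return float(-np.log(probs[rows, labels] + EPS).mean())


def cross_pseudo_label_loss(source_probs, target_probs, tau):
    """One branch's confident predictions act as hard pseudo-labels for the other.

    Samples whose top probability is below tau are masked out. The sum is
    divided by the full batch size (an assumption, following FixMatch).
    """
    confidence = source_probs.max(axis=1)
    pseudo = source_probs.argmax(axis=1)
    mask = confidence >= tau
    rows = np.arange(len(pseudo))
    per_sample = -np.log(target_probs[rows, pseudo] + EPS)
    return float((per_sample * mask).sum() / max(len(pseudo), 1))


def difference_penalty(feat_a, feat_b):
    """ASSUMED form: mean squared cosine similarity between the two encodings.

    Minimizing this pushes the representations apart; the paper's actual
    difference regularization term may be defined differently.
    """
    a = feat_a / (np.linalg.norm(feat_a, axis=1, keepdims=True) + EPS)
    b = feat_b / (np.linalg.norm(feat_b, axis=1, keepdims=True) + EPS)
    cos = (a * b).sum(axis=1)
    return float((cos ** 2).mean())


def dssl_style_loss(logits_a_lab, logits_b_lab, labels,
                    logits_a_unl, logits_b_unl,
                    feat_a, feat_b, tau=0.9, beta=1.0):
    """Supervised + cross pseudo-label consistency + beta * difference penalty."""
    pa_l, pb_l = softmax(logits_a_lab), softmax(logits_b_lab)
    pa_u, pb_u = softmax(logits_a_unl), softmax(logits_b_unl)
    supervised = cross_entropy(pa_l, labels) + cross_entropy(pb_l, labels)
    consistency = (cross_pseudo_label_loss(pa_u, pb_u, tau)
                   + cross_pseudo_label_loss(pb_u, pa_u, tau))
    return supervised + consistency + beta * difference_penalty(feat_a, feat_b)
```

Using hard pseudo-labels with a mask keeps low-confidence predictions from contributing to the loss, which is the role the summary attributes to τ. A soft-label or per-class threshold variant would be an equally plausible reading of the description.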
Key Findings
DSSL outperformed other semi-supervised methods (MixMatch, FixMatch, SimPLE, CCSSL, LaSSL) in classifying AD, MCI, and NC. With 60 labeled samples, DSSL achieved 85.47% accuracy, compared with 77.29% to 81.44% for the baselines. With 120 labeled samples, its accuracy rose to 88.40%, again above the baselines (82.42% to 85.10%). The study also showed the importance of using different encoders in DSSL: using identical encoders reduced accuracy. DSSL was also more stable than the other methods, with lower variance in accuracy across 100 repeated experiments with different labeled sample selections. Analysis of the confidence threshold (τ) showed that a value of 0.9 balanced the number of pseudo-labels used against the accuracy of those labels. Feature selection revealed that total scores from several neuropsychological tests (CDR, MMSE, ADAS, and FAQ) were strongly correlated with the degree of cognitive impairment, suggesting their clinical utility. Shapley value analysis of the DSSL predictions showed that the relative importance of individual features differed between the two encoders, which suggests that the encoders learn complementary feature representations. Training time remained practical for clinical application, at less than 3 minutes on a standard PC.
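The correlation-based feature ranking behind these findings can be illustrated with a short NumPy sketch. It rests on assumptions the summary does not confirm: the diagnosis is taken to be coded ordinally (for example NC = 0, MCI = 1, AD = 2), items are ranked by the absolute value of Pearson's r, and the function name `select_top_k_by_pearson` is hypothetical.

```python
import numpy as np


def select_top_k_by_pearson(X, y, k=15):
    """Return indices and r values of the k columns of X most correlated with y.

    X: (n_samples, n_items) matrix of test-item scores.
    y: (n_samples,) numeric diagnosis code (ASSUMED ordinal coding).
    Ranking uses |r|, so strongly negative items rank as high as positive ones.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)          # center each item
    yc = y - y.mean()                # center the outcome
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    with np.errstate(divide="ignore", invalid="ignore"):
        r = (Xc * yc[:, None]).sum(axis=0) / denom
    r = np.nan_to_num(r)             # constant items get r = 0
    order = np.argsort(-np.abs(r), kind="stable")[:k]
    return order, r[order]
```

With the 64-item matrix described in the methodology, `k=15` would reproduce the size of the selected subset, though not necessarily the same items if the authors coded the outcome or handled ties differently.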
Discussion
The superior performance and stability of DSSL demonstrate its potential as a valuable clinical tool for AD and MCI diagnosis. The use of readily available neuropsychological tests makes it a cost-effective and accessible alternative to more expensive and invasive methods. The findings highlight the effectiveness of the dual encoder architecture and difference regularization in improving the model's ability to learn robust features for classification. The study also emphasizes the importance of selecting an appropriate confidence threshold to balance the quantity and quality of pseudo-labels used in semi-supervised learning. Future research can explore further refinements to the DSSL algorithm, such as using more advanced techniques for automatic selection of the confidence threshold and the development of more sophisticated methods for integrating data from multiple neuropsychological tests. Further investigations into the clinical interpretation of the learned features can potentially advance our understanding of the neuropsychological mechanisms underlying AD.
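The quantity-versus-quality trade-off behind the confidence threshold can be made concrete with a small NumPy sketch. It assumes that true labels are available for the samples being scored, which is only the case on held-out labeled data used for analysis; the function name `pseudo_label_tradeoff` is hypothetical and this is not the authors' analysis code.

```python
import numpy as np


def pseudo_label_tradeoff(probs, true_labels, thresholds):
    """For each threshold, report how many pseudo-labels survive and how many are right.

    probs: (n_samples, n_classes) predicted class probabilities.
    true_labels: (n_samples,) integer labels, used only for this diagnostic.
    Returns a list of (tau, coverage, accuracy) tuples; accuracy is NaN when
    no sample passes the threshold.
    """
    probs = np.asarray(probs, dtype=float)
    true_labels = np.asarray(true_labels)
    confidence = probs.max(axis=1)
    predicted = probs.argmax(axis=1)
    rows = []
    for tau in thresholds:
        keep = confidence >= tau
        coverage = float(keep.mean())
        if keep.any():
            accuracy = float((predicted[keep] == true_labels[keep]).mean())
        else:
            accuracy = float("nan")
        rows.append((tau, coverage, accuracy))
    return rows
```

Raising τ typically lowers coverage while raising pseudo-label accuracy, so scanning a grid of thresholds with a function like this is one simple way to look for the balance point the study reports.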
Conclusion
This study introduces a novel dual semi-supervised learning framework, DSSL, for classifying AD, MCI, and NC using neuropsychological data. DSSL demonstrates superior accuracy and stability compared to existing semi-supervised methods, highlighting its potential for cost-effective and accessible clinical applications. Future research could investigate the integration of DSSL with other multimodal data (MRI, PET) and explore the medical interpretation of learned features to gain further insights into AD pathology.
Limitations
The study's reliance on the ADNI database may limit the generalizability of the findings to other populations. The optimal confidence threshold (τ) was determined empirically and may not be universally applicable. Further research is needed to investigate the generalizability and robustness of the proposed model across different datasets and populations.