Introduction
Neurodegenerative disorders, such as Alzheimer's disease (AD), frontotemporal dementia (FTD), Parkinson's disease (PD), dementia with Lewy bodies (DLB), vascular dementia (VD), and mixed dementias, present significant clinical heterogeneity and diagnostic challenges. The increasing prevalence of dementia, projected to triple by 2050, underscores the urgent need for improved diagnostic tools and treatment options. Current research is hampered by this heterogeneity, overlapping clinical and pathological features, and complex comorbidity patterns. Discrepancies between clinical and post-mortem diagnoses are frequent, with up to one-third of dementia cases misdiagnosed. This study addresses these limitations by developing a data-driven approach using a large autopsy cohort to delineate clinical disease trajectories.
Literature Review
Existing research highlights the challenges in diagnosing and studying neurodegenerative disorders due to their heterogeneity and overlapping clinical features. While some studies have attempted to integrate clinical diagnoses, symptoms, or temporal profiles, a comprehensive combination of these approaches has been lacking. Brain banks offer valuable post-mortem brain tissue, but often provide limited clinical information, hindering the inclusion of key clinical parameters in statistical analyses. Most studies rely on binary case-control designs, ignoring phenotypic diversity. This study aims to overcome these limitations by utilizing the extensive medical record summaries available from the Netherlands Brain Bank (NBB).
Methodology
The researchers developed a computational pipeline that uses natural language processing (NLP) to convert unstructured medical record summaries from the NBB into standardized clinical disease trajectories. The pipeline has three steps: parsing NBB donor files, defining and predicting attributes in clinical histories, and using the resulting trajectories for downstream analyses. A total of 3,042 donor files were included. A new cross-disorder clinical categorization system was developed, encompassing 90 neuropsychiatric signs and symptoms across five domains. Several NLP models (bag of words, support vector machine, Bio_ClinicalBERT, PubMedBERT, and T5) were trained and optimized using stratified fivefold cross-validation. The best-performing model, PubMedBERT, was then used to predict signs and symptoms in the full corpus. The downstream analyses comprised enrichment analysis, temporal profiling, survival analysis, and predictive modeling with a gated recurrent unit with decay (GRU-D). Dimensionality reduction and clustering were used to characterize clinical heterogeneity and identify data-driven clinical subtypes.
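To make the stratified fivefold cross-validation mentioned above concrete, the following is a minimal sketch of how such a split can be built. It is not the study's code: the function name `stratified_kfold`, the single-label setup, and the toy labels are all assumptions made for illustration, and real sentence-level symptom labels would typically be multi-label.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=5, seed=0):
    """Return k lists of indices, each with roughly the same label mix.

    Hypothetical helper for illustration; not taken from the study.
    """
    # Group example indices by their label.
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Deal indices round-robin so every fold gets a near-equal share
        # of this label.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Toy data (invented): 20 sentences mention a symptom, 80 do not.
labels = ["symptom"] * 20 + ["no_symptom"] * 80
folds = stratified_kfold(labels, k=5)

for i, test_idx in enumerate(folds):
    # Each fold serves once as the test set; the rest form the training set.
    train_idx = [j for f in folds if f is not test_idx for j in f]
    positives = sum(labels[j] == "symptom" for j in test_idx)
    print(f"fold {i}: train={len(train_idx)} test={len(test_idx)} positives={positives}")
```

Stratifying keeps the proportion of rare labels similar in every fold, which matters when some signs and symptoms occur in only a small fraction of sentences.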
Key Findings
The models reliably identified 84 neuropsychiatric signs and symptoms. Enrichment analysis recovered the expected disease-specific signs and symptoms, such as the enrichment of 'dementia' in AD, FTD, DLB, and VD but not in PD without dementia. It also identified signs and symptoms that differentiated between frequently misdiagnosed disorders: 'paranoia' and 'façade behavior' were unique to AD, whereas 'hearing problem' and 'muscle weakness' were unique to VD. Temporal profiling showed that 'dementia' manifested at a significantly younger age in FTD than in other dementias. Survival analysis revealed shorter survival after the first observation of 'dementia' in VD, PD, or Parkinson's disease dementia (PDD) than in AD or FTD. Analysis of the synucleinopathies (PD, PDD, DLB, and multiple system atrophy (MSA)) showed unique temporal features, suggesting distinct neuropathological processes. Comparison of clinical and neuropathological diagnoses revealed a substantial proportion of misdiagnoses. Predictive modeling with GRU-D accurately diagnosed the most common disorders but performed less well for rarer ones. Dimensionality reduction and clustering identified six main clusters enriched for different types of dementias, PD and related disorders, motor disorders, control donors, and psychiatric disorders. Subclustering revealed data-driven clinical subtypes of dementia, multiple sclerosis (MS), and PD, characterized by distinct symptom profiles and temporal manifestations. The EARLY-DEM cluster showed significant enrichment for the APOE4/4 genotype.
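The enrichment results above rest on asking whether a sign or symptom occurs more often in one diagnostic group than chance would predict. This summary does not state which statistical test the study used, so the sketch below assumes a one-sided Fisher's exact (hypergeometric) test as one common choice. The function name `enrichment_p_value` and all counts are invented for illustration and are not results from the study.

```python
from math import comb


def enrichment_p_value(hits_in_group, group_size, hits_total, cohort_size):
    """One-sided hypergeometric p-value for over-representation.

    Probability of seeing at least `hits_in_group` donors with the sign
    in a group of `group_size`, if the `hits_total` donors with the sign
    were spread at random over a cohort of `cohort_size`.
    Hypothetical helper for illustration; not taken from the study.
    """
    max_hits = min(group_size, hits_total)
    # Number of ways to choose the group from the whole cohort.
    denominator = comb(cohort_size, group_size)
    # Sum the upper tail: k donors with the sign, the rest without.
    tail = sum(
        comb(hits_total, k) * comb(cohort_size - hits_total, group_size - k)
        for k in range(hits_in_group, max_hits + 1)
    )
    return tail / denominator


# Toy counts (invented): 100 donors, 30 of whom show the sign;
# a diagnostic group of 20 donors contains 12 of them.
p = enrichment_p_value(hits_in_group=12, group_size=20,
                       hits_total=30, cohort_size=100)
print(f"one-sided p-value: {p:.4g}")
```

With 90 signs and symptoms tested across many diagnoses, an analysis of this kind would normally also need a multiple-testing correction, which this sketch omits.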
Discussion
This study provides a unique resource for researchers studying neurodegenerative disorders. The findings highlight the significant clinical heterogeneity within and between these disorders and the difficulty of diagnosing them accurately. The data-driven approach provides insight into disease subtypes and temporal disease trajectories. These findings could improve diagnostic accuracy, inform personalized treatment strategies, and ultimately deepen understanding of the complex pathophysiological mechanisms underlying neurodegenerative disorders.
Conclusion
This research established a valuable resource for neurodegenerative disease research by creating clinical disease trajectories from the Netherlands Brain Bank using NLP. The study demonstrated the utility of this dataset for temporal analysis, predictive modeling, and the identification of disease subtypes. This approach has potential for improved diagnosis, personalized medicine, and a deeper understanding of disease mechanisms. Future research should focus on validating these findings in larger, independent cohorts and exploring the molecular and cellular correlates of identified subtypes.
Limitations
The study's reliance on retrospectively collected medical record summaries introduces potential biases and limitations. Missing data, labeling errors, and the possibility of excluding relevant signs and symptoms are acknowledged. The temporal and survival profiles and clustering might be confounded by medical comorbidities and treatments. Variations in neuropathological assessments across time by different pathologists may also introduce bias. The predominantly Dutch/Caucasian cohort limits generalizability to other populations.