A dependency distance approach to the syntactic complexity variation in the connected speech of Alzheimer's disease

Linguistics and Languages

N. Gao and Q. He

This study by Nan Gao and Qingshun He examines how Alzheimer's disease (AD) affects syntactic complexity in connected speech. Using dependency distance as a measure of working memory load, it finds that speakers with AD produce fewer complex structures and favor simpler syntactic forms than matched healthy controls.

Introduction
Alzheimer's disease (AD) is characterized by memory and cognitive deficits and by early language deterioration. Subtle language changes can mark AD and are important for both linguistic research and clinical reference. Prior work shows that AD speakers differ from healthy controls (HC) in lexis and discourse (e.g., reduced lexical output and richness; fewer semantic units; discourse cohesion problems and fewer propositions). On syntax, some studies argued for syntactic "preservation" in AD, suggesting that impairment is more lexical than syntactic and that language remains generally grammatical and coherent. However, growing evidence indicates changes in syntactic complexity in AD: shorter utterances and fewer complex structures (e.g., reduced embedding and subordinate clauses), with difficulties in certain clause types (existential, conjoined, passive, impersonal). Assessment approaches vary (sentence length, frequency counts, ratings), yielding heterogeneous markers and complicating consistent identification of syntactic complexity deficits.
Working memory is crucial for syntactic processing; complex structures require higher capacity. AD-related working memory deficits negatively affect complex syntax production, and interventions that improve working memory can enhance complex language. Thus, working memory may be an effective marker to differentiate AD from HC in syntactic complexity.
This study explores syntactic complexity in AD through dependency syntax, using dependency distance as a holistic metric of working memory load and syntactic complexity, and associating it with fine-grained features (sentence length, adjacent dependency, major grammatical relations) and dependency direction. Using a comparable dependency treebank of AD and matched HC transcripts, the study addresses two questions:
Q1: Can dependency distance and dependency direction detect syntactic complexity variation in AD?
Q2: What are the reasons for the syntactic complexity variation in AD in terms of fine-grained syntactic features?
Literature Review
Connected speech studies consistently show AD-related differences at lexical and discourse levels: shorter lexical output, reduced lexical richness, fewer semantic units, and discourse cohesion/propositional deficits. Early syntactic accounts posited relative preservation of grammar in AD, with more pronounced lexical than syntactic deficits. Nevertheless, numerous studies report syntactic simplification: shorter utterances, reduced frequency of complex/embedded or subordinate structures, and difficulties with certain clause types, arguing syntax is not fully buffered in AD. Findings across studies diverge partly due to varied metrics (mean sentence/utterance length, frequency of structures, rating scales), yielding inconsistent syntactic markers. Given strong links between working memory and syntactic processing, and evidence that working memory deficits in AD impair complex syntax (and that WM interventions can improve it), integrating working memory-sensitive measures with fine-grained syntactic features may provide a more coherent profile. Dependency distance offers such an integrative metric, capturing both processing load and structural complexity, and can be related to sentence length, adjacency, and grammatical relations, as well as dependency direction patterns.
Methodology
Materials: The DementiaBank clinical dataset (Cookie Theft picture description transcripts) was used to build a comparable dependency treebank. The sample included 65 probable AD participants and an equal number of healthy controls (HC), selecting baseline sessions only. AD participants were approximately 49–90 years; HC 46–81 years. Transcripts were cleaned to remove transcript symbols, interjections, single decontextualized words without dependencies, and unrelated information. Utterances were segmented by conversational boundaries and intonation; included units encompassed complete sentences, sentence fragments (e.g., "water overflowing the sink"), aposiopesis, isolated nominal groups, non-finite verb groups, and prepositional phrases.
Parsing and annotation: Cleaned texts (mean length ~100 words per transcript) were parsed with spaCy 3.7.0 (English model) to annotate dependencies and compute measures: mean dependency distance (MDD), dependency direction distributions (head-initial HI%, head-final HF%), percentage of adjacent dependencies (1dd%), mean sentence length (MSL), and MDDs/direction distributions for major grammatical relations (Subject, Object, Attributive, Adverbial) and their dependency types. Two linguistics Ph.D. candidates reviewed all annotations (agreement 95%); disagreements were resolved via discussion or a third expert.
Measures: Dependency distance (linear distance between head and dependent) operationalizes syntactic complexity and processing cost. MDD for sentences/transcripts was computed excluding punctuation and root arcs. Fine-grained factors examined: 1dd% (percentage of dependencies with distance 1), sentence length (MSL), and MDDs of major grammatical relations (fifteen observed types across Subject, Object, Attributive, Adverbial based on Universal Dependencies). Dependency direction (HI vs HF) was calculated as percentages over total dependencies.
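The measures described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Token` class and `dependency_stats` function are hypothetical names, the parse of the fragment "water overflowing the sink" is hand-assigned for illustration rather than spaCy output, and the sketch assumes that "head-final" means the dependent precedes its head and that positions are counted over all tokens in the sentence. With spaCy, the same fields could be read from `token.i`, `token.head.i`, `token.dep_`, and `token.is_punct`.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Token:
    position: int          # 1-based position in the sentence
    head: int              # 1-based position of the head; 0 marks the root
    deprel: str            # dependency label (illustrative)
    is_punct: bool = False


def dependency_stats(sentence: List[Token]) -> Dict[str, float]:
    """Sentence-level MDD, 1dd%, HF% and HI%, excluding root arcs and punctuation."""
    arcs = [t for t in sentence if t.head != 0 and not t.is_punct]
    if not arcs:
        raise ValueError("sentence has no countable dependencies")
    # Dependency distance: absolute linear distance between head and dependent.
    distances = [abs(t.head - t.position) for t in arcs]
    n = len(arcs)
    # Assumption: an arc is head-final when the head follows the dependent.
    head_final = sum(1 for t in arcs if t.head > t.position)
    return {
        "mdd": sum(distances) / n,
        "adjacent_pct": 100 * sum(1 for d in distances if d == 1) / n,
        "hf_pct": 100 * head_final / n,
        "hi_pct": 100 * (n - head_final) / n,
    }


# Hand-assigned, illustrative parse of "water overflowing the sink".
fragment = [
    Token(1, 0, "ROOT"),   # water
    Token(2, 1, "acl"),    # overflowing -> water
    Token(3, 4, "det"),    # the -> sink
    Token(4, 2, "dobj"),   # sink -> overflowing
]
stats = dependency_stats(fragment)
print(stats)
```

In this toy parse the three countable arcs have distances 1, 1, and 2, so MDD is 4/3 and two of the three dependencies are adjacent. How the study aggregated sentence-level values to transcript level is not specified here, so the sketch stops at the sentence.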
Statistical analyses included independent t-tests for group differences (MDD, direction, MSL), linear regressions with interaction terms for MDD~1dd% and MDD~MSL per group, Benjamini-Hochberg FDR correction for multiple comparisons of dependency-type MDDs, and log-likelihood tests for direction frequency differences within grammatical relations.
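As a rough illustration of the Benjamini-Hochberg step mentioned above, the following sketch computes FDR-adjusted p-values in plain Python. The function name `benjamini_hochberg` and the input p-values are hypothetical; they are not the study's data, and the authors' actual software is not stated.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four dependency types (not from the study).
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
print(adj)
```

Because of the monotonicity step, several tests can end up sharing the same adjusted p-value, which is one possible reason corrected results for different dependency types may coincide.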
Key Findings
- Overall syntactic complexity and direction:
  - MDD was significantly lower in AD than HC: AD MDD=2.47 (SD=0.259) vs HC MDD=2.79 (SD=0.264); t=-6.866, p<0.001, Cohen's d=-1.159.
  - Both groups showed higher HF% than HI%, but AD had significantly more head-final dependencies: HF% AD=58.69 (SD=6.733) vs HC=56.21 (SD=5.48); t=2.309, p=0.023, d=0.405 (HI% complementary).
- Adjacent dependencies (1dd%) and MDD:
  - Distributions of 1dd% were generally similar across groups.
  - Negative correlation between 1dd% and MDD in both groups: AD Pearson r=-0.667 (p<0.001), HC r=-0.695 (p<0.001).
  - Linear models: AD MDD = 4.164 - 0.031*1dd% (R²=0.444); HC MDD = 5.098 - 0.043*1dd% (R²=0.482).
  - Regression coefficients: Intercept(AD)=4.164 (p<0.001), Group(HC)=0.934 (p=0.017), 1dd%(AD)=-0.031 (p<0.001), Interaction(HC)=-0.011 (p=0.106).
- Sentence length (MSL) and MDD:
  - AD produced shorter sentences: AD MSL=8.82 (SD=2.584) vs HC MSL=10.07 (SD=3.631); t=-2.24, p=0.026, d=-0.393.
  - Positive correlation between MSL and MDD: AD Pearson r=0.608 (p<0.001), HC r=0.604 (p<0.001).
  - Linear models: AD MDD = 1.940 + 0.061*MSL (R²=0.37); HC MDD = 2.335 + 0.045*MSL (R²≈0.365).
  - Regression coefficients: Intercept(AD)=1.940 (p<0.001), Group(HC)=0.395 (p=0.002), MSL(AD)=0.061 (p<0.001), Interaction(HC)=-0.016 (p=0.197).
- Major grammatical relations (MDD by dependency types; BH-FDR corrected p):
  - Significantly lower MDD in AD for nsubj (AD 1.740 vs HC 1.877; t=-2.822; d=-0.153; p=0.021) and advcl (AD 4.717 vs HC 5.883; t=-2.470; d=-0.451; p=0.037).
  - advmod had higher MDD in AD (AD 1.986 vs HC 1.513; t=3.705; d=0.35; p=0.006).
  - Other types (ccomp, xcomp, relcl) trended toward lower MDD in AD but were not significant after correction (p=0.063).
- Dependency direction within grammatical relations:
  - No significant differences in direction distribution across major relations except Subject-HI (LL=11.17, p<0.001).
  - AD produced significantly fewer pobj in Object-HI (LL=-8.78, p<0.01) and more advmod in Adverbial-HF (LL=4.84, p<0.05).
- Qualitative patterns:
  - AD relies more on simpler structures (e.g., adjectives instead of relative clauses in subjects), fewer prepositional phrases, more sentence-initial adjuncts (advmod), and more inverted constructions (e.g., here is...).
  - AD shows greater difficulty as sentence length increases and performs worse in extending nominal groups and hypotactic constructions.
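The per-group regression lines reported above follow from the interaction-model coefficients: the HC intercept is Intercept(AD) plus Group(HC), and the HC slope is the AD slope plus Interaction(HC). The sketch below works through that arithmetic with the reported values; the function names `group_lines` and `predict` are hypothetical, and small mismatches with the reported lines are presumably due to rounding of the published coefficients.

```python
from typing import Dict, Tuple


def group_lines(intercept_ad: float, group_hc: float,
                slope_ad: float, interaction_hc: float) -> Dict[str, Tuple[float, float]]:
    """Derive (intercept, slope) per group from a model with a group interaction."""
    return {
        "AD": (intercept_ad, slope_ad),
        "HC": (intercept_ad + group_hc, slope_ad + interaction_hc),
    }


def predict(line: Tuple[float, float], x: float) -> float:
    """Evaluate a fitted line at predictor value x."""
    intercept, slope = line
    return intercept + slope * x


# Coefficients as reported in the findings above.
lines_1dd = group_lines(4.164, 0.934, -0.031, -0.011)   # MDD ~ 1dd%
lines_msl = group_lines(1.940, 0.395, 0.061, -0.016)    # MDD ~ MSL

# Predicted MDD at each group's reported mean sentence length.
mdd_ad_at_mean = predict(lines_msl["AD"], 8.82)
mdd_hc_at_mean = predict(lines_msl["HC"], 10.07)
print(lines_1dd, lines_msl, mdd_ad_at_mean, mdd_hc_at_mean)
```

The derived HC slope for 1dd% is -0.042 rather than the reported -0.043, which is within rounding. Evaluating the MSL lines at the reported group means gives roughly 2.48 for AD and 2.79 for HC, close to the reported group MDDs of 2.47 and 2.79.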
Discussion
Lower overall MDD in AD indicates greater working memory limitations and reduced syntactic complexity. Despite this, AD speakers maintain the dependency distance minimization (DDM) tendency, similar to HC, reflected in comparable 1dd% distributions and negative 1dd%-MDD relationships. Differences in regression intercepts suggest overall higher syntactic load in HC. The slopes did not differ significantly between groups (interaction p=0.106), so they are at most suggestive that HC maintain complexity more flexibly while still exploiting adjacency.
AD shows a higher proportion of head-final dependencies. This aligns with a short-before-long preference affecting word order choices: shorter, more accessible constituents are placed earlier, lowering processing cost. Although both groups display higher HF% than HI%, AD's stronger HF tendency likely reflects increased preference for shorter, simpler phrases. Directional differences concentrate in specific relations (e.g., fewer Object-HI pobj; more Adverbial-HF advmod), helping explain overall direction shifts.
Shorter MSL in AD partly explains lower MDD. The numerically steeper MDD increase per unit MSL in AD (0.061 vs 0.045, although the interaction was not significant, p=0.197) is compatible with escalating difficulty managing structure as sentences lengthen. Comparative examples show AD favoring coordination and simple clause chaining, whereas HC condenses information into more integrated nominal groups with non-finite postmodifiers, achieving similar content with lower MDD rises.
Within grammatical relations, AD uses simplified realizations (e.g., adjectival modification instead of relative clauses for subjects; shorter adverbial clauses) and avoids extensive prepositional phrase stacking. Conversely, AD produces more sentence-initial adjuncts (advmod), which can lengthen certain dependencies but may often be non-informative scaffolding. Increased use of inversion (verb–subject) constructions may reflect reduced attention to topic introduction and broader executive deficits (e.g., multitasking, mental tracking).
Overall, results support a continuum view: foundational syntactic abilities are preserved, but structures requiring higher working memory and integration are disproportionately affected.
Conclusion
Using a dependency syntax framework on Cookie Theft descriptions, the study shows that AD differs from HC in both mean dependency distance and dependency direction, evidencing reduced syntactic complexity. AD maintains the universal DDM tendency but prefers simpler structures, produces shorter sentences, and faces greater difficulty as sentences lengthen. Variations in overall MDD and direction are best explained by specific dependency types within major grammatical relations (notably nsubj, advcl, pobj, and advmod). AD speakers preserve basic grammatical competence but underperform in extending nominal groups and hypotactic constructions, consistent with a hierarchical pattern of syntactic decline driven by working memory constraints. The findings refine the linguistic profile of AD and suggest dependency-based metrics as practical, integrative markers for diagnosis and monitoring. Future work should incorporate additional metrics (e.g., syntactic frequency), draw on interdisciplinary perspectives (psychology, neurology, clinical sciences), and investigate cross-linguistic generalizability beyond English.
Limitations
- Measures focused on dependency distance/direction and a subset of fine-grained features; other relevant metrics (e.g., syntactic frequency profiles) were not analyzed and could refine explanations.
- Explanations for some patterns likely extend beyond linguistics (e.g., cognitive, psychological, neurological factors) and were not directly tested here.
- Findings are based on English Cookie Theft picture descriptions from DementiaBank; cross-linguistic effects and task-generalization remain to be established.
- Although automatic parsing was human-checked (95% agreement), residual annotation errors may persist.