Syntactic complexity and diversity of spontaneous speech production in schizophrenia spectrum and major depressive disorders

Psychology

K. Schneider, K. Leinweber, et al.

This study by Katharina Schneider, Katrin Leinweber, and colleagues compares the syntactic complexity of spontaneous speech in patients with schizophrenia spectrum disorder, patients with major depressive disorder, and healthy controls. It reports significant group differences in syntax and relates them to the severity of psychopathology.

Introduction
The study addresses whether reductions in syntactic complexity are specific to schizophrenia spectrum disorder (SSD) or also present in major depressive disorder (MDD), and how syntactic complexity and diversity relate to language-related neuropsychology and psychopathology across these groups. Prior work shows SSD is associated with reduced speech production, verbal fluency deficits, neologisms, and less syntactic complexity, observable even early in illness. MDD speech tends toward longer latencies, reduced spontaneity, and self-focused language. Formal thought disorder (FTD) spans multiple disorders. The authors focus on subordinate clause use as a marker of syntactic complexity/diversity. They pose four questions: (1) frequency of subordinate clauses (adverbial, relative, complement, indirect questions) in German oral production across HC, MDD, SSD; (2) how SSD differs from HC and MDD in producing subordinate clauses; (3) whether lower syntactic complexity/diversity is associated with differences in language-related neuropsychology and psychopathology irrespective of diagnosis; (4) the structure of sub-networks linking syntax, language-related neuropsychology, and psychopathology by degree of syntax. Hypotheses: SSD will produce less complex speech than HC, while MDD will resemble HC; syntax will negatively relate to positive and negative symptoms.
Literature Review
The paper reviews evidence that speech abnormalities carry prognostic value in SSD and MDD and are objective, quantitative, and non-invasive biomarkers. SSD commonly shows reduced speech output and verbal fluency, word-retrieval difficulties (approximations, neologisms), and reduced sentence complexity. MDD demonstrates longer response latencies, reduced spontaneous speech, and increased first-person singular usage with negative affective content; some studies found more truncated, impersonal sentences in MDD compared to HC. Subordinate clauses facilitate coherence and complex idea expression; reduced subordination can impair communicative effectiveness. Prior NLP-based studies distinguished SSD from HC with 70–94% accuracy using various speech features, underscoring language’s diagnostic potential. Yet, specific use of different subordinate clause types in SSD and especially across diagnoses (including MDD) in German has been understudied. Relationships between syntactic features, neurocognition (executive function, working memory), and FTD have been noted, suggesting syntax reflects underlying cognitive processes.
Methodology
Design and participants: Cross-sectional study including N=112 German-speaking adults (20–67 years): SSD n=34, MDD n=38, healthy controls (HC) n=40, diagnosed via DSM-IV-TR semi-structured interview. Exclusions: verbal IQ<80, head trauma/unconsciousness, severe medical or neurological illness, current substance dependence. Ethics approval obtained; informed consent provided.
Speech elicitation: Participants described four Thematic Apperception Test (TAT) pictures for 3 minutes each, with 1-minute breaks and non-directive prompts as needed. Speech was audio-recorded and transcribed verbatim by blinded transcribers.
Linguistic analysis: Extracted metrics included tokens (total words), types (unique words), total sentences, mean length of utterance (MLU), type-token ratio (TTR), simple sentences, coordinated sentences, and 13 types of complex sentences (relative, complement, indirect questions; 10 adverbial types: temporal, local, modal, causal, conditional, adversative, final, consecutive, concessive, comparative). Coordinated sentences were not counted as complex; passive constructions were excluded.
Primary syntax measures:
- Relative sum of subordinate clauses: number of main clauses embedding subordinate clauses divided by total sentences.
- Extended relative sum of subordinate clauses: total subordinate clauses per total sentences (allowing multiple per sentence).
- Pure syntactic complexity: total subordinate clauses divided by total complex sentences only.
- Weighted sum of subordinate clauses: counts of subordinate clauses weighted by number of different subordinate clause types within main sentences.
- Syntactic diversity: number of different complex sentence types produced divided by 13.
Neuropsychology: Executive functioning and verbal fluency (VF): semantic VF (animals), phonemic VF (letter “p”), alternating VF (sports/fruit) (60 s each). Verbal episodic memory: German CVLT (VLMT).
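The ratio-based syntax measures defined above can be sketched in a few lines. This is only an illustration, not the authors' procedure (the study's syntax analysis was manual). It assumes a hypothetical input format in which each sentence is a list of subordinate-clause type labels, with an empty list standing for a simple or coordinated sentence. The weighted sum is left out because this summary does not specify the weighting precisely.

```python
from dataclasses import dataclass

# 13 complex-sentence types: relative, complement, indirect question, 10 adverbial types.
N_CLAUSE_TYPES = 13


@dataclass
class SyntaxMeasures:
    relative_sum: float           # sentences with >= 1 subordinate clause / all sentences
    extended_relative_sum: float  # subordinate clauses / all sentences
    pure_complexity: float        # subordinate clauses / complex sentences only
    diversity: float              # distinct clause types used / 13


def syntax_measures(sentences: list[list[str]]) -> SyntaxMeasures:
    """Compute four of the five measures from per-sentence clause labels.

    `sentences` is a hypothetical annotation format: one list of clause-type
    labels per sentence; an empty list means no subordinate clause.
    """
    total_sentences = len(sentences)
    if total_sentences == 0:
        raise ValueError("need at least one sentence")
    complex_sentences = [s for s in sentences if s]
    n_complex = len(complex_sentences)
    total_clauses = sum(len(s) for s in complex_sentences)
    distinct_types = {label for s in complex_sentences for label in s}
    return SyntaxMeasures(
        relative_sum=n_complex / total_sentences,
        extended_relative_sum=total_clauses / total_sentences,
        pure_complexity=total_clauses / n_complex if n_complex else 0.0,
        diversity=len(distinct_types) / N_CLAUSE_TYPES,
    )


# Made-up example transcript: 5 sentences, 3 of them complex, 6 subordinate clauses.
sample = [
    [],
    ["relative"],
    ["causal", "complement"],
    [],
    ["relative", "temporal", "relative"],
]
measures = syntax_measures(sample)
print(measures)
```

In this toy transcript the extended relative sum exceeds the relative sum because two sentences embed more than one subordinate clause, which is the distinction between the two measures.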
Psychopathology: Global Assessment of Functioning (GAF), HAM-D, HAM-A, SANS (negative symptoms), SAPS (positive symptoms), including FTD subscales. Trained raters; interrater reliability ICC>0.86.
Medication/clinical covariates: Chlorpromazine equivalents (antipsychotics), Sackeim score (antidepressants), medication load index (antidepressants, antipsychotics, mood stabilizers), number/duration of hospitalizations, duration of current episode, age, sex, education.
Statistical analyses: Group differences in syntactic measures via one-way ANOVA or Kruskal–Wallis as appropriate; correlations tested associations with medication, illness duration/severity, demographics.
Classification: SVM with linear kernel; three binary tasks (HC vs SSD, HC vs MDD, SSD vs MDD), 2-fold cross-validation with 200 repetitions; permutation testing (1000 label shuffles) for significance.
Cluster analysis: Unsupervised random forest proximity clustering on five syntax metrics; number of clusters selected via BIC; MANCOVA interactions tested diagnosis-by-cluster effects; clusters compared on medication/clinical variables and on neuropsychology/psychopathology (ANOVA/Kruskal–Wallis).
Network analyses: Gaussian Graphical Model with EBICglasso (tuning 0.25), bootstrapped (1000 permutations); nodes: 5 syntax, 4 neuropsychology, 2 psychopathology; edges are regularized partial correlations; centrality metrics (betweenness, closeness, strength, expected influence) computed; Fruchterman–Reingold layout.
Software: JASP, MATLAB, and R (bootnet, qgraph).
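The evaluation scheme described under Classification (repeated 2-fold cross-validation plus a label-permutation test) can be sketched as follows. To stay dependency-free, this sketch swaps in a nearest-centroid classifier where the study used a linear-kernel SVM, so it illustrates the validation logic only. The stratified splitting, the `(count + 1) / (n + 1)` p-value convention, the function names, and the synthetic data are all assumptions, not details from the study.

```python
import random
from statistics import mean


def nearest_centroid_accuracy(train, test):
    """Fit per-class centroids on `train`; return accuracy on `test`.

    Stand-in classifier (NOT the study's linear SVM). Items are (features, label).
    """
    centroids = {}
    for label in sorted({y for _, y in train}):
        rows = [x for x, y in train if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def predict(x):
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )

    return mean(1.0 if predict(x) == y else 0.0 for x, y in test)


def repeated_two_fold_accuracy(X, y, repeats, rng):
    """Mean accuracy over `repeats` random stratified 2-fold splits."""
    scores = []
    for _ in range(repeats):
        fold_a, fold_b = [], []
        for label in sorted(set(y)):
            idx = [i for i, lab in enumerate(y) if lab == label]
            rng.shuffle(idx)
            half = len(idx) // 2
            fold_a += idx[:half]
            fold_b += idx[half:]
        # Each fold serves once as training set and once as test set.
        for train_idx, test_idx in ((fold_a, fold_b), (fold_b, fold_a)):
            train = [(X[i], y[i]) for i in train_idx]
            test = [(X[i], y[i]) for i in test_idx]
            scores.append(nearest_centroid_accuracy(train, test))
    return mean(scores)


def permutation_p_value(X, y, repeats, n_permutations, rng):
    """Compare observed accuracy with accuracies obtained under shuffled labels."""
    observed = repeated_two_fold_accuracy(X, y, repeats, rng)
    at_least_as_good = 0
    for _ in range(n_permutations):
        shuffled = list(y)
        rng.shuffle(shuffled)
        if repeated_two_fold_accuracy(X, shuffled, repeats, rng) >= observed:
            at_least_as_good += 1
    return observed, (at_least_as_good + 1) / (n_permutations + 1)


# Synthetic, well-separated one-feature data; these are NOT the study's data.
rng = random.Random(0)
X = [[rng.gauss(0.3, 0.05)] for _ in range(20)] + [[rng.gauss(0.6, 0.05)] for _ in range(20)]
y = [0] * 20 + [1] * 20
accuracy, p_value = permutation_p_value(X, y, repeats=5, n_permutations=100, rng=rng)
print(accuracy, p_value)
```

The study's settings (200 repetitions, 1000 shuffles) would be passed as `repeats=200, n_permutations=1000`; smaller values are used here only to keep the example fast.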
Key Findings
- Syntax group differences: SSD showed significantly reduced syntactic complexity and diversity compared with both MDD and HC; MDD did not differ significantly from HC across the core syntax measures.
  • Relative sum of subordinate clauses: SSD 0.33±0.11; MDD 0.43±0.13; HC 0.41±0.12; p<0.001; η²=0.120.
  • Extended relative sum: SSD 0.48±0.23; MDD 0.71±0.33; HC 0.68±0.28; p=0.002; η²=0.109.
  • Pure syntactic complexity: SSD 1.43±0.26; MDD 1.62±0.37; HC 1.64±0.28; p=0.008; η²=0.085.
  • Weighted sum of subordinate clauses: SSD 0.74±0.45; MDD 1.26±0.91; HC 1.21±0.65; p=0.003; η²=0.100.
  • Syntactic diversity: SSD 0.52±0.13; MDD 0.62±0.13; HC 0.63±0.14; p=0.002; η²=0.111.
  • SSD produced a higher proportion of simple sentences (0.35±0.09) and a lower proportion of coordinated sentences (0.48±0.13) than MDD/HC (both p<0.001), and fewer relative, modal, final, and complement clauses in pairwise contrasts (p-values 0.017–0.038 as reported).
  • MLU: SSD 13.8±3.74 vs MDD 18.76±4.74 and HC 17.91±4.18; p<0.001; η²=0.202.
- Neuropsychology and psychopathology (Table 1): SSD had lower semantic VF (18.94±5.45) and alternating VF (11.63±3.75) and lower verbal episodic memory (46.16±8.23) than MDD/HC (all p≤0.001). SSD showed higher SANS and SAPS scores than MDD/HC across subscales (all p<0.001); only SANS total also differed between MDD and HC.
- Covariates: No significant associations of syntax measures with current medication indices, duration/severity of illness, age, or sex (all ps>0.05). Syntactic diversity correlated positively with years of education (r=0.33, p<0.001).
- Classification (SVM on syntax metrics): HC vs SSD accuracy 0.66 (p<0.004); HC vs MDD 0.51 (p=0.35, ns); SSD vs MDD 0.63 (p<0.005), indicating limited diagnostic utility of syntax alone.
- Cluster analysis (unsupervised, BIC=294.88): Four transdiagnostic clusters (n=39, 19, 20, 34) spanning extremely, very, moderately, and slightly complex speech.
  • Extremely complex cluster composition: 45% HC, 45% MDD, 10% SSD.
  • Slightly complex cluster: 58.8% SSD, 23.5% MDD, 17.6% HC.
  • No diagnosis-by-cluster interaction for syntax measures (all ps>0.05). Clusters did not differ on medication/illness variables, age, or sex; education differed between extremely vs slightly complex clusters (p=0.005).
  • Lower syntax clusters showed poorer executive functioning, global functioning, and verbal episodic memory, and more pronounced positive and negative symptoms.
- Network analyses:
  • In the full-sample network, domain-specific associations were stronger than cross-domain links; the extended relative sum of subordinate clauses had the highest expected influence (EI=2.29) and strength (S=2.25), followed by relative sum (S=0.87) and pure complexity (S=0.85).
  • Across cluster-based networks, syntactic complexity measures were tightly interrelated, whereas syntactic diversity tended to be a separate node. Cross-domain associations (syntax–neuropsychology–psychopathology) were more salient in clusters with higher syntactic complexity; links were weak or absent in lower syntax clusters. Both positive and negative FTD related to syntax in most clusters (except the moderately complex cluster).
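The η² effect sizes reported for the syntax group differences can be roughly cross-checked from the summary statistics alone, using η² = SS_between / (SS_between + SS_within). The sketch below makes two assumptions that this summary does not confirm: that the ± values are standard deviations, and that every participant (SSD n=34, MDD n=38, HC n=40) contributed to each measure. Because the inputs are rounded, and some comparisons may have used Kruskal–Wallis rather than ANOVA, it should only be expected to land near the reported values.

```python
def eta_squared_from_summary(groups):
    """One-way ANOVA eta squared from (n, mean, sd) tuples, one per group."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / total_n
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    # Within-group sum of squares, recovered from each group's sample SD.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    return ss_between / (ss_between + ss_within)


# Relative sum of subordinate clauses: (n, mean, sd) for SSD, MDD, HC.
# Group sizes are assumed to equal the full sample sizes.
relative_sum = [(34, 0.33, 0.11), (38, 0.43, 0.13), (40, 0.41, 0.12)]
eta2 = eta_squared_from_summary(relative_sum)
print(round(eta2, 3))
```

For the relative sum this reconstruction gives roughly 0.11, in the neighborhood of the reported η²=0.120; the gap is consistent with rounding in the published means and SDs.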
Discussion
Findings confirm that reduced syntactic complexity and diversity are characteristic of SSD relative to MDD and HC, supporting syntax as a language-domain feature associated with schizophrenia. The absence of HC–MDD differences suggests that diminished syntactic complexity is not a general feature of depressive disorder in this sample, despite other known depressive speech characteristics. Although syntax-based classification modestly separated SSD from HC/MDD, performance was insufficient for clinical diagnosis, arguing for dimensional and transdiagnostic perspectives rather than categorical classification. The unsupervised clustering revealed transdiagnostic strata of syntactic performance: participants with higher syntactic complexity displayed better executive functioning and verbal episodic memory and fewer negative symptoms, while those with lower syntax showed worse cognition and more severe positive/negative symptoms. This aligns with literature linking syntactic complexity reductions to negative symptoms and cognitive deficits, indicating that syntax reflects broader neurocognitive and psychopathological burden. Network analyses further showed that syntactic complexity metrics form a cohesive sub-network and that syntactic diversity, although related, is partially distinct, suggesting separate underlying mechanisms. Negative formal thought disorder and verbal episodic memory emerged as influential nodes mediating connections across domains. Cross-domain coupling was stronger where syntax was more complex, implying that richer syntactic production may integrate with cognitive and symptom dimensions more extensively. Overall, the results emphasize syntax as a meaningful, quantifiable language marker tied to cognitive and clinical features, but not sufficient alone for diagnostic classification.
Conclusion
Reduced syntactic complexity and diversity in SSD, but not in MDD compared to HC, were demonstrated in spontaneous speech. Lower syntax aligned with worse executive functioning, verbal fluency, and verbal episodic memory, and with elevated positive and negative FTD. Unsupervised clustering identified transdiagnostic groups differing in syntax and in cognitive and psychopathological profiles, and network analyses highlighted distinct roles for syntactic complexity and diversity and their links to FTD and memory. Future work should use larger samples to improve stability and effect size estimation, examine lifetime medication effects, and extend analyses to written language and additional NLP measures to enhance clinical utility.
Limitations
- Modest sample size and heterogeneity within clinical groups may limit generalizability and effect size precision.
- Cross-sectional design precludes causal inference.
- Education differed between the extremely and slightly complex clusters, potentially influencing syntax.
- No detected medication effects, but lifetime medication exposure could not be ruled out.
- Manual syntax analysis limits efficiency and comparability versus automated NLP, though it allowed in-depth assessment.