Psychology
Plasma proteomics discovery of mental health risk biomarkers in adolescents
I. D. S. Maciel, A. Piironen, et al.
Adolescence is marked by profound biological, psychosocial, cognitive and emotional changes, alongside dynamic brain development that creates both opportunities for cognitive improvement and vulnerability to mental disorders. Many mental disorders begin before adulthood, with onset peaking at around 14 years. Mental disorders in adolescence can lead to morbidity, mortality, and later-life dysfunction, underscoring the need to identify high-risk youths and improve early diagnostics. Globally, 10–20% of adolescents are estimated to have mental health conditions, yet most cases are underdiagnosed and undertreated due to factors such as stigma, perceptions of care needs, and limited resources; mis- or overdiagnosis can also expose youths to unnecessary treatments. This study aims to discover plasma protein-based susceptibility biomarkers that reflect the risk of developing mental health problems in adolescents, using the self-reported Strengths and Difficulties Questionnaire (SDQ) as an indicator of mental health dysfunction in a community cohort.
Prior research indicates blood-based proteomic alterations across major psychiatric disorders (e.g., depression, schizophrenia, psychosis, bipolar disorder), with commonly implicated pathways including complement cascade and interleukin signaling. Early-life changes in complement and coagulation have been associated with later psychotic disorders in adolescents. The SDQ is a validated screening tool with demonstrated predictive value for clinical diagnostics, especially when combined with parent/teacher reports, and shows good reliability for assessing behavioral problems in youths. However, SDQ is a screening measure rather than a diagnostic instrument. These insights motivate exploring plasma proteomics in adolescents to identify susceptibility biomarkers linked to mental health risk.
Study design and participants: Plasma samples were analyzed from a subsample of 91 adolescents (aged 11–16 years) from the Spanish WALNUTS cohort. Samples were collected in 2016 at approximately the same time participants completed the self-reported SDQ. Based on total SDQ scores, individuals were categorized as lower (SDQ=0–14; n=42) or raised (SDQ=15–25; n=49). Ethical approval was granted by CEIC Parc Salut Mar (2015/6026 WALNUTS; 2020/9688 Equal-Life); consent was obtained from legal guardians.
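The grouping rule above can be written as a small function. This is a minimal sketch, not the authors' code: the function name `sdq_group` is hypothetical, and rejecting scores outside the reported 0–25 range is an assumption made here because the text does not say how such scores would be handled.

```python
def sdq_group(total_score: int) -> str:
    """Assign a total SDQ score to the 'lower' or 'raised' group.

    Cut-offs follow the text: 0-14 is 'lower', 15-25 is 'raised'.
    """
    if not 0 <= total_score <= 25:
        # Assumption: scores outside the range reported in the study are
        # rejected rather than silently assigned to a group.
        raise ValueError(f"score {total_score} is outside the reported 0-25 range")
    return "lower" if total_score <= 14 else "raised"


# Hypothetical example scores, chosen to sit on both sides of the cut-off.
example_scores = [3, 14, 15, 25]
example_groups = [sdq_group(s) for s in example_scores]
```

Keeping the cut-off in one place makes it easy to check that the continuous analysis and the two-group analysis use the same definition.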
Sample handling: Whole blood was collected in K2EDTA tubes, rested 1 h, centrifuged at 2,500 × g for 20 min at 20 °C, refrigerated at 4 °C, and frozen at −80 °C within 4 h. Plasma was stored undisturbed at −80 °C until 2021 for protein depletion and proteomics.
High-abundance protein depletion: To enrich low-abundance proteins, the 14 most abundant plasma proteins were depleted using High Select Top14 Abundant Protein Depletion Mini Spin Columns (Thermo Scientific, A36370). Ten microliters of plasma were processed per sample; filtrates were stored at −20 °C.
Protein processing: Proteins were acetone precipitated, denatured in 8 M urea/50 mM Tris-HCl, reduced with 5 mM DTT, alkylated with 13 mM iodoacetamide, and digested overnight at 37 °C with trypsin (1:30 enzyme:protein). Peptides were desalted (Sep-Pak C18), dried, and stored at −20 °C.
Mass spectrometry: Peptides were dissolved in 0.1% formic acid, quantified (NanoDrop), and spiked with iRT peptides. Samples were analyzed on an Easy-nLC1200 coupled to a Q Exactive HF Orbitrap (Thermo Fisher Scientific) using a 15 cm C18 column and a 100-min gradient. DIA acquisition used one full scan (400–1,000 m/z) and 40 DIA MS/MS scans covering 400–1,000 m/z with variable windows.
Protein identification/quantification: Data were processed in Spectronaut v17.1.221229 using a directDIA approach. Parameters: Trypsin/P; up to 2 missed cleavages; fixed carbamidomethyl (C); variable acetyl (protein N-term), oxidation (M); precursor and protein FDR 1%; MS2 quantification (AUC); database Swiss-Prot 2022_05 Homo sapiens plus contaminant database; global median normalization; MaxLFQ for label-free quantification. Identified proteins with >20% missing values were excluded. Remaining missing values were imputed using the sample minimum method. Median-centering normalization (proBatch) was applied.
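The three post-processing steps described above (missing-value filter, sample-minimum imputation, median centering) can be illustrated on a toy matrix. This is a sketch under stated assumptions, not the Spectronaut or proBatch implementation: values are taken to be log2 intensities, `None` marks a missing value, and "median centering" is interpreted here as shifting each sample so that its median matches the median of all sample medians. The function name and the toy data are hypothetical.

```python
from statistics import median


def filter_impute_center(log_intensity, max_missing=0.20):
    """log_intensity maps protein -> list of log2 intensities, one per sample.

    None marks a missing value. Returns a new dict with filtered,
    imputed and median-centered values.
    """
    n_samples = len(next(iter(log_intensity.values())))

    # 1. Exclude proteins with more than max_missing fraction of missing values.
    kept = {
        protein: values
        for protein, values in log_intensity.items()
        if sum(v is None for v in values) / n_samples <= max_missing
    }

    # 2. Impute remaining gaps with the minimum observed value of that sample.
    sample_min = [
        min(values[j] for values in kept.values() if values[j] is not None)
        for j in range(n_samples)
    ]
    imputed = {
        protein: [sample_min[j] if v is None else v for j, v in enumerate(values)]
        for protein, values in kept.items()
    }

    # 3. Median centering (assumed form): align every sample median to the
    #    median of the sample medians.
    sample_median = [
        median(values[j] for values in imputed.values()) for j in range(n_samples)
    ]
    target = median(sample_median)
    return {
        protein: [v - sample_median[j] + target for j, v in enumerate(values)]
        for protein, values in imputed.items()
    }


# Toy data with hypothetical protein labels and five samples.
toy = {
    "P1": [10, 11, 12, 13, 14],
    "P2": [8, None, 9, 10, 11],      # 20% missing: kept, then imputed
    "P3": [None, None, 7, 8, 9],     # 40% missing: excluded
    "P4": [12, 9, 14, 15, 16],
}
processed = filter_impute_center(toy)
```

Filtering before imputation matters: the sample minimum is computed only over proteins that survive the 20% rule, so heavily missing proteins do not influence the imputed values.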
Statistical analysis: Associations between continuous SDQ score and protein abundances were tested using linear modeling with DEqMS (v1.16.0), adjusting for sex and age; school attended was treated as a random effect in additional analyses. Differential abundance was expressed as log2 fold change (raised vs low SDQ). P-values were adjusted using Benjamini–Hochberg.
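The Benjamini–Hochberg adjustment mentioned above can be sketched in a few lines. This illustrates only the standard step-up procedure; it does not reproduce DEqMS or its moderated statistics, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four proteins.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
```

With roughly a thousand proteins tested per sample set, an adjustment of this kind controls the expected share of false discoveries among the proteins reported as associated with SDQ.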
Bioinformatics: Enrichment used ReactomePA (hypergeometric model; BH correction), Ingenuity Pathway Analysis (canonical pathways, z-scores), and STRINGdb (v11.5) for PPI networks with fastgreedy clustering. Highly abundant proteins targeted by depletion that still appeared as significant were considered potential bias and excluded from enrichment. The full study protein list served as background for enrichment.
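The hypergeometric model behind over-representation analysis can be illustrated directly. This is a minimal sketch of the standard one-sided test, not ReactomePA's code; the function and argument names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(overlap, n_selected, pathway_size, background_size):
    """P(X >= overlap) when n_selected proteins are drawn without replacement
    from a background that contains pathway_size pathway members."""
    upper = min(n_selected, pathway_size)
    total = comb(background_size, n_selected)
    favourable = sum(
        comb(pathway_size, i) * comb(background_size - pathway_size, n_selected - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


# Small worked case: background of 10, pathway of 4, 3 selected, 2 overlap.
# (C(4,2)*C(6,1) + C(4,3)*C(6,0)) / C(10,3) = 40 / 120
p_small = hypergeom_enrichment_p(2, 3, 4, 10)
```

In the setting described here, `background_size` would be the full list of analyzed proteins and `n_selected` the SDQ-associated proteins; restricting the background to proteins that were actually measured avoids inflating enrichment for pathways that are simply well represented in plasma.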
Predictive modeling: Symbolic-regression-based QLattice (Feyn v3.0.3) with fivefold cross-validation generated low-BIC logistic models to discriminate low vs raised SDQ groups using the set of proteins significantly associated with SDQ. Models and performance (ROC/AUC) were evaluated; lowest BIC models per fold were retained.
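The two model-selection quantities named above, BIC and ROC AUC, can be computed as follows. This sketch uses the textbook definitions (BIC = k·ln(n) − 2·ln(L) with a Bernoulli likelihood, and AUC as the fraction of correctly ordered positive/negative pairs); how Feyn computes its BIC internally is not stated in the text, so agreement with its reported values is an assumption. All function names and example numbers are hypothetical.

```python
import math


def log_likelihood(labels, probs):
    """Bernoulli log-likelihood of 0/1 labels given predicted probabilities."""
    return sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(labels, probs)
    )


def bic(labels, probs, n_params):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    n = len(labels)
    return n_params * math.log(n) - 2.0 * log_likelihood(labels, probs)


def roc_auc(labels, scores):
    """Probability that a random positive scores above a random negative
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = raised SDQ) and predicted probabilities.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]
example_auc = roc_auc(y_true, y_prob)
example_bic = bic(y_true, y_prob, n_params=3)
```

BIC penalizes each additional parameter by ln(n), which with n=91 favors compact models of a few proteins over larger panels with only slightly better fit.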
- Proteome coverage: 1,485 proteins identified (N=91; mean 1,228 proteins/sample; SE=117). After removing 77 contaminants, 983 proteins were detected in ≥80% of samples and analyzed.
- Association with SDQ: 67 proteins showed a linear relationship with SDQ (48 positively, 19 negatively correlated). After excluding depleted high-abundance proteins that still appeared significant, as a potential source of bias, 58 proteins were considered significantly associated with SDQ.
- Notable proteins and directions: Positive associations included coagulation factors XI (F11), X (F10), and II (F2, thrombin); complement proteins C1q (C1QB), C1r (C1R), factor I (CFI), factor H (CFH), and C2. Negative associations included APLP1 (amyloid beta precursor-like protein 1) and RTN4 (Reticulon 4). CAMK2B was positively associated with SDQ.
- Enriched pathways/biological processes: Strong enrichment for immune responses (including complement cascade), blood coagulation/hemostasis, and processes linked to neurogenesis and neuronal degeneration.
- Predictive models (QLattice): Five diverse models (fivefold CV) using 11 total proteins achieved training AUCs of 0.93–0.95 with low BIC scores (e.g., Model 1 BIC=49.33, AUC=0.95). Model 1 combined APLP1 + (CAMK2B × RTN4); other models included PIGR, SERPINA4, CDH11; LYPD3, SERPING1; BTD; LCP1; CD9. Four of five models contained proteins previously connected to CNS/neurodevelopment.
- Cohort characteristics: 91 adolescents, ages ~11–16; SDQ stratification into low (0–14) and raised (15–25) groups; sex and age included as covariates; no significant differences across tested confounders (parental education, media/social media use, drug/alcohol use, physical activity).
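The structure reported for Model 1 in the results above, APLP1 + (CAMK2B × RTN4), can be sketched as a logistic function of an additive term and an interaction term. This is an illustration of the model form only: the weights and bias are hypothetical placeholders (the fitted values are not given in the text), and the assumption that each term carries a single weight before a logistic link is made here for simplicity and may differ from how QLattice parameterizes its expressions.

```python
import math


def model1_probability(aplp1, camk2b, rtn4, w_aplp1, w_interaction, bias):
    """Probability of belonging to the raised-SDQ group.

    Inputs are protein abundances (for example normalized log2 intensities).
    w_aplp1, w_interaction and bias are hypothetical placeholders, not
    fitted values from the study.
    """
    # Additive APLP1 term plus the CAMK2B x RTN4 interaction term.
    linear = w_aplp1 * aplp1 + w_interaction * (camk2b * rtn4) + bias
    # Logistic link maps the linear score to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-linear))


# With all weights at zero the model is uninformative.
p_null = model1_probability(1.0, 2.0, 3.0, 0.0, 0.0, 0.0)
```

The multiplicative term is what distinguishes this form from a plain three-protein logistic regression: the contribution of CAMK2B depends on the level of RTN4 and vice versa.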
The study addresses the need for early, objective indicators of adolescent mental health risk by identifying plasma proteins associated with SDQ scores. The findings align with and extend prior evidence implicating immune and coagulation pathways in psychiatric pathophysiology. Positive associations of coagulation factors and complement components with higher SDQ scores support a role for early immune/coagulative dysregulation in the predisposition to mental health problems. Proteins linked to neurodevelopment (APLP1, RTN4, CAMK2B, CDH11) were also associated with SDQ status and featured in predictive models, suggesting that neurodevelopmental and immune/coagulative processes may converge in adolescent mental health risk. The SDQ, while a screening tool rather than a diagnostic instrument, has validated predictive utility; its use here, alongside proteomics and machine learning, demonstrates potential for biomarker-based risk stratification. Consideration of confounders and inclusion of sex and age in models strengthen interpretability. Overall, the results suggest that multi-protein panels reflecting complement/coagulation and neurodevelopmental biology could improve early identification of at-risk adolescents.
This exploratory study identified plasma protein susceptibility biomarker candidates associated with self-reported SDQ scores in adolescents, with significant alterations in proteins and pathways related to immune response and complement activation, blood coagulation/hemostasis, neuronal degeneration, and neurogenesis. Predictive models combining a small number of proteins achieved high discriminatory performance between low and raised SDQ groups and included proteins with known CNS relevance. Future work should validate these biomarkers in larger, independent cohorts, include longitudinal follow-up to assess predictive value for transition to clinical mental disorders, and explore mechanistic links between immune/coagulative changes and neurodevelopment in adolescent mental health.
Main limitations include the small sample size (n=91) relative to the proteome-wide testing burden, which may limit power and generalizability, and the use of non-fasting plasma samples, as food intake can influence circulating protein levels. Although high-abundance proteins were depleted, some still appeared significant and were excluded to mitigate bias. Potential confounders (sex, age, school attended, parental education, media/social media use, substance use, physical activity) were assessed and/or adjusted for; no significant between-group differences were detected for the measured factors, but residual confounding cannot be fully excluded. The SDQ is a screening tool, not a diagnostic one, which may introduce misclassification of clinical risk. Model performance was reported on training folds; external validation is needed.