Introduction
Post-traumatic stress disorder (PTSD) is more prevalent among active-duty military personnel than in the general population. Repeated exposure to life-threatening situations and the requirement to engage in combat distinguish deployment-related PTSD from civilian trauma. Understanding and mitigating modifiable risk factors for deployment-related PTSD is crucial to reducing individual and societal burdens, including the suffering associated with PTSD symptoms and comorbidities such as depression, substance abuse, chronic pain, and sleep disturbances. Although researchers were initially skeptical, recent work suggests that pre-deployment risk factors can be identified and mitigated. Prior studies have highlighted inflammation, metabolomic alterations, epigenetically altered networks, polygenic risk scores, neurocognitive dysfunction, and self-reported symptoms as potential predictors of PTSD development in soldiers. This study aimed to use a comprehensive multivariable dataset of biological, clinical, and neurocognitive variables, collected before deployment, to predict PTSD symptom development and diagnosis after deployment. Machine learning, specifically random forest (RF) and support vector machine (SVM) algorithms, provided a data-driven way to analyze this complex, multidimensional dataset, addressing limitations of traditional statistical methods and accounting for potential non-linear relationships and interactions among predictors. The study hypothesized that a combination of pre-deployment factors could predict PTSD symptom trajectories and diagnosis, which could improve deployment readiness assessments and support the development of targeted preventative interventions.
Literature Review
Existing literature associates several pre-deployment factors with increased risk of PTSD in military personnel. These include alterations in inflammatory and metabolomic markers, epigenetically altered networks, polygenic risk scores, neurocognitive dysfunction, and pre-deployment self-reported symptoms such as nightmares and mental health status. However, most previous studies examined these factors in isolation. This research aimed to build on that work by using machine learning to analyze a comprehensive set of pre-deployment predictors simultaneously, accounting for their complex interactions and non-linear relationships. Recent reviews have highlighted the potential of machine learning in PTSD resilience research, showing that it can identify combinations of risk factors that are highly predictive of PTSD and that traditional statistical approaches struggle to uncover.
Methodology
This prospective, longitudinal, naturalistic cohort study followed 473 active-duty Army personnel from the 101st Airborne at Fort Campbell, Kentucky, before and after a 10-month deployment to Afghanistan. Data were collected in three phases: pre-deployment, 3 days post-deployment, and 90-180 days post-deployment. The dataset included demographic information, self-reported symptoms (PCL-5, PHQ-8, GAD-7, AUDIT, PSQI, DRRI-2, and CSI), neurocognitive assessment (WebNeuro), and multi-omics blood markers (genome-wide association study (GWAS) data for polygenic risk scores (PRS), plus epigenomic, metabolomic, endocrine, inflammatory, and routine clinical lab tests). Latent growth mixture modeling (LGMM) was used to identify distinct PTSD symptom trajectories, classifying participants into "increasing" or "resilient" trajectories. Random forest (RF) and support vector machine (SVM) algorithms were trained on the pre-deployment data to predict both trajectory membership and provisional PTSD diagnosis (defined as a PCL-5 score ≥ 31) at the 90-180-day post-deployment assessment. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and precision-recall curves. Missing values were imputed using bagged decision trees, and stratified random sampling was used to create training (75%) and test (25%) datasets. Model parameters were tuned using bootstrap resampling and random search to limit overfitting. Predictor importance was ranked using a permutation procedure.
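The labeling, splitting, and threshold-metric steps described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the study's code: the function names, the random seed, and the toy labels are hypothetical, and only the PCL-5 cutoff of 31 and the 75/25 stratified split come from the text.

```python
import random

PCL5_CUTOFF = 31  # provisional PTSD diagnosis threshold stated in the text


def provisional_diagnosis(pcl5_score):
    """Return 1 if the PCL-5 total meets the cutoff, else 0."""
    return 1 if pcl5_score >= PCL5_CUTOFF else 0


def stratified_split(labels, train_frac=0.75, seed=0):
    """Split indices into train/test sets, preserving class proportions.

    Hypothetical helper: each class is shuffled separately and then cut
    at train_frac, so both sets keep roughly the same class balance.
    """
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_frac)
        train.extend(idx[:cut])
        test.extend(idx[cut:])
    return sorted(train), sorted(test)


def sensitivity_specificity(y_true, y_pred):
    """Compute sensitivity (true positive rate) and specificity (true negative rate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy example with made-up labels (not study data): 20 positives, 80 negatives.
toy_labels = [1] * 20 + [0] * 80
train_idx, test_idx = stratified_split(toy_labels)
print(len(train_idx), len(test_idx))
```

Splitting within each class, rather than across the whole sample, matters when the positive class is small, because a plain random split could leave the test set with very few positive cases.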
Key Findings
The machine learning models showed high discriminatory power in predicting both PTSD symptom trajectories and provisional PTSD diagnosis from pre-deployment data. The random forest model achieved an AUC of 0.85 (95% CI: 0.75-0.96) for predicting PTSD symptom trajectories and an AUC of 0.78 (95% CI: 0.67-0.89) for predicting provisional PTSD diagnosis. The support vector machine model performed better, with AUCs of 0.87 and 0.88, respectively. The top-ranked predictive features included pre-deployment sleep quality, anxiety, depression, sustained attention, and cognitive flexibility. Blood-based biomarkers, including metabolites, epigenomic markers, immune and inflammatory markers, and liver function markers, also contributed to the predictive models. Participants on the "increasing" PTSD symptom trajectory experienced significantly more traumatic events during deployment than those on the "resilient" trajectory.
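To make the reported metric concrete, the sketch below computes an AUC and an approximate 95% confidence interval in plain Python. It is an illustration only: the summary does not state how the study's intervals were obtained, so the percentile bootstrap used here is an assumption, and the function names, seed, and number of resamples are hypothetical.

```python
import random


def auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, scores, n_boot=2000, alpha=0.05, seed=0):
    """Approximate percentile-bootstrap confidence interval for the AUC.

    Assumption: cases are resampled with replacement, and resamples that
    contain only one class are discarded and redrawn.
    """
    auc(y_true, scores)  # raises early if only one class is present
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Made-up labels and scores for illustration (not study data).
y = [0, 0, 1, 1, 0, 1, 0, 1, 0, 0]
s = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.9, 0.3, 0.15]
print(auc(y, s), bootstrap_auc_ci(y, s, n_boot=500))
```

With a small test set, intervals computed this way tend to be wide, which is consistent with the width of the intervals reported for the random forest model.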
Discussion
These findings demonstrate the feasibility of using pre-deployment data to predict PTSD symptom development in active-duty military personnel. The high discriminatory power of the machine learning models suggests that integrating diverse biological, clinical, and neurocognitive measures can support the prediction of PTSD risk. The identified pre-deployment risk factors, particularly those related to mental health (anxiety, depression) and cognitive functioning (attention, cognitive flexibility), highlight potential targets for pre-deployment interventions. The contribution of blood-based biomarkers offers a perspective on the biological underpinnings of PTSD risk and may open avenues for developing biomarkers for early risk detection. These findings have implications for improving deployment readiness assessments and tailoring preventative interventions to reduce the incidence of PTSD.
Conclusion
This study showed that post-deployment PTSD symptom trajectories and provisional diagnosis can be predicted from pre-deployment data. The discriminatory power of the machine learning models highlights their potential for identifying at-risk individuals before deployment. Future research could validate these findings in larger and more diverse samples and investigate the efficacy of targeted pre-deployment interventions based on the identified risk factors.
Limitations
The study's sample size, while substantial, may limit the generalizability of the findings. The primarily male sample may limit generalizability to female soldiers. The study relied in part on self-reported data, which are potentially subject to recall bias and response bias. Further research should address these limitations by examining larger, more diverse cohorts and integrating objective measures to minimize bias.