Introduction
Schizophrenia is a severe psychiatric disorder that is often accompanied by aggressive behaviors, which significantly affect patients, staff, and families. The prevalence of aggressive behavior in hospitalized schizophrenia patients in China ranges from 15.3% to 53.2%, with psychiatric staff frequently being the target. The negative consequences include increased medical burdens and economic strain on families, so accurate prediction and prevention of aggressive behaviors are crucial. Machine learning has proven effective in predicting various outcomes in psychiatry, including treatment response and suicide risk. This study aimed to use machine learning (specifically the Multi-Layer Perceptron, Lasso, Support Vector Machine, and Random Forest algorithms) to predict aggressive behaviors in hospitalized schizophrenia patients, thereby improving assessment, risk warning, and intervention strategies. The researchers hypothesized that machine learning models would accurately predict aggressive behaviors and would therefore be clinically useful.
Literature Review
Existing literature highlights the high prevalence and significant consequences of aggressive behaviors in individuals with schizophrenia. Studies have explored various risk factors using logistic regression and other traditional statistical methods. However, machine learning offers the potential for more accurate and nuanced predictions. Previous research has applied machine learning to predict violence in schizophrenia patients, and the best-performing algorithm has differed from study to study. For example, Wang et al. found Random Forest to perform marginally better, while Yu et al. found a neural network to be superior in predicting violence in male patients. Other studies have explored algorithms such as Gradient Boosting and Boosted Classification Trees, and their results suggest that which algorithm performs best depends on sample characteristics. This highlights the need to compare multiple algorithms in order to identify the most effective approach for predicting aggression in a diverse sample of hospitalized patients.
Methodology
This study used cluster sampling to select patients with schizophrenia hospitalized at the Second Affiliated Hospital of Xinxiang Medical University from July 2019 to August 2021. Inclusion criteria were meeting ICD-10 diagnostic criteria for schizophrenia, age 14 or older, sufficient cognitive capacity for assessment, and at least 6 months of antipsychotic medication. Exclusion criteria were intellectual disability, severe physical illness, severe mental decline, and sensory impairments. Ethical approval was obtained, and informed consent was secured from patients and guardians. Data collection included a self-administered General Condition Questionnaire, the Insight and Treatment Attitude Questionnaire (ITAQ), the Family APGAR Questionnaire, the Social Support Rating Scale (SSRS), the Family Burden Scale of Disease (FBS), and the Modified Overt Aggression Scale (MOAS), which was used to assess aggressive behaviors during hospitalization. Six psychiatric attending physicians and six psychiatric nurse practitioners conducted the assessments to ensure consistency. Data were processed using Python libraries (numpy, pandas, scikit-learn). The dataset was split into training (70%) and testing (30%) sets. Four machine learning algorithms (Multi-Layer Perceptron, Lasso, Support Vector Machine, and Random Forest) were used for model building and validation, with Bayesian optimization for hyperparameter tuning. Model performance was evaluated using accuracy, sensitivity, specificity, and the area under the ROC curve (AUC). The DeLong test was used for statistical comparisons between ROC curves. A nomogram was constructed based on the Random Forest model to facilitate clinical application.
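The four evaluation metrics named above can be illustrated with a small, dependency-free sketch. This is not the study's code: the function names, the 0.5 decision threshold, and the eight toy labels and scores are assumptions made for illustration only. The AUC is computed from its rank interpretation, that is, the probability that a randomly chosen aggressive case receives a higher score than a randomly chosen non-aggressive case, with ties counted as one half.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = aggressive).

    Assumes both classes are present, so neither denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of aggressive cases detected
        "specificity": tn / (tn + fp),  # share of non-aggressive cases cleared
    }


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the study): 3 aggressive, 5 non-aggressive.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, round(auc, 3))
```

Unlike the other three metrics, the AUC does not depend on the chosen threshold, which is one reason it is commonly used to compare models.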
Key Findings
A total of 2037 patients were included in the analysis (611 with aggressive behaviors, 1426 without). The Random Forest model exhibited the best predictive performance on the testing set, with an AUC of 0.955 (95% CI: 0.935–0.970). This was significantly higher than the AUCs of the other three models (Multi-Layer Perceptron: AUC = 0.904, Lasso: AUC = 0.901, Support Vector Machine: AUC = 0.902; p < 0.0001). The Random Forest model's feature importance analysis revealed the top 8 predictors of aggressive behaviors: Family APGAR, ITAQ, Disease Duration, History of Attacks, SSRS, Medication Adherence, Age, and FBS. A nomogram was developed based on these eight variables to provide a visual and easily interpretable tool for clinical prediction of aggressive behaviors. The analysis of a balanced dataset (equal numbers of aggressive and non-aggressive cases) showed comparable model performance, suggesting that the class imbalance did not significantly affect the results. Internal validation using 10 repetitions of 4-fold cross-validation supported the robustness of the model selection.
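The two robustness checks described above can be sketched in plain Python. The summary does not say how the balanced dataset was built or whether the folds were stratified, so the random undersampling of the majority class and the per-class (stratified) fold assignment below are assumptions, as are the function names and seeds. Only the class counts (611 and 1426) and the 10 x 4 scheme come from the text.

```python
import random


def undersample(labels, seed=0):
    """Indices of a class-balanced subset (assumed: random undersampling)."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(pos), len(neg))  # size of the smaller class
    return sorted(rng.sample(pos, n) + rng.sample(neg, n))


def repeated_stratified_kfold(labels, n_splits=4, n_repeats=10, seed=0):
    """Yield (train_idx, test_idx) pairs, n_splits * n_repeats in total."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        for cls in (0, 1):
            idx = [i for i, y in enumerate(labels) if y == cls]
            rng.shuffle(idx)  # fresh shuffle for every repetition
            for position, i in enumerate(idx):
                # Deal indices round-robin so each fold keeps the class ratio.
                folds[position % n_splits].append(i)
        for k in range(n_splits):
            test = sorted(folds[k])
            train = sorted(i for j, fold in enumerate(folds)
                           if j != k for i in fold)
            yield train, test


# Class counts reported in the summary: 611 aggressive, 1426 non-aggressive.
labels = [1] * 611 + [0] * 1426

balanced = undersample(labels)
splits = list(repeated_stratified_kfold(labels))
print(len(balanced), len(splits))
```

Each of the 40 train/test pairs would be used to fit and score every candidate model, and a model whose ranking stays stable across the pairs is less likely to owe its lead to one fortunate split.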
Discussion
This study demonstrates the potential of machine learning algorithms, particularly Random Forest, for accurately predicting aggressive behaviors in hospitalized schizophrenia patients. The high AUC value (0.955) indicates excellent discriminatory power. The identified top 8 predictors align with existing literature on risk factors for aggression in schizophrenia, highlighting the importance of family function, insight, disease duration, prior aggressive behaviors, social support, medication adherence, age, and family burden. The developed nomogram facilitates clinical application by providing a user-friendly tool for risk assessment. These findings can significantly aid in early identification of high-risk individuals, enabling timely interventions and improved safety management in psychiatric settings. Compared to traditional risk assessment tools, which are often time-consuming, this machine learning approach offers a more efficient and potentially more accurate method.
Conclusion
This study developed and validated a machine learning model, specifically a Random Forest model, for predicting aggressive behaviors in hospitalized schizophrenia patients. The model's high discriminative performance (AUC of 0.955 on the testing set) and the accompanying clinical nomogram represent a significant advancement in risk assessment and prevention strategies. Future research should expand the dataset to include more diverse populations and incorporate biological markers and genetic information to further enhance predictive accuracy. Investigating the effectiveness of interventions tailored to the identified risk factors is also crucial for improving patient outcomes and enhancing safety in psychiatric care.
Limitations
This study is limited by its single-center design, potentially restricting the generalizability of the findings. The cross-sectional nature of the data prevents causal inferences. While a balanced dataset analysis was performed, it is still important to consider the potential impact of data imbalance on the model performance with a larger dataset. Furthermore, the study did not include predisposing factors or biological indicators that could potentially improve the predictive accuracy. Future studies should address these limitations to develop more robust and comprehensive predictive models.