Development of prediction models for screening depression and anxiety using smartphone and wearable-based digital phenotyping: protocol for the Smartphone and Wearable Assessment for Real-Time Screening of Depression and Anxiety (SWARTS-DA) observational study in Korea

Y. Shin, A. Y. Kim, et al.

This study uses smartphones and consumer wearables to develop machine-learning algorithms that detect depressive and anxiety disorders and classify symptom severity from passive and active digital biomarkers collected over four weeks in up to 2,500 adults in South Korea. Authors: Yu-Bin Shin, Ah Young Kim, Seonmin Kim, Min-Sup Shin, Jinhwa Choi, Kyung Lyun Lee, Jisu Lee, Sangwon Byun, Sujin Kim, Heon-Jeong Lee, Chul-Hyun Cho.

Introduction
Depression and anxiety have high global prevalence and frequently co-occur, worsening prognosis and increasing societal and healthcare burdens. The COVID-19 pandemic further elevated rates. Current screening relies on self-report instruments (eg, PHQ-9, GAD-7), which are time-consuming and prone to recall and social desirability biases. Digital phenotyping via smartphones and wearables can provide objective, continuous assessment in naturalistic settings and potentially enable automated, frequent monitoring. Prior studies show promise using smartphone usage, mobility (GPS), actigraphy and app patterns to predict depression and, to a lesser extent, anxiety; however, limitations include small samples, right-skewed symptom distributions, variable device feasibility and limited integration of wearable-derived physiological and sleep data. Given high comorbidity of depression and anxiety, models capable of differentiating depression, anxiety or both are needed. The SWARTS-DA study aims to build large, combined smartphone and wearable digital phenotyping datasets to develop robust machine-learning models to identify individuals at risk and classify symptom severity, examine associations with clinical measures and assess feasibility of intensive 4-week data collection.
Literature Review
Prior work demonstrates that machine-learning models using smartphone metadata (call/SMS logs, app usage) can screen for depression and classify symptom severity. Features such as time on phone, screen events and internet/app usage, as well as gyroscope and GPS-derived mobility features, have shown predictive utility for depression across multiple studies. Anxiety-focused digital phenotyping is less explored; some studies have linked usage of information, gaming and health/fitness apps, actigraphy and messaging time to GAD severity. Wearable passive sensing extends digital biomarkers to physiology (heart rate, heart rate variability), sleep stages/efficiency/latency and circadian rhythm metrics (MESOR, amplitude, acrophase; relative amplitude, interdaily stability, intradaily variability). Recent digital phenotyping models report AUROC performance ranging from about 0.65 to 0.86, suggesting potential but with limitations: small samples, symptom distributions skewed toward minimal severity, limited wearable integration and feasibility issues (eg, finger-worn devices). The high co-occurrence of depression and anxiety underscores the need to model and differentiate comorbidity classes. Integrating multimodal smartphone and wearable data at scale may improve prediction and understanding of symptomatology.
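The circadian metrics named above (MESOR, amplitude, acrophase) are conventionally estimated by fitting a single-component cosinor model to a physiological time series such as heart rate. A minimal sketch of that standard least-squares fit (the synthetic heart-rate series and function name are illustrative, not taken from the protocol):

```python
import numpy as np

def cosinor_fit(t_hours, y, period=24.0):
    """Fit y(t) = MESOR + A*cos(omega*t + phi) with omega = 2*pi/period.

    Expanding the cosine gives a linear model in [1, cos(omega*t), sin(omega*t)],
    so MESOR, amplitude A and acrophase phi come from ordinary least squares.
    """
    omega = 2 * np.pi / period
    X = np.column_stack([np.ones_like(t_hours),
                         np.cos(omega * t_hours),
                         np.sin(omega * t_hours)])
    mesor, beta, gamma = np.linalg.lstsq(X, y, rcond=None)[0]
    amplitude = np.hypot(beta, gamma)       # A = sqrt(beta^2 + gamma^2)
    acrophase = np.arctan2(-gamma, beta)    # radians; beta = A*cos(phi), gamma = -A*sin(phi)
    return mesor, amplitude, acrophase

# Synthetic 3-day hourly heart-rate series with a known 24 h rhythm
t = np.arange(0, 72, 1.0)
y = 70 + 5 * np.cos(2 * np.pi * t / 24 - np.pi / 2)
m, a, phi = cosinor_fit(t, y)
```

The nonparametric metrics (relative amplitude, interdaily stability, intradaily variability) are computed from the raw activity series with separate formulas rather than from this fit.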
Methodology
Design: Cross-sectional observational study collecting multimodal digital data over 4 weeks to develop screening algorithms for depression and anxiety in the general population.

Setting and recruitment: Korea University Anam Hospital; community and hospital-based recruitment via flyers, posters, online ads, referrals and community presentations across South Korea. Enrollment target up to 2500 (minimum 1000) adults aged 19–59; recruitment began April 2024 and will run for up to 2 years. Remote participation via Zoom, live streaming or chat apps is permitted to broaden reach.

Inclusion criteria: age 19–59; informed consent; compatible smartphone (iOS 16.4+ or Android 18.2+); optional smartwatch (WatchOS 9.3+ or Galaxy Watch 6+); general technological literacy.

Exclusion criteria: inability to consent; epilepsy/seizure disorder; schizophrenia/psychosis; intellectual disability; dementia; concurrent interventional studies that could interfere.

Procedures: Participants install the PixelMood app (Batoners, Inc) on iOS/Android; if available, they continue using an Apple Watch or Galaxy Watch, and loaner devices may be provided when possible. Wearables should be worn continuously (except while showering or charging). The app presents daily/weekly/monthly tasks, with push notifications prompting completion. Data monitoring occurs via an online portal; privacy safeguards exclude collection of personal identification, app content, internet use details, and call/text content.

Schedule: Enrolment—eligibility, consent, socio-demographics, medical history, app setup. Daily—mood log (affect, energy, anxiety, irritation) and lifestyle (alcohol/caffeine intake, smoking, meals, exercise, menstruation for women). Weekly—PHQ-9 and PROMIS Depression; GAD-7 and PROMIS Anxiety (administered biweekly and at exit). Four-weekly—comprehensive psychological/clinical battery. Endpoint—app uninstallation and device return (if applicable).

Active measures: Socio-demographics (age, sex, height, weight, education, occupation, marital status, menopausal status).
Medical and psychiatric history, current medications, lifetime suicide attempt, family psychiatric history.

Depression: PHQ-9 (0–27; severity bands: 0 none, 1–4 minimal, 5–9 mild, 10–14 moderate, 15–19 moderately severe, 20–27 severe); PROMIS Depression SF 8b (raw 8–40; T-scores: <55 none–slight, 55–59 mild, 60–69 moderate, ≥70 severe).

Anxiety: GAD-7 (0–21; 0–4 minimal, 5–9 mild, 10–14 moderate, 15–21 severe; cut-off ≥10 for potential GAD); PROMIS Anxiety SF 7a with T-score interpretation aligned to the Depression SF.

Daily mood states: affect and energy on a 7-point scale (−3 extremely negative to +3 extremely positive); anxiety and irritation on a 4-point scale (0–3).

Psychological/clinical battery (administered once): SADS, PDSS-SR, MOCI, PC-PTSD-5, PSS, BRS, ULS-8, ASI-3, TAS, DII, ISI, KtCS (chronotype), SWLS.

Passive smartphone measures: accelerometer (1 Hz), gyroscope/magnetometer (as detected), GPS (every 15 min), distance (every 10 min), step count (event-driven, ~10 min), activity states (stationary/walking/running/automotive/cycling/unknown), screen lock/unlock, app usage, call/text logs (frequency/duration; not content), Wi-Fi SSID/BSSID and signal strength (every 10 min), ambient light (as detected), weather (hourly), air quality (hourly).

Passive wearable measures (Apple Watch/Galaxy Watch, if used): sleep states/stages (inBed, asleep core/deep/REM/awake/unspecified), heart rate (every 5 s), heart rate variability (ms), respiratory rate, oxygen saturation (%), skin/body temperature, steps, accelerometer (1 Hz), active energy (kcal).

Data handling: Continuous secure upload to a cloud database with strict privacy protocols and GDPR compliance.

Analysis plan: Preprocessing (outlier handling, smoothing, normalization, multiple imputation; temporal treatment and feature engineering for activity, heart rate, usage patterns and circadian features).
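The PHQ-9 and GAD-7 bands and ≥10 cut-offs above, together with the four-class comorbidity outcome in the analysis plan, reduce to a small labeling step applied during preprocessing. A sketch (function names are illustrative, not from the protocol):

```python
def phq9_severity(total: int) -> str:
    """Map a PHQ-9 total (0-27) to the protocol's severity band."""
    if not 0 <= total <= 27:
        raise ValueError("PHQ-9 total must be in 0-27")
    if total == 0:
        return "none"
    bands = [(4, "minimal"), (9, "mild"), (14, "moderate"),
             (19, "moderately severe"), (27, "severe")]
    return next(label for upper, label in bands if total <= upper)

def gad7_severity(total: int) -> str:
    """Map a GAD-7 total (0-21) to its severity band; >=10 flags potential GAD."""
    if not 0 <= total <= 21:
        raise ValueError("GAD-7 total must be in 0-21")
    bands = [(4, "minimal"), (9, "mild"), (14, "moderate"), (21, "severe")]
    return next(label for upper, label in bands if total <= upper)

def comorbidity_class(phq9: int, gad7: int) -> str:
    """Four-class target: neither / depression-only / anxiety-only / both."""
    dep, anx = phq9 >= 10, gad7 >= 10
    if dep and anx:
        return "both"
    if dep:
        return "depression-only"
    if anx:
        return "anxiety-only"
    return "neither"
```

The same two ≥10 thresholds also yield the primary binary screening targets, so all three target types derive from one pair of weekly scale totals.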
Correlation analysis (Pearson/Spearman) between digital biomarkers and PHQ-9/GAD-7 to identify associations and interactions. Feature selection using random forest to identify predictive variables and handle nonlinear interactions.

Model development: Primary binary targets—clinically significant depression (PHQ-9 ≥ 10) and anxiety (GAD-7 ≥ 10). Secondary—four-class comorbidity outcome (neither, depression-only, anxiety-only, both). Tertiary—ordinal/multiclass severity prediction for each scale. Algorithms: random forests, support vector machines, gradient boosting machines, deep neural networks. Training with 10-fold participant-level cross-validation; class-weighted losses and oversampling for imbalance (especially the 'both' class).

Evaluation: AUROC and F1-score, with sensitivity and specificity reported. Anomaly detection and longitudinal analyses to detect deviations from baseline and temporal trends indicative of symptom progression.
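Participant-level cross-validation keeps all observations from one person in a single fold, so evaluation reflects generalization to unseen individuals rather than to unseen weeks of a known individual. A sketch with scikit-learn on synthetic data (the feature matrix, group layout and random-forest settings are illustrative assumptions, not the study's actual pipeline):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import GroupKFold

rng = np.random.default_rng(0)

# Hypothetical data: one row per participant-week of engineered features
n_participants, weeks, n_features = 60, 4, 8
groups = np.repeat(np.arange(n_participants), weeks)
X = rng.normal(size=(n_participants * weeks, n_features))
# Simulated binary target (e.g. PHQ-9 >= 10), correlated with one feature
y = (X[:, 0] + rng.normal(scale=1.0, size=len(groups)) > 0.5).astype(int)

aucs, f1s = [], []
# GroupKFold guarantees no participant appears in both train and test
for train_idx, test_idx in GroupKFold(n_splits=10).split(X, y, groups):
    clf = RandomForestClassifier(
        n_estimators=200, class_weight="balanced", random_state=0
    )
    clf.fit(X[train_idx], y[train_idx])
    prob = clf.predict_proba(X[test_idx])[:, 1]
    aucs.append(roc_auc_score(y[test_idx], prob))
    f1s.append(f1_score(y[test_idx], clf.predict(X[test_idx])))

print(f"mean AUROC {np.mean(aucs):.2f}, mean F1 {np.mean(f1s):.2f}")
```

`class_weight="balanced"` covers the class-weighting part of the plan; the oversampling the protocol also mentions (e.g. for the rare 'both' class) would be applied separately to the training folds only.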
Key Findings
This article is a study protocol; no empirical results are reported. Primary outcome is to develop and evaluate machine-learning algorithms to predict clinically significant depression (PHQ-9 ≥ 10) and anxiety (GAD-7 ≥ 10) using multimodal smartphone and wearable digital biomarkers. Secondary outcomes include quantifying associations between digital biomarkers and clinical measures and assessing feasibility/acceptability of intensive 4-week data collection in a large sample (target up to 2500). Models will be evaluated via participant-level 10-fold cross-validation using AUROC and F1-score, with sensitivity and specificity reported.
Discussion
The protocol outlines a comprehensive digital phenotyping approach integrating active self-report and passive smartphone and wearable sensing to address low detection rates of depression and anxiety. By incorporating sleep, physiological and circadian data alongside mobile usage and activity features, the study aims to build robust models across iOS and Android platforms that can differentiate depression, anxiety and their comorbidity. Ethical, legal and regulatory considerations (privacy, consent, secure data handling) are central to the design, supporting responsible implementation of AI in healthcare. Clinically, the models could enable community-level screening, augment clinical decision support with daily-life phenotypes, and inform personalized digital therapeutics. Strengths include the large sample size and rich multimodal data; limitations such as selection bias toward users of compatible smartphones, the short monitoring duration, device heterogeneity, and reliance on self-report scales rather than clinician-administered interviews may affect generalizability. Integrating wearable-derived sleep and physiology is expected to improve understanding of pathological development and enhance prediction of symptom severity.
Conclusion
This protocol presents the SWARTS-DA study to develop and validate multimodal machine-learning models for screening depression and anxiety using smartphone and wearable digital phenotyping over 4 weeks in a large community sample. It details recruitment, measures, data streams, preprocessing, modeling strategies and evaluation, with emphasis on privacy and ethical compliance. The study is positioned to advance early detection, inform clinical decision-support and personalized interventions. Future work will include extended validation, improved data imputation, longer monitoring periods, and refined integration/normalization across devices and platforms to enhance generalizability and clinical translation.
Limitations
Selection bias due to inclusion of only users with compatible smartphones and, optionally, specific smartwatch models; cultural factors specific to South Korea may limit global applicability; potential low number of target symptomatic participants within the completed sample may restrict generalizability across varying symptom profiles; reliance on self-reported screening scales (PHQ-9, GAD-7) rather than structured clinical interviews may reduce diagnostic accuracy; challenges with participant compliance in continuous wearable use; a relatively short 4-week data collection window may not capture longer-term patterns; potential data loss and inconsistency due to use of commercial non-clinical devices; analytical challenges integrating and normalizing heterogeneous data across operating systems, smartphone models and wearable devices.