Patterns of activity correlate with symptom severity in major depressive disorder patients

S. Spulber, F. Elberling, et al.

This study examines the link between activity patterns and depression symptom severity in patients with major depressive disorder (MDD) who are not taking antidepressants. The research, conducted by S. Spulber, F. Elberling, J. Svensson, M. Tiger, S. Ceccatelli, and J. Lundberg, finds that higher depression severity is associated with less complex activity patterns and stronger coupling to external circadian cues, and points to actigraphy's potential in evaluating MDD patients.

Introduction
The study investigates whether features derived from wrist actigraphy recordings correlate with depression symptom severity, measured by the Montgomery-Åsberg Depression Rating Scale (MADRS), in adults with major depressive disorder (MDD) prior to treatment. Prior literature shows MDD patients often have lower overall activity, shorter diurnal activity periods and bouts, and flattened circadian fluctuations. Symptom severity has been linked to physical activity intensity and sedentary behavior, and exercise can reduce symptoms. Biological links include alterations in circadian clock genes and coupling between central and peripheral oscillators in depression. However, direct correlations between detailed activity pattern features and symptom severity have been understudied. This work applies non-parametric and non-linear actigraphy feature extraction and trains linear models to predict MADRS, aiming to provide proof-of-concept that activity patterns reflect symptom severity and to interpret the biological significance of key features.
Literature Review
Background literature highlights: (1) Widespread use of actigraphy with established feature engineering and biological interpretation; (2) Psychiatric populations show distinct rest-activity alterations versus controls and across disorders; (3) In MDD, lower overall activity, shorter active periods, and blunted circadian amplitude are common; (4) Symptom severity correlates with moderate-intensity activity and sedentary bout metrics; (5) Exercise interventions are effective antidepressants; (6) Clock gene alterations and weakened coupling of circadian oscillators are associated with depression. These support exploring actigraphy-derived, sequence-dependent features (e.g., detrended fluctuation scaling, intradaily variability, interdaily stability, relative amplitude) for association with depressive symptom severity.
Methodology
Design: Secondary analysis of actigraphy data from two independent clinical studies of adults with an ongoing MDD episode, with no concurrent antidepressant treatment during the recording period.
Ethics approvals: Swedish Research Ethics Committee (Dnr. 2017/799-31; Dnr. 2014/452-31); informed consent obtained.
Datasets:
- Training dataset: From a published CBT study. Inclusion: DSM-IV MDD with at least one prior episode, MADRS 18–35, no psychopharmacological treatment for MDD. N=12 subjects with ≥7 consecutive recording days immediately before CBT; total recording length 6–12 days. Device: GENEActiv Original wrist actigraph (3D accelerometer up to 8 g; 3.9 mg resolution) at 30 Hz; non-dominant wrist; continuous wear. Processing: Raw data downloaded with proprietary software; processed in Matlab using a modified geneactivReader. Steps: compute the Euclidean norm of the acceleration change; smooth with a 1 s rolling Gaussian (30 samples); apply a high-pass filter threshold of ~20 mg; sum changes over 1-min epochs (1440 samples/day).
- Test dataset (external validation): From a ketamine PET study of SSRI-resistant depression (MADRS ≥20, resistant to adequate SSRI for ≥4 weeks). Antidepressants discontinued; actigraphy recorded after washout (≥5× half-life) and prior to the first ketamine infusion. Device: Actiwatch 2 (activity integrated in 1-min epochs). N=23 initially; period cropped to the pre-ketamine interval; non-dominant wrist, continuous wear. Data exported via Actiware and imported to Matlab with a custom function to harmonize the structure with GENEActiv imports.
Quality control and inclusion criteria (applied identically to both datasets by a blinded rater): Visual inspection to identify missing data, artifacts, shift-work, or abnormal circadian patterns. Exclusion if: MADRS >40; recording <5 consecutive days; shift-work during recording; continuous data gap >2 h; or other exceptional events strongly impacting circadian activity. Outcome: Training dataset 12/12 passed; Test dataset exclusions: MADRS>40 (1), length<5d (7), shift-work (2), missing data (2); resulting in 12 train and 12 test recordings (each from unique patients).
Feature extraction: Recordings cropped from first to last midnight to yield an integer number of 24-h cycles. Emphasis on device-independent, magnitude-invariant features capturing regularity, fragmentation, and complexity of circadian activity patterns. Extracted features included:
- Circadian period: Lomb-Scargle periodogram (oversampling factor 10; minute-level resolution), chosen over Sokolove-Bushell to avoid bias toward <24 h estimates.
- Detrended fluctuation analysis (DFA) scaling exponents: computed on 1-min binned activity over box sizes from 4 min to 24 h (alpha full; also alpha short/long ranges per Hu et al.). Reflects intrinsic regulation and complexity of activity; sensitive to disease states.
- Intradaily variability (IV) at 5-, 30-, and 60-min bins: ratio of mean squared differences of consecutive intervals to variance around the global mean; higher with more frequent or larger transitions between rest and activity.
- Interdaily stability (IS) at 5-, 30-, and 60-min bins: coupling of activity patterns to external circadian entrainers; ratio of variance around the circadian profile to global variance; higher indicates more consistent daily patterns.
- Relative amplitude (RA): robustness of circadian rhythms; bounded 0–1 (higher indicates more robust rhythms with consolidated rest >5 h).
Also computed M10 and L5 magnitudes and locations (not emphasized due to device dependence of magnitude).
Model development and validation: Outcome variable: MADRS score. To limit overfitting, models included at most 6 predictors (≥2 subjects per predictor). Two approaches:
1) Brute-force multiple linear regression: Generated all combinations of 1–6 predictors across the feature space. Internal filtering criteria: variance inflation factor (VIF) <5 for all predictors; R-squared >0.5; RMSE <3 on the training set. Surviving models underwent external validation.
2) Forward stepwise multiple regression (semi-supervised ML): Trained models of increasing complexity using either F-statistic or AIC for predictor inclusion. Ran with the full feature space and then with manual restrictions (e.g., limiting inclusion of age or circadian period) before rerunning. Applied the same internal filtering as above.
External validation: Applied filtered models to the independent test dataset. Performance metrics: coefficient of determination (R-squared) and RMSE between predicted and observed MADRS. External filtering: significant correlation p<0.05 (Pearson r>0.576 for n=12) and RMSE <3.
Baselines: a dummy model (predicting the test set mean MADRS) and 1,000,000 simulated random prediction sets with integer values in the test range; the RMSE distribution was compared to model performance.
For model interpretability, computed predictor occurrence frequencies across validated models (normalized by model complexity) and standardized coefficients to assess leverage.
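The intradaily variability, interdaily stability, and relative amplitude features described under feature extraction can be sketched in a few lines. This is a minimal illustration using the commonly cited nonparametric definitions, not the authors' Matlab pipeline. The function names (`rebin`, `daily_profile`, and so on) are hypothetical. The sketch assumes a non-constant recording of 1-min epochs that has already been cropped to whole 24-h cycles.

```python
from statistics import fmean

MIN_PER_DAY = 1440  # 1-min epochs per 24-h cycle


def rebin(activity, bin_min):
    """Sum 1-min epochs into bins of `bin_min` minutes (trailing partial bin dropped)."""
    n_bins = len(activity) // bin_min
    return [sum(activity[i * bin_min:(i + 1) * bin_min]) for i in range(n_bins)]


def daily_profile(x, bins_per_day):
    """Mean value of each time-of-day bin across all whole days in x."""
    days = len(x) // bins_per_day
    return [fmean(x[d * bins_per_day + h] for d in range(days))
            for h in range(bins_per_day)]


def interdaily_stability(x, bins_per_day):
    """IS: variance of the average daily profile relative to the overall variance."""
    x = x[:(len(x) // bins_per_day) * bins_per_day]  # keep whole days only
    mean = fmean(x)
    profile = daily_profile(x, bins_per_day)
    num = len(x) * sum((h - mean) ** 2 for h in profile)
    den = bins_per_day * sum((v - mean) ** 2 for v in x)
    return num / den


def intradaily_variability(x):
    """IV: mean squared successive difference relative to the overall variance."""
    mean = fmean(x)
    n = len(x)
    num = n * sum((b - a) ** 2 for a, b in zip(x, x[1:]))
    den = (n - 1) * sum((v - mean) ** 2 for v in x)
    return num / den


def relative_amplitude(activity_1min):
    """RA = (M10 - L5) / (M10 + L5), computed on the average 24-h profile."""
    profile = daily_profile(activity_1min, MIN_PER_DAY)

    def window_mean(hours, pick):
        w = hours * 60
        wrapped = profile + profile[:w - 1]  # let windows wrap past midnight
        current = best = sum(wrapped[:w])
        for start in range(1, MIN_PER_DAY):
            current += wrapped[start + w - 1] - wrapped[start - 1]
            best = pick(best, current)
        return best / w

    m10 = window_mean(10, max)  # most active 10 consecutive hours
    l5 = window_mean(5, min)    # least active 5 consecutive hours
    return (m10 - l5) / (m10 + l5)
```

For the 30-min variants, one would call `rebin(activity, 30)` and pass the result to `intradaily_variability` or to `interdaily_stability(..., bins_per_day=48)`. All three quantities are ratios, so rescaling the activity counts leaves them unchanged, which is consistent with the emphasis on magnitude-invariant features.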
Key Findings
- Correlations in training data: Three features significantly correlated with MADRS: DFA scaling exponent over full range (alpha full; negative correlation), and intradaily variability at 5- and 30-min bins (IV5, IV30; positive correlations).
- Brute-force modeling: 14,892 models generated; 3,837 passed internal filtering (mean RMSE 1.84 ± 0.35; mean R² 0.67 ± 0.11). External validation reduced to 192 models (mean RMSE 2.70 ± 0.24; mean R² 0.59 ± 0.09). The average RMSE corresponded to a probability <0.001 under the random simulation distribution. Frequently included predictors across complexities included alpha full, IV5, and IV30.
- Forward stepwise modeling: 18 models generated; 14 passed internal validation; 5 passed external validation (r>0.576 and RMSE<3). Across stepwise models, alpha full appeared in all internally validated models and had the largest absolute standardized coefficient, especially among externally validated models. IS5/IS30 and RA were also commonly included among externally validated models.
- Individual-level accuracy (validated models): Predictions within 2 MADRS units in 54% of cases on average; within 3 units in 75%; within 4 units in 87%. Best models achieved >90% within 2 units.
- Feature interpretation: Higher depression severity associated with less complex activity dynamics (lower alpha full), more fragmented activity (higher IV), stronger coupling to external circadian entrainers (higher IS), and less robust circadian rhythms (lower RA). Magnitude-based features (M10/L5) and timing of circadian peak/trough were not significant predictors of severity.
- Selected models (examples): Stepwise model #1 (3 predictors) achieved train R²=0.705, RMSE=1.775; test R²=0.689, RMSE=2.402. Model #5 (3 predictors) achieved train R²=0.750, RMSE=1.583; test R²=0.491, RMSE=2.883. Bland-Altman analyses showed acceptable agreement between observed and predicted MADRS in both datasets.
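The model count, the baseline comparison, and the within-k-units accuracy reported above can be illustrated with a short sketch. It is an illustration under stated assumptions, not the authors' code, and all function names are hypothetical. The summary does not state the size of the feature space; 16 candidate predictors is an inference from the fact that the number of 1- to 6-predictor subsets of 16 features is exactly 14,892.

```python
import math
import random


def n_candidate_models(n_features, max_predictors=6):
    """Number of predictor subsets of size 1..max_predictors."""
    return sum(math.comb(n_features, k) for k in range(1, max_predictors + 1))


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted scores."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted))
                     / len(observed))


def dummy_rmse(observed):
    """RMSE of a dummy model that always predicts the mean observed score."""
    mean = sum(observed) / len(observed)
    return rmse(observed, [mean] * len(observed))


def random_baseline_p(observed, model_rmse, n_sim=10_000, seed=0):
    """Fraction of random integer prediction sets (drawn uniformly from the
    observed range) whose RMSE is at least as good as the model's RMSE."""
    rng = random.Random(seed)
    lo, hi = min(observed), max(observed)
    hits = 0
    for _ in range(n_sim):
        simulated = [rng.randint(lo, hi) for _ in observed]
        if rmse(observed, simulated) <= model_rmse:
            hits += 1
    return hits / n_sim


def fraction_within(observed, predicted, tol):
    """Share of cases predicted within `tol` units of the observed score."""
    return sum(abs(o - p) <= tol for o, p in zip(observed, predicted)) / len(observed)


assert n_candidate_models(16) == 14892  # matches the reported model count
```

The default of 10,000 simulations keeps the sketch fast; the study used 1,000,000 simulated prediction sets. `random_baseline_p` expects integer scores, as MADRS totals are, and the uniform draw over the observed range is an assumption about how "random prediction sets" were generated.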
Discussion
Findings demonstrate that specific sequence-dependent features of rest-activity patterns derived from actigraphy correlate with depression symptom severity independently of absolute activity levels. Models trained on one cohort generalized to an independent cohort, indicating robustness despite different devices and inclusion criteria. The most influential predictor, the DFA scaling exponent (alpha full), reflects reduced complexity of activity dynamics in more severe depression. Elevated interdaily stability suggests stronger synchronization to external zeitgebers in higher severity, consistent with decreased endogenous drive and reduced high-frequency variability. Lower relative amplitude aligns with less robust circadian rhythms and shorter, less consolidated active periods reported in MDD. Notably, magnitude and phase of peak/trough activity, while potentially diagnostic for MDD vs controls, did not track symptom severity within MDD. Clinically, actigraphy could complement assessments by providing objective markers related to symptom burden, with potential for monitoring, though accuracy must be sufficient to detect clinically meaningful changes (e.g., ≥2 MADRS units).
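To make the DFA scaling exponent more concrete, the following is a minimal sketch of first-order detrended fluctuation analysis. It assumes non-overlapping boxes and linear detrending, which the summary does not specify, and the function names and synthetic demo data are hypothetical. The study computed the exponent on 1-min activity counts over box sizes from 4 min to 24 h; the demo uses synthetic noise only to show that the estimator separates uncorrelated from strongly correlated signals.

```python
import math
import random


def _residual_ss(segment):
    """Sum of squared residuals after removing a least-squares line."""
    n = len(segment)
    t_mean = (n - 1) / 2
    y_mean = sum(segment) / n
    stt = sum((t - t_mean) ** 2 for t in range(n))
    sty = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(segment))
    slope = sty / stt
    return sum((y - y_mean - slope * (t - t_mean)) ** 2
               for t, y in enumerate(segment))


def dfa_alpha(x, box_sizes):
    """Slope of log F(n) versus log n, where F(n) is the RMS fluctuation of
    the integrated, mean-centred series after linear detrending in boxes of n."""
    mean = sum(x) / len(x)
    profile, total = [], 0.0
    for v in x:
        total += v - mean
        profile.append(total)

    log_n, log_f = [], []
    for n in box_sizes:
        n_boxes = len(profile) // n
        if n < 3 or n_boxes < 1:
            continue  # box too small for a line fit, or longer than the data
        ss = sum(_residual_ss(profile[b * n:(b + 1) * n]) for b in range(n_boxes))
        f = math.sqrt(ss / (n_boxes * n))
        if f > 0:
            log_n.append(math.log(n))
            log_f.append(math.log(f))
    if len(log_n) < 2:
        raise ValueError("need at least two usable box sizes")

    mx = sum(log_n) / len(log_n)
    my = sum(log_f) / len(log_f)
    return (sum((a - mx) * (b - my) for a, b in zip(log_n, log_f))
            / sum((a - mx) ** 2 for a in log_n))


# Demo on synthetic data (not patient recordings).
rng = random.Random(0)
noise = [rng.gauss(0, 1) for _ in range(4096)]  # uncorrelated signal
walk, position = [], 0.0
for step in noise:                              # strongly correlated signal
    position += step
    walk.append(position)
sizes = [4, 8, 16, 32, 64, 128, 256]
alpha_noise = dfa_alpha(noise, sizes)
alpha_walk = dfa_alpha(walk, sizes)
```

In theory, uncorrelated noise gives an exponent near 0.5 and a random walk gives one near 1.5, so `alpha_walk` should come out clearly larger than `alpha_noise`. For minute-level actigraphy one would pass box sizes between 4 and 1440 instead of the demo values.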
Conclusion
This proof-of-concept study shows that features capturing the complexity, fragmentation, and circadian organization of activity patterns correlate with MADRS-measured symptom severity in unmedicated MDD and can predict severity with moderate accuracy on an independent dataset. Key contributors include the DFA scaling exponent, interdaily stability, and relative amplitude, indicating that more severe depression is associated with less complex activity dynamics, stronger coupling to external entrainers, and less robust circadian rhythms. Actigraphy holds promise as an objective adjunct in the individual evaluation of depression. Future work should include larger, more diverse cohorts, broader severity ranges, harmonized multi-device pipelines, internal cross-validation with nested external validation, and longitudinal designs to assess responsiveness to treatment and generalizability across mood disorders and medicated states.
Limitations
- Population and scope: Only unipolar MDD during a pre-treatment stable state; results do not generalize to bipolar/cyclothymia or to dynamic changes during treatment. Effects in patients on antidepressants were not assessed.
- Sample size and range: Small datasets (12 train, 12 test after QC) with a relatively narrow MADRS range, increasing overfitting risk and limiting generalizability.
- Cohort differences: Training set (CBT trial) vs test set (SSRI-resistant candidates for ketamine) had different inclusion criteria, introducing selection bias; nevertheless, models transferred reasonably well.
- Device differences: Different actigraphy devices (GENEActiv vs Actiwatch 2) led to significant differences in magnitude-based features (M10, L5); analysis focused on device-independent features, but residual device effects may remain.
- Modeling constraints: Limited subjects per predictor (≤6 predictors) and heuristic internal filtering (VIF, R², RMSE thresholds). Some features (e.g., age, circadian period) were manually restricted in stepwise runs.
- Assumptions: Assumed stable symptom severity during recording; sequence-dependent features were used without classification of activity states, which may omit potentially informative summary statistics.