Cognitive behavioral therapy skills via a smartphone app for subthreshold depression among adults in the community: the RESILIENT randomized controlled trial

T. A. Furukawa, A. Tajika, et al.

This large randomized trial tested a smartphone app delivering five CBT skills (behavioral activation, cognitive restructuring, problem‑solving, assertion training, and insomnia therapy) in 3,936 adults with subthreshold depression, finding all skills and combinations outperformed control conditions with substantial symptom improvement and high adherence.

Introduction
Guidelines recommend psychotherapies, particularly cognitive behavioral therapy (CBT), for subthreshold to mild depression. Digital delivery formats are increasingly used because they scale well under resource constraints, and there is strong evidence that internet-based CBT (iCBT) is effective for subthreshold depression. However, CBT is commonly delivered as a package of cognitive and behavioral skills, and it is unclear whether all components contribute to mental health promotion. This study aimed to estimate the specific efficacies of five representative CBT skills delivered via a smartphone app to adults with subthreshold depression in the community: behavioral activation (BA), cognitive restructuring (CR), problem solving (PS), assertion training (AT), and behavior therapy for insomnia (BI). It used a master protocol with four embedded 2×2 factorial trials.
Literature Review
Prior component network meta-analyses of iCBT suggested BA as the most efficacious component, with supportive evidence for BI, PS, and AT, but inconclusive evidence for CR. Multiple meta-analyses support digital CBT efficacy for subthreshold depression and highlight the importance of appropriate control conditions, as different controls (e.g., waiting list, attention control, health information) can yield varying effect estimates. The Dodo bird verdict (equivalence of psychotherapies) has been debated, with recent component-focused analyses indicating differential effects across CBT elements.
Methodology
Design: Master protocol trial comprising four 2×2 factorial randomized controlled trials to assess component-specific effects of CBT skills delivered via a smartphone app. BA was included in all four factorial trials, both to optimize efficiency and because of prior evidence of its efficacy.
Participants and setting: Adults (≥18 years) across Japan, recruited via health insurance societies, companies, communities/local governments, and direct-to-consumer ads. Inclusion criteria: possession of a smartphone (iOS/Android), informed consent, completion of baseline questionnaires within 1 week, and PHQ-9 5–9 or 10–14 with suicidality item <2. Exclusion criteria: inability to read/write Japanese, or current treatment from mental health professionals at screening. Of 34,123 assessed, 5,364 consented and were randomized; 3,936 formed the intention-to-treat (ITT) cohort (baseline PHQ-9 ≥5). Participants were mainly in their 30s to 50s; 51% were men; most were married and employed; mean baseline PHQ-9 was 8.07 (SD 2.74).
Randomization and masking: App-embedded permuted block randomization stratified by employment status; allocation concealment was ensured by an independent statistician. Neither participants nor staff were blinded, owing to the nature of the interventions. Outcomes were self-rated in-app.
Interventions: Nine intervention arms (BA, CR, PS, AT, BI, BA+CR, BA+PS, BA+AT, BA+BI) delivered via the Resilience Training app v2.1 over seven chapters with weekly progression and skill worksheets. BA emphasized activity scheduling and gamified engagement; CR taught thought monitoring and generating alternative thoughts via mind maps; PS used structured problem-solving worksheets; AT trained assertive communication; BI applied sleep restriction and stimulus control with sleep diaries. Two-skill arms taught BA plus a second skill.
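The stratified permuted block randomization described above can be sketched as follows. This is a minimal illustration, not the app's actual implementation: the block size, the seeds, the two stratum labels, and the generic "control" cell are all assumptions, since the summary only states that allocation used permuted blocks stratified by employment status.

```python
import random

def permuted_block_allocator(arms, block_multiplier=1, seed=None):
    """Yield arm labels in randomly permuted blocks.

    Every block contains each arm exactly `block_multiplier` times,
    so arm sizes stay balanced after each completed block.
    """
    rng = random.Random(seed)
    while True:
        block = list(arms) * block_multiplier
        rng.shuffle(block)
        yield from block

# Hypothetical cells of one 2x2 factorial trial (BA x PS); labels are illustrative.
ARMS = ["control", "BA", "PS", "BA+PS"]
# Assumed stratum levels; the summary does not list the actual categories.
STRATA = ["employed", "not_employed"]

# One independent allocation sequence per stratum.
allocators = {
    stratum: permuted_block_allocator(ARMS, seed=index)
    for index, stratum in enumerate(STRATA)
}

def assign(stratum):
    """Return the next arm for a participant in the given stratum."""
    return next(allocators[stratum])

first_eight = [assign("employed") for _ in range(8)]
print(first_eight)
```

Keeping a separate sequence per stratum is what makes the arms balanced within each employment category, not only overall.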
Controls: Three controls differing in stringency: (1) Self-check: weekly PHQ-9 and encouragement emails (attention control without active CBT ingredients); (2) Health information: physical activity, nutrition, and oral health content with quizzes (placebo-like, no weekly emails); (3) Delayed treatment: waiting list for 6 weeks before optional re-randomization.
Outcomes: Primary: change in PHQ-9 from baseline to week 6. Secondary: PHQ-9 weekly changes (weeks 1–5), GAD-7 (weeks 3, 6), ISI (weeks 3, 6), SWEMWBS (weeks 3, 6), adherence metrics, and safety signals.
Sample size: Powered to detect SMD 0.2 at α=0.05, β=0.10 (90% power); required 3,168 participants (264 per arm) with PHQ-9 ≥5; allowing for 10% dropout, the target was 3,520. The achieved ITT cohort was n=3,936.
Statistical analysis: Mixed-model repeated measures (MMRM) with fixed effects for treatment, visit (categorical), and treatment×visit interaction, adjusted for baseline PHQ-9, employment status, age, and gender, with Kenward–Roger degrees of freedom and unstructured covariance. Effect sizes (SMD) used the observed baseline SD for within-group change and the week 6 SD for group differences. Primary comparisons used delayed treatment as the control; sensitivity analyses used health information and self-check. Interactions between components were assessed; for the BA+BI trial, shift workers were excluded in primary/sensitivity analyses. Combined analyses pooled all arms against all controls. Additional post hoc analyses extended the PHQ-9 analyses to week 26 and decomposed effects into nonspecific effects (ns), waiting list (wl), and specific skills.
Adherence and engagement: Time-on-task was tracked server-side; completion was defined as all seven chapters or at least up to chapter 4 (main content). Safety monitoring included automated flags for increased suicidality (PHQ-9 ≥10 and item 9 ≥2 twice). Participants could seek external care freely.
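The sample size figures above can be checked with a few lines of arithmetic. The sketch below uses the standard normal-approximation formula for a two-group comparison, which is an assumption: the summary does not say which formula or software the authors used. The reading that each main-effect comparison in a 2×2 factorial pools two cells per level is likewise an interpretation, not something the summary states.

```python
import math
from statistics import NormalDist

alpha, power, smd = 0.05, 0.90, 0.2  # values reported in the summary

# Normal-approximation sample size per group for a two-sided two-group test.
std_normal = NormalDist()
z_alpha = std_normal.inv_cdf(1 - alpha / 2)
z_beta = std_normal.inv_cdf(power)
n_per_group_approx = 2 * (z_alpha + z_beta) ** 2 / smd ** 2
print(f"approx. n per comparison group: {n_per_group_approx:.1f}")

# Reported design: 264 per arm across 12 arms (9 interventions + 3 controls).
per_arm = 264
n_arms = 9 + 3
required = per_arm * n_arms

# Inflate for 10% dropout, using integer percentages to avoid float rounding.
dropout_pct = 10
target = math.ceil(required * 100 / (100 - dropout_pct))

# Assumed interpretation: a factorial main effect compares two pooled cells per level.
pooled_per_level = 2 * per_arm

print(required, target, pooled_per_level)
```

Under these assumptions, two pooled cells of 264 give 528 participants per factor level, which is just above the approximate per-group requirement computed in the first step.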
Key Findings
Primary outcome (PHQ-9 change baseline→week 6, vs delayed treatment): All skills showed specific efficacy. SMDs: BA -0.38 (95% CI -0.48 to -0.27, P=5.3×10⁻¹³); CR -0.27 (-0.37 to -0.16, P=2.9×10⁻⁷); PS -0.27 (-0.37 to -0.17, P=1.8×10⁻⁸); AT -0.24 (-0.34 to -0.14, P=2.2×10⁻⁷); BI -0.27 (-0.37 to -0.16, P=3.8×10⁻⁷). Significant antagonistic interactions were observed in all factorial trials: estimated single-skill SMDs ranged from -0.52 (PS) to -0.65 (BA), and two-skill SMDs ranged from -0.57 (BA+BI) to -0.67 (BA+PS). Sensitivity analyses against the more stringent controls remained significant though attenuated: e.g., vs self-check, PS SMD -0.16 (-0.30 to -0.02); BA+PS -0.31 (-0.45 to -0.17).
Combined week 6 analysis (pooled vs controls): All intervention arms had 95% CIs below null against all controls. Example SMDs, listed as vs delayed treatment; vs health information; vs self-check:
BA: -0.63; -0.48; -0.29
CR: -0.52; -0.37; -0.18
PS: -0.50; -0.35; -0.16
AT: -0.53; -0.37; -0.19
BI: -0.61; -0.46; -0.28
Two-skill arms showed effects similar to or larger than single-skill arms.
Secondary outcomes (week 6): Anxiety (GAD-7): all interventions were superior to delayed treatment and self-check; BA and BI were also superior to health information, with SMDs -0.24 (BA, 95% CI -0.37 to -0.10) and -0.21 (BI, -0.34 to -0.08). Insomnia (ISI): all interventions were superior to delayed treatment and self-check; BI and BA+BI were superior to health information (SMDs -0.33 and -0.27). Wellbeing (SWEMWBS): most interventions were better than delayed treatment; only BA, BI, and BA+AT were better than self-check (SMDs 0.15, 0.17, 0.13 respectively); none was superior to health information.
Adherence and follow-up: Week 6 follow-up completion was 97.2% (3,824/3,936). Acute phase adherence: 63% completed all chapters and 78% completed at least up to chapter 4 across interventions (by week 10). BI had the highest adherence (84% completed all chapters by week 10; 92% by week 30). Engagement sessions were brief (2–10 minutes), with more time spent on the introduction/chapter 1 and when a second skill was added.
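As a reading aid for the primary-outcome estimates above, the sketch below recovers an approximate standard error and z-score from each reported 95% CI. It assumes a symmetric normal-based interval, which is a simplification: the trial's MMRM used Kenward–Roger degrees of freedom, and the published bounds are rounded to two decimals, so these values are rough and will not reproduce the reported P values exactly.

```python
from statistics import NormalDist

# (SMD, lower 95% bound, upper 95% bound) vs delayed treatment, as reported above.
reported = {
    "BA": (-0.38, -0.48, -0.27),
    "CR": (-0.27, -0.37, -0.16),
    "PS": (-0.27, -0.37, -0.17),
    "AT": (-0.24, -0.34, -0.14),
    "BI": (-0.27, -0.37, -0.16),
}

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96

def se_from_ci(lower, upper):
    """Approximate standard error implied by a symmetric 95% CI."""
    return (upper - lower) / (2 * Z95)

summary = {}
for skill, (estimate, lower, upper) in reported.items():
    se = se_from_ci(lower, upper)
    summary[skill] = {
        "se": se,
        "z": estimate / se,
        "excludes_zero": upper < 0 or lower > 0,
    }

for skill, row in summary.items():
    print(f"{skill}: SE~{row['se']:.3f}, z~{row['z']:.2f}, CI excludes 0: {row['excludes_zero']}")
```

The similar interval widths across skills are what one would expect from arms of roughly equal size.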
Safety: 225 automated warnings for increased depression/suicidality; 53 repeat warnings in 39 users (1%). No serious adverse events; adverse events too infrequent for between-arm comparison.
Post hoc (week 26): All interventions maintained superiority vs controls; two-skill generally superior to single-skill (between-group SMD -0.08, 95% CI -0.16 to -0.00, P=0.049). Health information and self-check no longer differentiated as controls. Component analysis at week 6 indicated beneficial nonspecific effects (ns SMD -0.18) and harmful waiting list effect (wl SMD +0.17); by week 26 nonspecific effects diminished (SMD 0.08) and specific component effects increased (e.g., BI week 26 SMD -0.34).
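One way to see how the week 6 nonspecific (ns) and waiting list (wl) components relate to the choice of control is a simple additive check using only figures reported in Key Findings. The model here is an assumption made for illustration, not the authors' stated component model: it supposes that an active arm and the self-check control both carry the ns effect, that the delayed-treatment control carries the wl effect, and that components add linearly.

```python
# Week 6 component estimates reported above.
ns = -0.18  # nonspecific effect (beneficial)
wl = 0.17   # waiting list effect (harmful)

# Combined-analysis SMDs reported above for the single-skill arms.
vs_self_check = {"BA": -0.29, "CR": -0.18, "PS": -0.16, "AT": -0.19, "BI": -0.28}
vs_delayed = {"BA": -0.63, "CR": -0.52, "PS": -0.50, "AT": -0.53, "BI": -0.61}

# Assumed additive model:
#   arm vs self-check = skill                (ns cancels out)
#   arm vs delayed    = skill + ns - wl
predicted_vs_delayed = {
    skill: round(smd + ns - wl, 2) for skill, smd in vs_self_check.items()
}
residual = {
    skill: round(vs_delayed[skill] - predicted_vs_delayed[skill], 2)
    for skill in vs_delayed
}

for skill in vs_delayed:
    print(skill, predicted_vs_delayed[skill], vs_delayed[skill], residual[skill])
```

Under this assumed model the predicted and reported values differ by at most about 0.02, which is within the rounding of the published figures.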
Discussion
This trial directly demonstrates that individual CBT skills delivered via a smartphone app produce differential benefits for subthreshold depression and related symptoms. BA showed strong short-term efficacy and high adherence; CR, previously less supported in meta-analyses, exhibited clear specific efficacy in this large, well-controlled randomized trial. Antagonistic interactions at week 6 reflect shared nonspecific treatment effects (e.g., attention, self-monitoring, encouragement emails), limiting additive gains when combining skills in the acute phase. By week 26, nonspecific effects waned and learning multiple skills yielded incremental benefits, indicating durable skill acquisition. The study clarifies the impact of control conditions: waiting list behaves as a nocebo, inflating effect estimates relative to more active controls (health information, self-check), with the strongest control varying by outcome domain (self-check for PHQ-9; health information for GAD-7/ISI). Findings challenge the Dodo bird verdict in iCBT by demonstrating component-specific and outcome-specific efficacies. These results support efficient, scalable psychotherapy design, prioritizing skills matched to participant needs (e.g., BI for insomnia and anxiety; BA for depression and wellbeing) and mindful combination to manage interactions.
Conclusion
Smartphone-delivered CBT skills (BA, CR, PS, AT, BI) are differentially efficacious for reducing depressive symptoms in adults with subthreshold depression, with benefits sustained up to 26 weeks. The component-specific evidence enables optimized, scalable packaging and personalization of psychotherapies, with attention to short-term interactions and long-term additive effects. Broad implementation of such app-based interventions can promote mental health at population scale. Future research should evaluate additional components (e.g., mindfulness, acceptance), test combinations for additivity, extend follow-up to assess enduring effects, and examine generalizability across populations, delivery formats, and clinical severity.
Limitations
Participants and staff were not blinded, potentially introducing performance and assessment bias; all outcomes were self-reported, although self-ratings may yield conservative estimates compared to blinded observer ratings. The primary focus was acute effects with follow-up to 26 weeks; longer-term durability remains to be established. The component selection excluded relaxation, mindfulness, and acceptance due to theoretical and empirical considerations; their roles in combination require study. Interactions between skills were antagonistic at week 6 and largely diminished by week 26, but additivity across broader combinations remains an empirical question. Generalizability may be limited: interventions were delivered via a specific app to Japanese adults with subthreshold depression; results may not extrapolate to other apps, therapist-delivered CBT, school or elderly settings, populations outside Japan, or major depressive disorder.