Evaluating a brief smartphone-based stress management intervention with heart rate biofeedback from built-in sensors in a three arm randomized controlled trial

L. M. Fuhrmann, C. A. Lukas, et al.

An 18-day smartphone stress-management program with integrated heart-rate biofeedback reduced perceived stress versus waitlist in a three-arm randomized trial. The study, conducted by Lukas M. Fuhrmann, Christian Aljoscha Lukas, Lena Schindler-Gmelch, and Matthias Berking, also showed gains in emotion regulation and well-being with high usability and effects persisting at 1-month follow-up.

Introduction
Stress is widespread in industrialized societies and is linked to mental health problems (e.g., depression, anxiety, substance use, sleep problems) and physical health problems (e.g., cardiovascular disease, obesity), imposing substantial societal costs. The transactional model of stress and coping emphasizes the role of cognitive appraisal and coping resources, informing interventions that target reappraisal, relaxation, breathing, mindfulness, and interoceptive awareness to improve emotion regulation and well-being. While face-to-face stress management shows moderate-to-large effects, barriers to access motivate scalable smartphone-based interventions. However, meta-analytic evidence indicates small effects for app-based stress interventions, likely due to low engagement and limited use of built-in sensing and machine learning (ML).
Heart rate (HR) is a relevant, actionable physiological marker for biofeedback that can increase interoceptive awareness and support relaxation. Prior work with external wearables shows promise, but external sensors reduce usability and adherence. Built-in smartphone sensors could overcome these barriers. Evidence for built-in sensing in stress apps is scarce and mixed; camera-based photoplethysmography (PPG) is affected by light, skin tone, and motion. Accelerometer-derived ballistocardiography may provide a feasible way to derive HR on-device, yet had not been tested in stress biofeedback interventions.
Research question and hypotheses: The study evaluated an 18-day CBT-based smartphone intervention (MT-StressLess) with accelerometer-derived HR biofeedback versus a waitlist control (WLC). Primary hypothesis: the biofeedback-enhanced app would reduce perceived stress more than WLC over time. Secondary hypotheses: both active conditions would improve emotion regulation, depressive symptoms, and well-being versus WLC. Exploratory aims: test whether MT-StressLess without biofeedback outperforms WLC and whether biofeedback adds benefits beyond the non-biofeedback version.
Literature Review
Meta-analyses indicate that face-to-face stress management yields moderate-to-large stress reduction effects among employees and college students. App-based stress interventions offer accessibility but show small effects (g≈0.29), potentially due to adherence and engagement challenges and minimal integration of sensing and machine learning. Biofeedback integrated into digital interventions has shown benefits in adjacent areas using external devices (e.g., heart rate variability (HRV) feedback plus breathing for depression; wearable biofeedback reducing anxiety and depression in students; continuous HR feedback in veterans). Built-in smartphone sensing for stress has been minimally evaluated: one RCT measuring HRV via phone camera before mindfulness showed no significant effects, possibly due to small sample size and measurement limitations (ambient light, skin tone, motion). Accelerometers may provide HR indicators via ballistocardiography and could deliver user-friendly biofeedback without external hardware, but had not been studied for stress biofeedback before this trial.
Methodology
Design: Three-arm randomized controlled trial (CONSORT-EHEALTH compliant), with participants randomized to (1) MT-StressLess with HR-based biofeedback, (2) MT-StressLess without biofeedback, or (3) waitlist control (WLC). Assessments took place at baseline (t0), postintervention (+18 days; t1), and 1-month follow-up (t2). Randomization used a block size of 3 and was carried out by independent staff. Data collection occurred August–December 2017. Ethics approval: German Psychological Society (DGPS; MB 092017_amd_072016). Retrospective trial registration: DRKS00013073 (21/02/2018), before analysis.
Pre-feasibility testing: n=17 healthy individuals. An HR-focused pre-assessment (n=9) validated accelerometer-based HR against ECG and determined stress/relaxation cutoffs. An app-focused pre-assessment (n=8) optimized software/app components, including the approach-avoidance modification training (AAMT) modes.
Sample size: A priori power analysis estimated n=53 per arm for a medium effect (d=0.50) in ANCOVA. Post hoc power for the final LMM sample (N=159) indicated a detectable f²=0.062 (~d=0.21) at 80% power.
Participants: Recruited via social media, flyers, and boards. Inclusion criteria: age ≥18, Android ≥4.4, German proficiency. Screening assessed perceived stress, demographics, health, cardiac issues, psychotherapy/psychiatry, physical activity, prior relaxation experience, and smartphone use. Informed consent was obtained in writing.
Flow: 295 screened; 129 excluded (125 consent not returned; 2 refused; 1 no Android; 1 underage). N=166 randomized (Biofeedback n=56; MT-StressLess n=56; WLC n=54).
Response rates: Baseline: Biofeedback 54/56, MT-StressLess 53/56, WLC 48/54. Post: Biofeedback 41, MT-StressLess 42, WLC 47. Follow-up: Biofeedback 40, MT-StressLess 43, WLC 45.
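The design reports randomization with a block size of 3 across three arms. A minimal Python sketch of permuted-block allocation under that reading follows. The arm labels, the `block_randomize` name, and the seed are illustrative assumptions; the source does not describe further details such as stratification, so this should not be read as the trial's actual allocation procedure.

```python
import random
from collections import Counter

# Arm labels are hypothetical identifiers for the three conditions named in the text.
ARMS = ("biofeedback", "mt_stressless", "waitlist")


def block_randomize(n_participants, arms=ARMS, seed=None):
    """Allocate participants in permuted blocks that contain each arm exactly once."""
    rng = random.Random(seed)
    allocation = []
    while len(allocation) < n_participants:
        block = list(arms)
        rng.shuffle(block)  # random order of the three arms within this block
        allocation.extend(block)
    # A trailing partial block is truncated to the requested number of participants.
    return allocation[:n_participants]


if __name__ == "__main__":
    schedule = block_randomize(166, seed=2017)  # seed chosen arbitrarily
    print(schedule[:6])
    print(Counter(schedule))
```

Each complete block of three assigns one participant to every arm, which keeps the arms close in size throughout enrolment.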
Conditions:
- MT-StressLess: 14-module, fully automated CBT-based program grounded in the transactional model of stress, combining psychoeducation (delivered via fictional chat with an E-coach and quizzes), approach-avoidance modification training (AAMT) using multiple interactive modes (swipe, draw, plus-minus, select, command, emotion recognition via the SHORE algorithm), and daily-life skills practice tasks (text/audio exercises, reflections, lists, photos, behavioral tasks).
- MT-StressLess with HR-based biofeedback: Identical content plus an HR biofeedback relaxation exercise before each AAMT task. Users selected a current stressor, placed the phone over the heart, and followed audio prompts through baseline (10 s), stress induction imagery (40 s), and guided relaxation. Accelerometer-based ballistocardiography computed HR in near real time (filters + FFT to identify the 45–150 bpm range). Success thresholds: HR decrease ≥10 bpm from stress peak or ≥2 bpm below baseline maximum; if not reached, additional 20 s relaxation blocks followed (up to two extensions). Upon reaching the threshold or completing the exercise, users received graphical HR feedback and motivational reinforcement.
- WLC: No intervention during the study; access to MT-StressLess (no biofeedback) after follow-up.
User flow/adherence rules: One module per weekday (14 modules over 18 days total with weekend pauses). Gating required completion of the psychoeducation quiz, ≥1 AAMT task, and ≥3 daily-life practices to unlock the next module. Additional on-demand resources and dashboard feedback were provided.
Measures:
- Primary: Perceived Stress Scale (PSS-10; German).
- Secondary: Emotion Regulation Skills Questionnaire (ERSQ-27), Patient Health Questionnaire (PHQ-9), WHO-5 Well-Being Index.
- Usability: System Usability Scale (SUS; German) plus custom items on comprehensibility, appeal, and goal achievement for components (psychoeducation, quizzes, AAMT, daily tasks, biofeedback).
- Usage: Automatically logged: time in app, usage days, modules completed, quizzes solved, AAMT tasks solved, daily-life tasks solved; in the biofeedback arm, number of relaxation exercises, completion, and success (meeting HR decrease criteria).
Statistical analysis: ITT including all randomized participants with baseline values. Group differences at baseline were tested with χ²/ANOVA/nonparametric tests. Longitudinal effects were analyzed using linear mixed-effects models (lme4 in R 4.4.2) with fixed effects of Condition, Time (baseline/post/follow-up), and their interaction; random intercepts for participant; REML estimation; Satterthwaite df; Type-III ANOVA for fixed effects. Between-condition Cohen's d was computed as estimated mean differences at post/follow-up divided by pooled baseline SDs. Sensitivity analyses: per-protocol (engaged users) and covariate-adjusted models (sex, age). Usage–outcome associations: Spearman correlations between usage metrics and baseline-adjusted post PSS; separate linear regressions per active arm predicting post PSS from usage metrics; Benjamini–Hochberg correction applied to multiple comparisons.
Missing data: <25%; MCAR test non-significant (χ²=34.53, df=30, p=0.260).
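The biofeedback exercise described under Conditions (spectral HR estimation restricted to 45–150 bpm, then a success rule with up to two extension blocks) can be sketched roughly as below. This is an illustration under stated assumptions, not the app's implementation: the 50 Hz sampling rate, the function names, and the synthetic test signal are hypothetical, the filtering step is reduced to mean removal, and a direct Fourier sum on a 1-bpm grid stands in for the FFT mentioned in the text.

```python
import math


def estimate_hr_bpm(signal, fs_hz, lo_bpm=45, hi_bpm=150):
    """Return the integer bpm in [lo_bpm, hi_bpm] with the largest spectral magnitude.

    Evaluates a direct Fourier sum at each candidate rate (a slow stand-in for an FFT).
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the constant offset (e.g. gravity)
    best_bpm, best_mag = lo_bpm, -1.0
    for bpm in range(lo_bpm, hi_bpm + 1):
        omega = 2.0 * math.pi * (bpm / 60.0) / fs_hz  # radians per sample
        re = sum(x * math.cos(omega * i) for i, x in enumerate(centered))
        im = sum(x * math.sin(omega * i) for i, x in enumerate(centered))
        mag = math.hypot(re, im)
        if mag > best_mag:
            best_bpm, best_mag = bpm, mag
    return best_bpm


def relaxation_reached(current_hr, stress_peak_hr, baseline_max_hr):
    """Success rule as read from the text: HR at least 10 bpm below the stress
    peak, OR at least 2 bpm below the baseline maximum."""
    return (stress_peak_hr - current_hr >= 10) or (baseline_max_hr - current_hr >= 2)


def run_relaxation(block_hrs, stress_peak_hr, baseline_max_hr, max_extensions=2):
    """block_hrs holds the HR after the initial relaxation block and after each
    20 s extension. Returns (success, number_of_blocks_used)."""
    allowed = block_hrs[: 1 + max_extensions]  # initial block plus at most two extensions
    for used, hr in enumerate(allowed, start=1):
        if relaxation_reached(hr, stress_peak_hr, baseline_max_hr):
            return True, used
    return False, len(allowed)


if __name__ == "__main__":
    fs = 50  # assumed sampling rate in Hz
    # Synthetic 10 s trace: 72 bpm cardiac component, slow breathing component, gravity offset.
    trace = [
        math.sin(2 * math.pi * 1.2 * i / fs)
        + 0.3 * math.sin(2 * math.pi * 0.2 * i / fs)
        + 9.81
        for i in range(10 * fs)
    ]
    print(estimate_hr_bpm(trace, fs))
    print(run_relaxation([78, 77, 73], stress_peak_hr=80, baseline_max_hr=76))
```

Restricting the search to the 45–150 bpm band keeps slow components such as breathing from being picked as the heart rate. Real accelerometer traces are far noisier than this synthetic signal and would need the filtering the source mentions.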
Key Findings
- Primary outcome (Perceived Stress; PSS-10): Significant Time×Condition interaction (F(4,255.43)=3.25, p=0.013).
  • Postintervention: MT-StressLess + biofeedback reduced PSS significantly more than WLC (t(257.52)=-3.27, p=0.001, d=0.41, 95% CI [0.03, 0.79]); MT-StressLess (no biofeedback) did not differ from WLC (t(255.60)=-1.41, p=0.161, d=0.14). No significant difference between active conditions (t(263.43)=-1.84, p=0.068, d=0.29).
  • 1-month follow-up: MT-StressLess + biofeedback remained superior to WLC (t(258.13)=-2.77, p=0.006, d=0.55), while MT-StressLess vs WLC showed a small, non-significant difference (t(255.77)=-1.78, p=0.076, d=0.44). Active vs active remained non-significant (t(263.45)=-1.00, p=0.317, d=0.15).
  • Estimated means (PSS-10), in the order Biofeedback, MT-StressLess, WLC: Baseline 21.26, 21.15, 20.23; Post 17.44, 19.38, 19.99; Follow-up 16.22, 17.23, 18.26.
- Secondary outcomes:
  • Emotion Regulation (ERSQ-27): Significant Time×Condition (F(4,259.96)=4.95, p<0.001). Both active conditions improved vs WLC at post (biofeedback d=-0.58, p<0.001; MT-StressLess d=-0.59, p=0.006) and follow-up (biofeedback d=-0.47, p<0.001; MT-StressLess d=-0.46, p=0.031). No differences between active conditions (p>0.193).
  • Well-being (WHO-5): Significant Time×Condition (F(4,259.96)=3.10, p=0.016). Both active conditions improved vs WLC at post (biofeedback d=-0.25, p=0.006; MT-StressLess d=-0.27, p=0.043) and follow-up (biofeedback d=-0.26, p=0.005; MT-StressLess d=-0.37, p=0.012). No differences between active conditions (p≥0.430).
  • Depressive symptoms (PHQ-9): Time×Condition not significant (F(4,257.96)=1.56, p=0.185).
- Usability and adherence:
  • SUS overall M=83.51 (SD=11.95), good-to-excellent. SUS was higher in the biofeedback arm (M=87.38, SD=8.94) than in the non-biofeedback arm (M=79.29, SD=13.55), W=1246, p<0.001.
  • App engagement: 9.82% did not download the app; 4.46% did not engage with a single competence. Mean completed competencies M=7.45/14 (53.21%); 26.04% completed all 14; 53.12% completed ≥7. No significant between-arm differences in usage metrics.
  • In the biofeedback arm, 73% used biofeedback ≥1 time; among users, a mean of 8.78 exercises were started; 87.66% of 316 starts were completed; 67.87% of 277 completed exercises achieved the predefined relaxation threshold.
  • Component ratings: high comprehensibility (4.28–4.85/5); lower appeal and goal achievement for quizzes, AAMT, and the biofeedback exercise.
- Usage–outcome relationships: Higher engagement was associated with lower post PSS (baseline-adjusted).
  • Correlations: MT-StressLess: minutes (rs=-0.41, p=0.025), AAMT tasks (rs=-0.36, p=0.038); Biofeedback: trend for successful biofeedback tasks (rs=-0.35, p=0.073).
  • Regressions: Minutes predicted lower post PSS (MT-StressLess β=-0.01, p<0.001, R²=14.97%; Biofeedback β=-0.02, p<0.001, R²=9.92%). AAMT tasks predicted lower post PSS (MT-StressLess β=-0.24, p=0.003, R²=8.42%; Biofeedback β=-0.25, p=0.003, R²=7.15%). In the biofeedback arm, successfully achieving relaxation predicted lower post PSS (β=-0.50, p<0.001, R²=15.95%).
- Attrition: Postintervention loss 21.69% (36/166), follow-up loss 22.89% (38/166); no significant differences in dropout across conditions at post (χ²(2)=3.64, p=0.162) or follow-up (χ²(2)=2.21, p=0.330). Sensitivity analyses (per-protocol; adjusted for sex, age) were consistent with the main results.
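The dropout comparison across arms can be recomputed from the randomized and responding counts listed in the Methodology. The sketch below does this with a Pearson chi-square test on a 3×2 table (completed vs. lost, per arm). The function and variable names are illustrative assumptions, and the p-value helper applies only when the test has exactly 2 degrees of freedom, as it does here.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square for an r x c table of counts; returns (statistic, df)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def p_value_df2(stat):
    """Upper-tail p-value of a chi-square statistic with exactly 2 degrees of freedom."""
    return math.exp(-stat / 2.0)


# Counts taken from the participant flow and response rates reported in the text.
RANDOMIZED = {"biofeedback": 56, "mt_stressless": 56, "waitlist": 54}
COMPLETED = {
    "post": {"biofeedback": 41, "mt_stressless": 42, "waitlist": 47},
    "follow_up": {"biofeedback": 40, "mt_stressless": 43, "waitlist": 45},
}


def dropout_test(wave):
    """Build the arm x (completed, lost) table for one assessment wave and test it."""
    table = [
        [COMPLETED[wave][arm], RANDOMIZED[arm] - COMPLETED[wave][arm]]
        for arm in RANDOMIZED
    ]
    stat, df = chi_square_independence(table)
    return stat, df, p_value_df2(stat)


if __name__ == "__main__":
    for wave in ("post", "follow_up"):
        stat, df, p = dropout_test(wave)
        print(f"{wave}: chi2({df}) = {stat:.2f}, p = {p:.3f}")
```

Run on these counts, the sketch yields statistics in line with the non-significant dropout differences reported above.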
Discussion
The study demonstrates that a brief 18-day smartphone-based stress management program augmented with accelerometer-derived HR biofeedback can reduce perceived stress more than no intervention, with effects maintained at one month. Improvements in emotion regulation and well-being across both active conditions suggest the CBT-based content and skills training are beneficial. The biofeedback component may enhance interoceptive awareness and facilitate autonomic downregulation via relaxation, aiding self-regulation of stress. Nevertheless, the lack of significant differences between the two active arms and the non-significant advantage of the non-biofeedback app over WLC on perceived stress indicate modest core intervention effects and uncertain additive benefits of biofeedback. Engagement appeared important: greater usage predicted lower stress, and successful biofeedback sessions were associated with better outcomes, supporting dose–response dynamics. Non-specific factors (e.g., expectancy, novelty of biofeedback) may have contributed to perceived benefits. Measurement choices (static HR thresholds) and brief intervention duration may have limited effects. Future optimization could include adaptive, individualized thresholds, enhanced motivation and reinforcement strategies to boost adherence, integration of continuous HRV for resilience assessment, and personalization via AI and real-time sensing to tailor content and timing. Overall, the findings support the feasibility and potential of built-in-sensor biofeedback to augment mobile stress interventions but highlight the need for refinement and more rigorous component testing.
Conclusion
This three-arm RCT provides preliminary evidence that a brief smartphone-based stress management intervention incorporating accelerometer-derived HR biofeedback reduces perceived stress compared to a waitlist and improves emotion regulation and well-being. Usability was high, though adherence was moderate, and engagement predicted outcomes. The added value of biofeedback over the core program remains uncertain, warranting cautious interpretation. Future research should: (a) compare against established active controls and relaxation-only comparators; (b) employ adaptive, individualized biofeedback thresholds and integrate continuous HRV; (c) leverage AI-driven personalization and real-time physiological tracking to enhance engagement and efficacy; (d) include larger, more diverse samples and longer follow-ups; and (e) examine mechanisms via mediation/moderation and objective physiological/behavioral endpoints.
Limitations
- Comparator limitations: No state-of-the-art active control; absence of a relaxation-only condition without HR feedback; no standalone HR biofeedback arm to isolate specific effects.
- Sample and power: Sample size may be underpowered for subtle effects and for mediation/moderation analyses; self-selected, predominantly younger female Android users limit generalizability.
- Measurement limitations: Reliance on self-report outcomes susceptible to expectancy/placebo and social desirability; short 1-month follow-up limits inference on durability; static, non-individualized HR cutoffs may reduce sensitivity to individual variability; no continuous HRV assessment.
- Engagement/adherence: Moderate adherence and component appeal (lower for AAMT, quizzes, biofeedback relaxation) could constrain efficacy and retention.
- Technology constraints: Emotion recognition affected by lighting/occlusion; accelerometer HR extraction optimized for rest, not validated for all contexts.