Education
Exam Eustress: Designing Brief Online Interventions for Helping Students Identify Positive Aspects of Stress
M. Reza, T. Li, et al.
The paper addresses how to design brief, voluntary, and scalable online interventions that help students reappraise exam-related stress as eustress that can aid performance. While stress is prevalent in higher education and often interpreted negatively, prior evidence suggests that reframing arousal can improve performance in acute-stress contexts such as exams. The research question focuses on which online design components (amount of elaboration, modality, and reflection prompts) effectively reinforce a core stress-reappraisal message and translate into improved exam outcomes in real-world course settings. The study’s purpose is to define and evaluate a practical design space for a single-page intervention (<5 minutes) that can be delivered at scale, with minimal instructor burden, to help students internalize a eustress mindset and apply it during exams.
The related work spans four threads:
- Psychological research showing that stress reappraisal can mitigate worry and enhance performance in acute-stress and exam contexts; modalities such as videos, emails, and text have shown benefits, but direct comparisons and brief, scalable combinations remain underexplored.
- HCI work on technology-mediated mindset and behavior change at scale, including chat tools, conversational coaches, reflection for physical activity, CBT-based apps, and just-in-time (JIT) interventions. This paper contributes by comparing modalities for communicating stress reappraisal in large classes and by combining qualitative and quantitative evaluations.
- A shift from distress-reduction approaches to eustress-focused reappraisal, motivated by evidence that reappraisal outperforms suppression and acceptance for moderating arousal and subjective stress, and can reduce negative emotional experiences.
- Design-factor grounding in multimodal presentation, marketing communication, instructional and multimedia learning, and reflective learning, motivating varied layouts (paragraph vs. bullets), modalities (text vs. video), validation (citation), and reflection prompts to support retention and behavior change.
The authors defined a design space with a core message (D0: “Being stressed during an exam can actually help you do better”) and six optional factors: D1 explanatory elaboration (research-based rationale); D2 explicit suggestions with numbered steps; D3 instructional talking-head video delivering D2; D4 explicit citation and link to an empirical paper; D5 self-explanation prompt (voice or typing); D6 note-to-self prompt to use before the exam.

Study 1: Semi-structured user interviews. Participants: N=20 students from a large introductory programming course (diverse years and disciplines; 14 women, 6 men). Procedure: remote interviews (45–60 minutes, $15/hr). Participants discussed past exam stress, then reviewed each component D0–D6 first individually (order of D2 text vs. D3 video counterbalanced) and then all together on one page. Think-aloud was used; a silent notetaker recorded observations. Analysis: reflexive thematic analysis (inductive), with stress mindset/reappraisal as a pre-existing code. Outputs: six qualitative findings (F1–F6).

Study 2: Randomized field experiment. Context: Intro to Programming course (Spring 2020). Deployment: optional intervention embedded in a graded online activity distributed ~10 days before an exam; activity completion earned 2% course credit. Randomization: students assigned to control (no intervention) or treatment (D0 plus a random combination of D1–D6) at a 2:1 treatment:control ratio, with automatic dual-anonymous randomization. Participants: 1,283 enrolled; data cleaning yielded N=664 for analysis (1,284 delivered; 1,014 clicked; 931 completed; 92 dropped or did not finish the midterm; 175 completed after the midterm; 664 final). Measures: midterm exam score (dependent variable).
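The assignment scheme above (2:1 treatment:control, with treatment receiving D0 plus a random combination of the optional factors) can be sketched as follows. This is a minimal sketch, not the authors' implementation: the summary does not specify how factor subsets were sampled, so independent inclusion of each factor is an assumption, and the function name `assign_condition` is hypothetical.

```python
import random

DESIGN_FACTORS = ["D1", "D2", "D3", "D4", "D5", "D6"]

def assign_condition(rng: random.Random) -> dict:
    """Assign one student to control or treatment at a 2:1
    treatment:control ratio. Treatment always includes the core
    message D0 plus a random subset of the optional factors D1-D6
    (independent inclusion is an assumption, not the paper's scheme)."""
    if rng.random() < 2 / 3:  # 2:1 treatment:control split
        factors = [f for f in DESIGN_FACTORS if rng.random() < 0.5]
        return {"condition": "treatment", "components": ["D0"] + factors}
    return {"condition": "control", "components": []}

# Simulate assignment for a class the size of the enrolled cohort.
rng = random.Random(42)
assignments = [assign_condition(rng) for _ in range(1283)]
```

Note that every treatment cell contains D0, matching the paper's design in which the core message is never optional; only D1–D6 vary.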
Analysis: One-sided independent-samples t-tests comparing control vs treatment; linear regressions to assess add-on effects of D1–D6 within treatment; bootstrap validation of randomization balance; subgroup analyses by year (first-year vs upper-year) and gender identity; ANOVA and regression to test number of design factors. Handling of implementation artifacts (e.g., participants completing on multiple devices) was documented and sensitivity analyses yielded similar results.
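The headline comparison (one-sided independent-samples t-test plus Cohen's d) can be sketched in stdlib Python. This is a simplified sketch under stated assumptions: it uses the Welch t statistic with a large-sample normal approximation for the one-sided p-value rather than whatever exact test the authors ran, and both function names are hypothetical.

```python
import math
import statistics

def cohens_d(a: list, b: list) -> float:
    """Pooled-SD Cohen's d for two independent samples."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled_sd = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd

def one_sided_t(a: list, b: list) -> tuple:
    """Welch t statistic and a large-sample one-sided p-value
    (normal approximation) for H1: mean(a) > mean(b)."""
    na, nb = len(a), len(b)
    se = math.sqrt(statistics.variance(a) / na + statistics.variance(b) / nb)
    t = (statistics.mean(a) - statistics.mean(b)) / se
    p = 0.5 * math.erfc(t / math.sqrt(2))  # upper tail of standard normal
    return t, p
```

With the study's sample sizes (hundreds per arm), the normal approximation to the t distribution is close; a production analysis would use the exact t distribution.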
Qualitative (Study 1, F1–F6):
- F1: D0 alone (brief message) is not convincing and can be misinterpreted; users desire explanatory context and timing clarity.
- F2: Research-based framing increases perceived credibility and reassurance.
- F3: Learners value citations for credibility but are unlikely to click research links; prefer accessible summaries.
- F4: Numbered, listed instructions (D2) perceived to aid recall under stress.
- F5: Short talking-head video (D3) can be preferable when stressed or feeling lonely; text preferred by some for self-pacing and reduced distraction.
- F6: Non-native speakers may prefer speaking over typing for reflection due to grammar/word-choice concerns; others value writing for organization and memory.

Quantitative (Study 2):
- Treatment vs Control: Midterm scores improved by 3.84 percentage points (All: control mean 75.88 vs intervention 79.72), p=0.003, Cohen’s d=0.252.
- Subgroups: First-year students showed significant improvement (difference 4.54; p=0.022); upper-year not significant (difference 1.21; p=0.27). Gender effects similar (men +3.05; women +2.91; not statistically significant in subgroup tests).
- Validation: Bootstrap indicated <0.3% chance of observing a difference ≥3.84% by random assignment imbalance.
- Relative effects of design factors (within treatment regression): D4 (link to paper) had a significant negative add-on effect (estimate −3.27, p=0.015). D1: −0.03 (p=0.98); D2: −0.47 (p=0.72); D3: −0.88 (p=0.51); D5: +1.68 (p=0.21); D6: +0.21 (p=0.87); none significant except D4.
- Number of factors: No significant relationship between the number of design factors and exam score (regression p=0.334; ANOVA p=0.43).
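The bootstrap-style balance check reported above can be approximated with a permutation sketch: repeatedly relabel the pooled midterm scores at random and count how often a treatment-minus-control mean difference at least as large as the observed one arises by chance. The paper's exact resampling procedure is not given in this summary, so this permutation-test formulation and the function name `permutation_p` are assumptions.

```python
import random
import statistics

def permutation_p(treat: list, ctrl: list,
                  n_iter: int = 10_000, seed: int = 0) -> float:
    """One-sided permutation estimate: fraction of random relabelings
    of the pooled scores whose treatment-control mean difference is
    at least as large as the observed difference."""
    rng = random.Random(seed)
    observed = statistics.mean(treat) - statistics.mean(ctrl)
    pooled = list(treat) + list(ctrl)
    n_treat = len(treat)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)  # random relabeling of all scores
        diff = statistics.mean(pooled[:n_treat]) - statistics.mean(pooled[n_treat:])
        if diff >= observed:
            hits += 1
    return hits / n_iter
```

A result like the paper's "<0.3% chance" corresponds to this function returning a value below 0.003 on the real score vectors.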
The findings show that brief, scalable online interventions can translate stress-reappraisal theory into measurable performance gains when thoughtfully designed. Reinforcing a concise core message with elaboration (research-based rationale and actionable steps), appropriate modality (offering both text and short video), and reflection prompts can help students internalize and apply the eustress mindset during exams. The significant overall improvement, driven especially by first-year students, suggests that eustress interventions are timely and beneficial for populations undergoing transitions and heightened stress. However, elements that increase perceived credibility (e.g., citations with external links) may distract or dilute focus in the moment, potentially reducing efficacy. Designers should therefore balance elaboration against attentional limits, offer mixed modalities to accommodate varying stress states and preferences, and include reflection prompts to support retention and future use. They should also consider delivery timing relative to exams and tailor interventions to subgroups (e.g., first-year students).
This work defines and evaluates a design space for brief, voluntary, scalable online exam eustress interventions. By layering six design factors (D1–D6) on a core reappraisal message (D0), the authors demonstrate through interviews (N=20) and a large field experiment (N=1283; analyzed N=664) that such interventions can significantly improve exam performance, with strongest effects for first-year students. The paper offers practical design considerations on elaboration, modality, and reflection prompts, and cautions about potentially counterproductive elements such as external citation links. Future research directions include optimizing credibility cues without distraction, expanding video modalities and in-video prompts, enhancing reflective components (e.g., voice/video notes delivered at just-in-time moments), exploring adaptive timing and personalization, and integrating interventions across in-person and remote learning contexts.
Limitations:
- Lack of a pretest-posttest design and of multiple post-intervention assessments limits inference about sustained effects; the final exam was cancelled due to COVID-19, leaving only the midterm as an outcome.
- Potential assignment imbalance is unlikely but addressed via bootstrap (<0.3% chance of equal-or-greater difference by chance); unmeasured stress heterogeneity across performance levels may remain.
- Field deployment introduced implementation complexities (e.g., caching, some participants completing on multiple devices). Analyses reclassifying those cases and sensitivity checks produced similar results (main effect remained significant, e.g., p=0.018 when excluding multi-device cases).
- Subgroup analyses had reduced power due to voluntary demographic responses and smaller splits; many subgroup effects were not statistically significant.
- The add-on effects of individual design factors are measured within treatment and cannot be interpreted as independent causal effects separate from the main intervention.