Do Growth Mindset Interventions Impact Students' Academic Achievement? A Systematic Review and Meta-Analysis With Recommendations for Best Practices

Psychology


B. N. Macnamara and A. P. Burgoyne

This systematic review and meta-analysis, conducted by Brooke N. Macnamara and Alexander P. Burgoyne, finds that popular growth mindset interventions produce at best tiny, and often nonsignificant, effects on academic achievement once study quality and publication bias are accounted for.

Introduction
The study asks whether growth mindset interventions improve students’ academic achievement and, if so, whether growth mindset is the operative mechanism. Mindset theory posits that believing intelligence is malleable (growth mindset) promotes learning goals, effort, challenge seeking, and resilience, purportedly enhancing academic outcomes. The theory has heavily influenced education practice and policy, spawning commercial programs (e.g., Brainology) and widespread implementation. Conflicting claims and rising popularity, alongside substantial funding, motivated a systematic evaluation of both the quantity and quality of evidence on achievement effects and mechanisms. The authors outline two primary questions: (a) Do growth mindset interventions generally improve academic achievement? (b) If benefits exist, are they due to changing mindsets, or to design flaws, reporting issues, and bias?
Literature Review
Early intervention claims (e.g., Blackwell et al., 2007) interpreted classroom-level assignments and non-declining grades in treated groups as prevention of decline, catalyzing broad adoption. Sisk et al. (2018) conducted the first meta-analysis of intervention effects across 38 samples (N = 57,155), finding a small average benefit (d = 0.08) but cautioning that manipulation checks were often absent or failed; paradoxically, significant achievement effects appeared in studies without manipulation checks or with failed checks, suggesting factors other than mindset might drive outcomes. Evidence for theoretical premises is weak: meta-analyses indicate small links between mindset and goal orientations (Payne et al., 2007; Burnette et al., 2013), and a preregistered test (Burgoyne et al., 2020) found minimal associations with learning/performance goals, persistence, and resilience (largest r = −.12, opposite the theoretical prediction). Correlational studies show mindset explains about 1% of variance in achievement (Sisk et al., 2018), with large samples failing to find robust associations across challenges and transitions (Li & Bates, 2019, 2020). Measurement validity concerns for the Implicit Theories of Intelligence Scale (Limeri et al., 2020) raise issues of heterogeneous interpretations and demand characteristics, undermining manipulation checks. Design confounds are common: treatment groups often receive additional encouragement, strategies, and motivational content absent in controls, making mechanisms indeterminate (e.g., effort encouragement can drive performance regardless of mindset; Li & Bates, 2019; Polley, 2018). Overall, prior literature offers mixed results, weak theoretical links, frequent measurement and design issues, and potential publication and researcher bias.
Methodology
The authors conducted a preregistered (OSF: https://osf.io/ga9jk) systematic review and three meta-analyses following PRISMA and APA reporting standards.
- Inclusion criteria: direct, student-facing growth mindset treatments teaching the malleability of a human attribute; a comparable control group (active, passive, or fixed-mindset); direct academic achievement outcomes (course exam, course grade, GPA, or standardized test); a reported or computable post-intervention treatment–control effect size; and methods/results in English. Praise-only and attribution-only studies were excluded.
- Search: built on Sisk et al. (2018) through 10/28/2016 and extended (10/28/2016–8/7/2019) via PsycINFO, ERIC, PubMed, ProQuest Dissertations & Theses, Google Scholar, conference programs, and extensive author contacts. Screening yielded 61 records (63 studies), 79 independent samples, 96 effect sizes, and N = 97,672 students.
- Effect sizes: standardized mean differences (Cohen's d) computed from the best available information in a prioritized order (raw data; difference scores; Morris's 2008 pre-SD method; ANCOVA/regression adjusting for baseline; post means/SDs; t tests; reported d; other tests). Variances were obtained from the Campbell Collaboration calculator, dependencies were handled with Cheung & Chan's methods, and variances for cluster-assigned samples were inflated by Kish's design effect (ρ = .20 by default when not reported). A minimal sketch of these computations follows after this list.
- Coding captured sample descriptors, developmental stage, SES, challenge level, delivery mode, administrator, intervention type (passive/feedback/interactive), number of sessions, context, achievement measure, time interval, publication status, financial incentives, manipulation checks, and adherence to best practices.
- Moderators (mixed-effects models): theoretical (developmental stage, academic challenge, SES, time interval) and methodological (intervention type, number of sessions, delivery mode, administrator, context, achievement measure).
- Bias analyses: funnel plots, Egger's regression, trim-and-fill (Duval & Tweedie), and PET-PEESE (restricted maximum likelihood).
- Three meta-analytic models: (1) No Quality Control: all included studies (96 effects; 79 samples; N = 97,672). (2) Minimal Standard of Evidence: only studies with successful manipulation checks showing a treatment-group pre–post shift toward growth mindset (25 effects; 21 samples; N = 18,355). (3) Best Available Evidence: studies meeting at least 60% of 10 best-practices criteria and demonstrating mindset change (10 effects; 8 samples; N = 13,571).
- Best practices comprised an active control; only the malleability message differing between groups; an a priori power analysis; individual random assignment; blinding of students, administrators, and teachers; manipulation checks; preregistered hypotheses, methods, and analyses; reporting results for participants treated; reporting the whole sample and subsamples; and absence of authors' financial incentives.
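To make the effect size pipeline concrete, here is a minimal sketch in Python of two steps described above: computing a standardized mean difference (Cohen's d) with its sampling variance from post-test summary statistics, and inflating that variance by Kish's design effect when treatment was assigned at the classroom (cluster) level. This is not the authors' code (their analyses were run in Comprehensive Meta-Analysis), and all function names and input numbers are illustrative.

```python
# Minimal sketch (not the authors' code; their analyses used Comprehensive
# Meta-Analysis). Illustrates Cohen's d with its sampling variance from
# post-test summary statistics, and the Kish design-effect inflation of that
# variance for cluster-assigned samples. All input numbers are hypothetical.
import math

def cohens_d(m_treat, sd_treat, n_treat, m_ctrl, sd_ctrl, n_ctrl):
    """Cohen's d from post-test means/SDs and its sampling variance."""
    pooled_sd = math.sqrt(((n_treat - 1) * sd_treat**2 + (n_ctrl - 1) * sd_ctrl**2)
                          / (n_treat + n_ctrl - 2))
    d = (m_treat - m_ctrl) / pooled_sd
    var_d = (n_treat + n_ctrl) / (n_treat * n_ctrl) + d**2 / (2 * (n_treat + n_ctrl))
    return d, var_d

def kish_design_effect(avg_cluster_size, icc=0.20):
    """Kish's design effect; the intraclass correlation defaults to .20, as in the paper."""
    return 1 + (avg_cluster_size - 1) * icc

# Hypothetical cluster-assigned study: classrooms of roughly 25 students each.
d, var_d = cohens_d(m_treat=3.1, sd_treat=0.8, n_treat=250,
                    m_ctrl=3.0, sd_ctrl=0.8, n_ctrl=250)
var_adj = var_d * kish_design_effect(avg_cluster_size=25)
print(f"d = {d:.2f}; naive SE = {math.sqrt(var_d):.3f}; cluster-adjusted SE = {math.sqrt(var_adj):.3f}")
```

In this hypothetical example the cluster adjustment more than doubles the standard error, which is the same mechanism by which some seminal findings (e.g., Blackwell et al., 2007) became nonsignificant once clustering was taken into account.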
Key Findings
- Meta-analysis 1 (No Quality Control): Overall d = 0.05, 95% CI [0.02, 0.09], p = .004; heterogeneity I² = 39.14% (τ² = .005). No theoretically meaningful moderators were significant (developmental stage, challenge level, SES, time interval), and methodological moderators were generally nonsignificant. Publication bias indicators: Egger's regression intercept B₀ ≈ 0.36 (one-tailed p = .034); trim-and-fill estimated 10 missing studies; bias-corrected effects were nonsignificant (trim-and-fill d = 0.03, 95% CI [−0.01, 0.07], p = .097; PET estimate d = 0.01, 95% CI [−0.03, 0.05], p = .667); see the sketch after this list. Financial incentives: in the published literature, studies whose authors had financial incentives showed larger effects than studies whose authors did not (d = 0.18 vs. d = −0.10; Q(1) = 9.66, p = .002). Adjusting for cluster design effects rendered some seminal findings nonsignificant (e.g., Blackwell et al., 2007: p = .119 after adjustment).
- Best-practice adherence (descriptive; 79 samples, 97,672 students): manipulation checks were reported in 59% of samples (33% of students); active controls in 58% (33%); individual random assignment in 51% (33%); blinding in 28% (29%); a priori power analyses in 25% (26%); only the malleability message differing between groups in 6% (3%); preregistration in 3% (7%).
- Meta-analysis 2 (Minimal Standard of Evidence): d = 0.04, 95% CI [−0.01, 0.10], p = .146; I² = 39.45% (τ² = .004). No significant theoretical or methodological moderators; no evidence of publication bias (Egger's p = .402; trim-and-fill and PET estimates remained nonsignificant).
- Meta-analysis 3 (Best Available Evidence): d = 0.02, 95% CI [−0.06, 0.10], p = .666; I² = 62.71% (τ² = .006). Too few studies to examine moderators; no bias indicators suggested inflated effects.
- Financial incentive patterns: Articles by financially incentivized authors were over 2.5× more likely to report significant positive results (56% vs. 21%; χ²(1, N = 61) = 7.09, p = .008). The financial incentive × publication status interaction was significant (F(1, 75) = 6.35, p = .014), consistent with selective publication of larger positive effects by financially incentivized authors.
- Reporting issues: Roughly 10% of null effects were described as significant; multiple studies failed to adjust for clustering; and common confounds (e.g., additional encouragement or strategies given to the treatment but not the control group) obscure mechanisms.
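The publication-bias estimates above (Egger's regression, PET-PEESE) are all variants of a meta-regression of effect sizes on their standard errors. The sketch below, using fabricated data and plain weighted least squares, shows the mechanics only; it simplifies the paper's approach, which fit PET-PEESE via restricted maximum likelihood in Comprehensive Meta-Analysis, and its numbers do not reproduce the paper's results.

```python
# Minimal sketch of small-study/publication-bias meta-regression (PET, PEESE,
# and an Egger-style asymmetry slope). Data are fabricated for illustration.
import numpy as np

def wls_coefs(x, y, w):
    """Weighted least-squares fit of y on [1, x]; returns (intercept, slope)."""
    X = np.column_stack([np.ones_like(x), x])
    W = np.diag(w)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return beta[0], beta[1]

rng = np.random.default_rng(1)
se = rng.uniform(0.05, 0.30, 40)                    # hypothetical standard errors
true_effect, bias = 0.05, 0.30                      # hypothetical values
d = true_effect + bias * se + rng.normal(0.0, se)   # effects with simulated small-study bias

w = 1.0 / se**2                                     # inverse-variance weights

# PET: regress d on SE; the intercept is the bias-adjusted estimate and the
# slope plays the role of Egger's asymmetry coefficient.
pet_estimate, egger_slope = wls_coefs(se, d, w)

# PEESE: regress d on the sampling variance (SE squared) instead of SE.
peese_estimate, _ = wls_coefs(se**2, d, w)

print(f"Naive mean d:      {d.mean():.3f}")
print(f"PET estimate:      {pet_estimate:.3f}")
print(f"PEESE estimate:    {peese_estimate:.3f}")
print(f"Egger-style slope: {egger_slope:.3f}")
```

When small studies report larger effects (a positive slope), the intercepts land near the simulated true effect rather than the inflated naive average, which mirrors how the overall d = 0.05 became nonsignificant after bias correction.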
Discussion
The findings address both research questions. First, across a large and updated corpus, growth mindset interventions yield, at best, very small average differences in achievement that become nonsignificant after correcting for publication bias. Second, when limiting analysis to studies where interventions demonstrably changed mindsets, effects on achievement remain nonsignificant, undermining claims that mindset change is the operative mechanism. The absence of significant theoretically meaningful moderators (challenge level, SES, time interval) contradicts predictions that benefits concentrate among struggling students or compound recursively over time. Evidence of publication and researcher bias—especially among authors with financial incentives—alongside poor adherence to best practices (rare blinding, limited preregistration and power analyses, common confounds in treatment content) suggests apparent positive effects derive from design flaws, reporting practices, and selective publication rather than robust causal impacts of mindset on achievement. This challenges the theoretical foundations linking mindset to motivational processes and downstream achievement and calls for rigorous designs that isolate mechanisms, equate expectations, and transparently report outcomes.
Conclusion
This comprehensive systematic review and meta-analysis of growth mindset interventions on academic achievement shows no reliable benefits when evidence quality is considered and publication bias is addressed. Apparent positive effects likely reflect inadequate designs, reporting flaws, and bias. The paper contributes: (a) a quality-weighted synthesis distinguishing minimal and best-available evidence; (b) documentation of widespread departures from best practices; (c) evidence of financial and publication biases; and (d) recommendations for study design, analysis, and reporting to enable valid causal inference. Future research should: isolate malleability messages via active controls matched on all non-mindset elements; use preregistered protocols, adequate power, individual randomization, and blinding; validate mindset measures and assess demand characteristics; report both per-protocol and intent-to-treat results with transparent handling of missingness; examine potential harms and equity implications; and redirect resources toward educational practices with stronger theoretical and empirical support if robust effects remain elusive.
Limitations
- Temporal and access limits: The meta-analysis reflects studies available through 8/7/2019; additional unpublished or inaccessible studies (including an estimated 10 missing smaller-effect studies) may exist.
- Language: Non-English studies were largely excluded; while many international studies are published in English, some relevant work may be missing.
- Scope decisions: The review was restricted to interventions explicitly teaching the malleability of human attributes; results may differ for mindsets of stress, belonging, or willpower, or for specific attributes (e.g., math ability).
- Measure validity: The analyses assume that pre–post mindset measures validly capture beliefs; recent work questions response-process validity and susceptibility to demand characteristics.
- Effect size metric: The authors used Cohen's d (slightly upward biased in very small samples) rather than Hedges' g; the overall impact is likely minimal (a brief sketch of the correction follows after this list).
- Code availability: Analyses were conducted in Comprehensive Meta-Analysis software, so analysis code cannot be shared, though all data and analytic decisions are openly documented.
- Heterogeneity and study quality: Substantial heterogeneity persisted, and the best-available-evidence model still included studies meeting only ≥60% of best practices because few high-quality studies exist, potentially inflating effect estimates.
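For readers curious about the Cohen's d versus Hedges' g point, the sketch below (not from the paper, which reports Cohen's d throughout) applies the standard Hedges small-sample correction factor; with samples of the size typical in this literature the adjustment is negligible, consistent with the authors' judgment that the choice of metric matters little.

```python
# Minimal sketch of Hedges' small-sample correction (illustrative only).
def hedges_g(d, n_treat, n_ctrl):
    """Apply the Hedges correction factor J = 1 - 3 / (4*df - 1) to Cohen's d."""
    df = n_treat + n_ctrl - 2
    j = 1 - 3 / (4 * df - 1)
    return j * d

print(hedges_g(0.05, n_treat=500, n_ctrl=500))  # ~0.0500: negligible in large samples
print(hedges_g(0.05, n_treat=10, n_ctrl=10))    # ~0.0479: noticeable only when samples are tiny
```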
Listen, Learn & Level Up
Over 10,000 hours of research content in 25+ fields, available in 12+ languages.
No more digging through PDFs, just hit play and absorb the world's latest research in your language, on your time.
listen to research audio papers with researchbunny