
Psychology
Learned Nocebo Effects on Cutaneous Sensations of Pain and Itch: A Systematic Review and Meta-analysis of Experimental Behavioral Studies on Healthy Humans
M. A. Thomaidou, J. S. Blythe, et al.
Nocebo effects occur when negative treatment expectations exacerbate sensations such as pain or itch. This systematic review and meta-analysis by Mia A. Thomaidou, Joseph S. Blythe, and colleagues quantifies learned nocebo effects on cutaneous pain and itch in healthy participants and examines which methodological factors influence their size.
Introduction
Negative expectations about treatment outcomes can aggravate cutaneous sensations such as pain and itch. In experimental models, a nocebo response is defined as an increase in sensation following a nocebo treatment relative to control, and it is typically induced via classical conditioning, verbal suggestions, or both. Although nocebo responses reliably aggravate pain and itch, attempts to elicit sensations without a baseline stimulus (conditioned allodynia) have yielded mixed results, and effects may generalize across sensations or arise through social observation. Because studies use diverse methods (e.g., type and intensity of sensory stimulation, learning procedures, and duration), a systematic evaluation is needed to clarify the magnitude of learned nocebo effects and the factors contributing to variability. The present study systematically reviews and meta-analyzes experimental nocebo effects on cutaneous pain and itch in healthy participants, comparing learning mechanisms (verbal suggestions alone versus conditioning combined with suggestions) and testing methodological moderators including stimulation modality and intensity, conditioning length, timing of effect measurement, and risk of bias.
Literature Review
Prior work demonstrates that learned expectations underpin nocebo effects and that combining conditioning with verbal suggestions typically produces stronger responses than either alone. An early meta-analysis of 10 experiments (through 2013) reported moderate-to-large nocebo magnitudes, larger when verbal suggestions were paired with conditioning. Subsequent narrative reviews concluded that nocebo effects occur across multiple sensations, including pain and itch, and are induced by instructional (verbal) and associative (conditioning) learning. The placebo literature parallels these findings, often showing enhanced effects when conditioning is added to suggestions. However, across nocebo studies, considerable methodological heterogeneity exists—sensation type (pain versus itch), stimulation modality (thermal, electrical, laser, pressure, histamine, cowhage), stimulus intensity levels, and the number of learning/evocation trials vary widely. These differences may influence effect sizes and warrant quantitative synthesis.
Methodology
The protocol was preregistered on ClinicalTrials.gov (NCT04387851) and followed PRISMA 2009 and Cochrane guidance. Searches of PubMed, PsycINFO, EMBASE, and Cochrane CENTRAL were conducted on March 18, 2019, and updated in June 2020 and July 2021, restricted to English, Dutch, and German. Inclusion criteria were original, peer-reviewed, controlled experimental studies in healthy humans that used learning paradigms to induce explicit negative expectations about an inert treatment (nocebo) affecting cutaneous sensations (pain or itch). Studies were excluded if they lacked explicit treatment-related negative suggestions, lacked a control condition, excluded nonresponders, or focused on non-cutaneous sensations or patient samples; observational learning studies were excluded post hoc because too few were available. Two reviewers independently screened and selected studies, extracted data using a standardized form (including design, learning method, stimulation type, outcomes, sample size, means/SDs), and resolved disagreements via a third reviewer. Missing data were requested from authors; when unavailable, data were digitized from figures.
Risk of bias (RoB) was assessed with Marcuzzi et al.'s tool for quantitative sensory testing; items were scored 0 (satisfied), 1 (unclear), 2 (not satisfied), or N/A, and summed to yield a 0–34 score (higher = greater risk).
Meta-analyses used random-effects models because heterogeneity was expected. The primary outcome was nocebo magnitude, defined as the difference in self-reported pain/itch between nocebo and control trials during evocation. When scales were not 0–10, difference scores were normalized to a 0–10 scale. Effect sizes were computed as Hedges g (nocebo minus control), with positive values indicating greater sensation under nocebo. For within-subject designs, the correlation between conditions was imputed as r = 0.5. Publication bias was assessed with funnel plots and Duval and Tweedie's trim-and-fill. Heterogeneity was quantified by I².
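As a rough illustration of the effect-size step described above, the following Python sketch rescales ratings to a 0–10 range and computes Hedges g for a within-subject nocebo-versus-control contrast with an imputed correlation of r = 0.5. The review does not spell out its exact formulas, so the paired-design conversion used here is a commonly used textbook approach and should be read as an assumption; the function names and example numbers are hypothetical.

```python
import math


def rescale_to_0_10(value, scale_min, scale_max):
    """Linearly rescale a difference score (or an SD) to a 0-10 range.

    Only the width of the original scale matters for differences and SDs.
    """
    return value * 10.0 / (scale_max - scale_min)


def hedges_g_paired(mean_nocebo, mean_control, sd_nocebo, sd_control, n, r=0.5):
    """Hedges g (nocebo minus control) for a within-subject design.

    ASSUMPTION: standard paired-design conversion with an imputed
    correlation r between conditions; not confirmed as the review's method.
    Returns (g, variance_of_g).
    """
    # SD of the paired difference scores, reconstructed from r.
    sd_diff = math.sqrt(sd_nocebo**2 + sd_control**2 - 2 * r * sd_nocebo * sd_control)
    # Convert to the within-condition SD metric so g is comparable
    # to between-subject effect sizes.
    sd_within = sd_diff / math.sqrt(2 * (1 - r))
    d = (mean_nocebo - mean_control) / sd_within
    var_d = (1 / n + d**2 / (2 * n)) * 2 * (1 - r)
    # Small-sample correction factor J with df = n - 1.
    j = 1 - 3 / (4 * (n - 1) - 1)
    return j * d, j**2 * var_d


# Hypothetical study: 20 participants rated pain on a 0-100 scale.
m_noc = rescale_to_0_10(60.0, 0, 100)
m_ctl = rescale_to_0_10(50.0, 0, 100)
sd_noc = rescale_to_0_10(20.0, 0, 100)
sd_ctl = rescale_to_0_10(20.0, 0, 100)
g, var_g = hedges_g_paired(m_noc, m_ctl, sd_noc, sd_ctl, n=20)
print(f"g = {g:.3f}, variance = {var_g:.5f}")
```

Because g is a standardized difference, the linear rescaling changes the raw difference score but leaves g itself unchanged; the normalization matters for reporting raw nocebo magnitudes on a common scale.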
Four primary pooled effects were computed: pain (verbal suggestions alone; conditioning + verbal suggestions) and itch (verbal suggestions alone; conditioning + verbal suggestions). Subset analyses examined stimulation modality and timing of nocebo measurement (first evocation trials versus mean across evocation). Meta-regressions tested relationships of conditioning length (number of learning trials), evocation length (extinction trials), stimulus intensity differences during learning, and RoB with effect sizes. In total, 24,814 records were identified; after screening and exclusions, 37 studies (40 arms: 30 pain, 10 itch) were included, spanning 2008–2021.
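To make the pooling step more concrete, here is a minimal Python sketch of a random-effects meta-analysis that also reports I². It uses the DerSimonian-Laird estimator of between-study variance, which is an assumption: the summary above does not say which estimator or software the authors used. The input effect sizes and variances are invented for illustration and are not data from the review.

```python
import math


def random_effects_pool(effects, variances):
    """Pool effect sizes with a DerSimonian-Laird random-effects model.

    ASSUMPTION: DerSimonian-Laird is used here for illustration only.
    Returns a dict with the pooled effect, its SE, a 95% CI, tau^2, Q, and I^2 (%).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]          # fixed-effect (inverse-variance) weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum_w - sum(wi**2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each study's variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {
        "pooled": pooled,
        "se": se,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "q": q,
        "i2": i2,
    }


# Invented example: Hedges g values and variances from four hypothetical studies.
result = random_effects_pool([0.45, 0.80, 0.62, 0.95], [0.04, 0.06, 0.03, 0.08])
print({name: value for name, value in result.items()})
```

When the studies agree closely, Q falls below its degrees of freedom, tau² and I² are truncated at zero, and the result coincides with a fixed-effect pooled estimate.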
Key Findings
• Overall, nocebo effects were moderate to large: across the four primary outcomes, Hedges g ranged from 0.26 to 0.71; average heterogeneity was moderate (mean I² ≈ 41%). Publication bias appeared low; trim-and-fill suggested ~7 potentially missing studies.
• Pain: conditioning + verbal suggestions produced a somewhat larger pooled effect (k = 21, g = 0.71, 95% CI 0.60–0.82, I² = 50.71%) than verbal suggestions alone (k = 12, g = 0.63, 95% CI 0.40–0.86, I² = 55.59%).
• Itch: verbal suggestions alone yielded a medium pooled effect (k = 4, g = 0.53, 95% CI 0.23–0.82, I² = 53.81%), whereas conditioning + verbal suggestions produced a small pooled effect (k = 4, g = 0.26, 95% CI 0.09–0.43, I² = 0%).
• Modality (pain): with conditioning + suggestions, thermal stimulation showed g = 0.75 (k = 13, 95% CI 0.59–0.91) and electrical stimulation g = 0.65 (k = 7, 95% CI 0.51–0.79). With suggestions alone, electrical g = 0.91 (k = 5, 95% CI 0.65–1.17), thermal g = 0.69 (k = 4, 95% CI 0.21–1.16), and mechanical g = 0.60 (k = 2, 95% CI 0.14–1.06). There were too few itch studies per modality to analyze.
• Timing: measuring the first evocation trials yielded larger pain effects (k = 6, g = 0.82, 95% CI 0.57–1.07) than taking the mean across evocation (k = 13, g = 0.66, 95% CI 0.54–0.79).
• Meta-regressions: no significant associations with effect sizes were found for stimulus intensity during learning (Q = 0.89, p = 0.35), number of conditioning trials (Q = 0.81, p = 0.37), or number of evocation trials (Q = 0.19, p = 0.67). RoB scores were not significantly related to nocebo magnitudes across pain or itch subgroups.
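For readers who want to gauge how far apart two of the pooled estimates above are relative to their uncertainty, the sketch below recovers standard errors from the reported 95% confidence intervals and forms a simple z statistic for the difference. This is a back-of-the-envelope check, not an analysis reported in the review: it assumes the intervals are symmetric normal-theory intervals (±1.96 SE) and that the two subgroups are independent, and the function names are hypothetical.

```python
import math


def se_from_ci(lower, upper, z_crit=1.96):
    """Recover a standard error from a symmetric normal-theory 95% CI."""
    return (upper - lower) / (2 * z_crit)


def compare_estimates(g1, ci1, g2, ci2):
    """z test for the difference between two pooled estimates.

    ASSUMPTION: the two estimates come from independent sets of studies.
    Returns (difference, se_of_difference, z, two_sided_p).
    """
    se1 = se_from_ci(*ci1)
    se2 = se_from_ci(*ci2)
    diff = g1 - g2
    se_diff = math.sqrt(se1**2 + se2**2)
    z = diff / se_diff
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return diff, se_diff, z, p


# Pain estimates as reported in the key findings:
# conditioning + verbal suggestions vs. verbal suggestions alone.
diff, se_diff, z, p = compare_estimates(0.71, (0.60, 0.82), 0.63, (0.40, 0.86))
print(f"difference = {diff:.2f}, SE = {se_diff:.3f}, z = {z:.2f}, p = {p:.2f}")
```

With the reported pain estimates, z comes out at roughly 0.6, which fits the description of the combined procedure as producing only a "somewhat larger" pooled effect.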
Discussion
The meta-analysis confirms that learned nocebo effects on cutaneous sensations are, on average, moderate to large, indicating that negative expectations can substantially aggravate perceived pain and itch. In pain paradigms, combining conditioning with verbal suggestions yielded somewhat stronger effects than suggestions alone, consistent with the idea that aversive experiential learning enhances expectancy-driven modulation. In itch, the limited evidence suggests that verbal suggestions alone may be relatively potent, whereas added conditioning did not increase effect sizes; however, the small number of itch studies (k = 4 per learning method) precludes firm conclusions. Measured methodological factors (stimulation type or intensity, conditioning or evocation length) did not account for the moderate heterogeneity observed, whereas assessing effects during the first evocation trials produced larger estimates, consistent with extinction processes reducing effects over repeated trials. The findings suggest that additional, unmeasured sources of variability, such as the precise content and emotional valence of verbal suggestions, experimental context, and participant characteristics (e.g., fear, demographics), likely contribute to between-study differences. Given evidence that aversive learning is prioritized in the brain, the robustness of nocebo effects may reflect conserved mechanisms integrating expectations and sensory processing (e.g., anterior cingulate cortex, insula, spinal mechanisms). Greater standardization of paradigms and systematic analysis of linguistic features of suggestions could improve comparability and clarify moderators.
Conclusion
This systematic review and meta-analysis quantified nocebo magnitudes for cutaneous pain and itch in healthy participants and compared learning mechanisms. Conditioning combined with negative verbal suggestions most reliably amplified pain, whereas itch effects were small to moderate overall. Methodological variables such as stimulation type, intensity, and conditioning length did not explain heterogeneity; measuring early evocation trials yielded larger effects, consistent with extinction over time. The field would benefit from standardized nocebo paradigms, consistent reporting, and targeted investigations into the content and valence of verbal suggestions, as well as individual differences and contextual factors. Future research should include larger, well-characterized samples, systematically measure expectations and affective states (e.g., fear), and compare across modalities and populations to identify determinants of nocebo susceptibility and persistence.
Limitations
• The number of itch studies was small, limiting power to detect moderators and to compare pain versus itch conclusively.
• Only healthy participants and cutaneous stimulations were included, which may limit generalizability to patient populations and other sensory domains.
• Many potentially relevant variables (e.g., demographics, fear, experimenter demeanor, setting) were inconsistently reported and could not be meta-analyzed.
• Observational learning studies were excluded due to insufficient numbers.
• Risk of bias was assessed with a tool for quantitative sensory testing and may not capture all biases (e.g., selective reporting or publication bias), although publication bias appeared low.
• Heterogeneity remained moderate and largely unexplained by recorded methodological factors.