Timing along the cardiac cycle modulates neural signals of reward-based learning
E. F. Fouragnan, B. Hosking, et al.
Decision-making under uncertainty can be influenced by the timing of the cardiac cycle. While prior work shows that cardiac phase (systole versus diastole) modulates perception of near-threshold sensory stimuli, it is unclear whether similar heart-brain interactions affect internal, non-sensory representations central to learning. The study asks whether the cardiac cycle modulates neural signals of reward-based learning, specifically the signed prediction error (PE; whether and by how much outcomes are better or worse than expected) and the absolute prediction error (unsigned surprise or salience). Because PE magnitude can be dissociated from sensory magnitude, the authors test whether near-threshold absolute PEs are differentially represented across cardiac phases even when the sensory outcomes themselves are suprathreshold. They hypothesize that timing within the cardiac cycle modulates the strength of neural representations of outcomes, with near-threshold absolute PEs being better represented during diastole than systole, and that inter-individual differences in such modulation relate to learning rates and task performance.
Prior studies show that cardiac phase influences perception and neural processing of sensory stimuli, with diastole enhancing perceptual sensitivity and systole attenuating it via baroreceptor signaling to brainstem and forebrain structures including the anterior cingulate cortex (ACC), anterior insula (AI), amygdala, and orbitofrontal cortex (OFC). Executive functions such as attention switching and motor control may show complementary enhancements during systole, suggesting phase-dependent prioritization of cognitive processes. Reinforcement learning (RL) formalizes value updating via prediction errors; signed and absolute PEs recruit partly distinct neural networks, with absolute PE frequently associated with the ACC and AI (salience network) and appearing earlier after outcomes than signed PE in EEG/MEG. Trial-by-trial physiological measures (eye gaze, pupil dilation) and single-trial EEG variability have been used to dissociate attentional and learning processes and to reveal temporal-spatial correlates of PEs. Active sensing and action timing are coupled to cardiac phase in naturalistic contexts, with increased sampling during diastole. These literatures motivate examining whether cardiac phase modulates internal learning signals (especially absolute PE) and their behavioral impact.
Participants: Thirty-five right-handed healthy adults participated; 3 were excluded for excessive EEG noise, leaving 32 (mean age 24 ± 7.13 years; 10 female; right-handedness 0.83 ± 0.13). Ethics approval was granted by the University of Oxford (R55856/RE002), and informed consent was obtained.
Task: Reward-guided decision task with eight blocks of 60 trials (480 total). Each block introduced two new cues (face, house). Participants predicted which color (orange/blue) would occur using left/right button presses after sequential presentation of the two cues (500 ms each). The outcome (a single color) was displayed for 4000 ms to capture multiple heartbeats (mean 4.58 ± 0.85). Four association schemes were used: three predictive (highly predictive anticorrelated, highly predictive correlated, variable predictive) and one non-predictive, manipulating cue-outcome contingencies to modulate learning.
Data acquisition: EEG was recorded from 62 scalp electrodes (BrainAmp, 1000 Hz, 0.01–100 Hz online filters) with the ground at AFz, referenced online to the right mastoid and re-referenced offline to the average. ECG was recorded via a chest electrode, and R-peaks were detected with the Pan-Tompkins algorithm.
Preprocessing and heartbeat-evoked potentials (HEP): Offline band-pass filtering at 0.5–40 Hz, bad channel interpolation (1–2 channels per subject), and ICA (RUNICA) to remove ocular components, the cardiac field artefact (CFA), and other physiological components (mean 4.78 ICs removed). Data were segmented time-locked to feedback onset, to R-peak onset during feedback, and to visual stimulus onset. Baselines: −0.15 to −0.05 s for R-locked, −0.9 to −0.1 s for feedback-locked, and −0.2 to −0.05 s for stimulus-locked epochs. Surrogate R-peaks (shifted by ±500 ms) were used to confirm that the HEP was specific to the heartbeat.
Cardiac phase definition: Systole was defined as the interval from the R-peak to 300 ms post R (approximate T-wave end), and diastole as the interval from systolic offset to the next R-peak. Outcomes were categorized by whether their onset occurred in systole or diastole; analysis focused on the first heartbeat after outcome onset.
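The cardiac phase definition above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `cardiac_phase`, the millisecond time base, and the treatment of exactly 300 ms as already diastolic are assumptions made for illustration, and R-peak times are taken as given rather than detected.

```python
from bisect import bisect_right

SYSTOLE_MS = 300  # systolic window after each R-peak, as stated in the text


def cardiac_phase(onset_ms, r_peaks_ms, systole_ms=SYSTOLE_MS):
    """Classify an event onset as 'systole' or 'diastole'.

    r_peaks_ms must be sorted in ascending order. Returns None when the
    onset is not bracketed by two detected R-peaks, because the phase
    cannot be determined without both a preceding and a following beat.
    """
    # Index of the last R-peak at or before the onset.
    i = bisect_right(r_peaks_ms, onset_ms) - 1
    if i < 0 or i + 1 >= len(r_peaks_ms):
        return None
    elapsed = onset_ms - r_peaks_ms[i]
    # Assumption: the boundary value itself (elapsed == systole_ms)
    # is counted as diastole.
    return "systole" if elapsed < systole_ms else "diastole"
```

For example, with R-peaks at 0 ms and 800 ms, `cardiac_phase(500, [0, 800])` returns `"diastole"` because the onset falls 500 ms after the preceding R-peak.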
Computational modeling: Four reinforcement learning (RL) models were fit to choices: (1) a Simple Cue model learning separate cue values (V1, V2) with a fixed learning rate α and a softmax choice rule with inverse temperature β and choice stickiness γ; (2) a Recency weighting model with an additional parameter that weights the second-presented cue more strongly; (3) a Conjunction model learning values for cue pairs; (4) a Dynamic learning rate model in which α varies with the slope of the smoothed |PE|, governed by parameters rho and gamma. Models were fit by iterative expectation-maximization (EM) to estimate session-level parameters with group-level Gaussian priors, with convergence at ΔNPL < 0.001 or 800 steps. Models were compared using the integrated Bayesian information criterion (BIC) and Laplace-approximated model evidence with exceedance probabilities. The Simple Cue model fit best (BIC≈9350 vs 9370–9400 for the alternatives; exceedance probability p=0.95); parameter recovery was also performed.
EEG analyses:
- Mass-univariate ERP analyses: Cluster-based permutation tests (FieldTrip) within ROIs defined by HEP morphology (frontocentral and centro-parietal) to test differences for signed PE (positive vs negative), absolute PE (high vs low), and outcome valence (correct vs incorrect) in the 0.1–0.6 s window post R.
- Multivariate single-trial analysis: Regularized Fisher discriminant analysis in sliding 60 ms windows from −100 to 600 ms relative to R-peak to identify spatial weights discriminating (a) very high vs very low absolute PE trials, (b) positive vs negative signed PE. Performance quantified by cross-validated Az with bootstrap significance (P < 0.01). Forward models computed for scalp topographies. Analyses run across all HEPs during feedback and separately for first, second, and third HEP after outcome. Discriminator outputs applied to left-out intermediate absolute PE trials to assess parametricity.
Statistical modeling: Mixed-effects linear models tested relationships between absPE, cardiac cycle, and single-trial variability (STV) in absPE-HEP; logistic mixed-effects regression predicted switch/stay at t+1 from absPE, residual absPE-HEP at t, and cardiac phase (and their interaction). Subject-wise regressions quantified each participant’s systole/diastole modulation of absPE-HEP; these coefficients were correlated with individual learning rates and mean rewards overall and by block type (predictive vs non-predictive).
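To make the signed and absolute PE regressors concrete, the following Python sketch shows a delta-rule update and a two-option softmax of the kind the Simple Cue model describes. The summary does not give the authors' equations, so several details here are assumptions: outcomes are coded +1 (orange) and −1 (blue), each cue value is updated toward the outcome independently, the two cue values are summed to drive choice, and stickiness enters as an additive bias toward the previous choice. The function names are hypothetical.

```python
import math


def update_value(value, outcome, alpha):
    """Delta-rule update for one cue value.

    Returns (new_value, signed_pe, absolute_pe), where the signed PE is
    outcome minus expectation and the absolute PE is its magnitude.
    """
    signed_pe = outcome - value
    new_value = value + alpha * signed_pe
    return new_value, signed_pe, abs(signed_pe)


def p_choose_orange(v1, v2, beta, gamma, prev_choice):
    """Probability of predicting orange under a two-option softmax.

    Assumptions (not from the source): cue values are summed, positive
    values favour orange, and prev_choice is +1 (orange), -1 (blue), or
    0 (no previous choice), scaled by the stickiness parameter gamma.
    """
    drive = beta * (v1 + v2) + gamma * prev_choice
    # For two options the softmax reduces to a logistic function.
    return 1.0 / (1.0 + math.exp(-drive))
```

Starting from a value of 0 with α = 0.2, an orange outcome gives `update_value(0.0, 1.0, 0.2)`, which returns `(0.2, 1.0, 1.0)`: the cue value moves a fifth of the way toward the outcome, and the signed and absolute PEs are both 1. As the value approaches the outcome over trials, the absolute PE shrinks toward the near-threshold range the study focuses on.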
- ERP HEP morphology: Broad frontocentral and centro-parietal distribution.
- Signed PE (ERP): Increased HEP amplitude for negative vs positive signed PE at frontocentral sites 198–252 ms (Monte Carlo p=0.004; Cohen's d=0.695). Correct vs incorrect outcomes differed around 250 ms in the same cluster (Cohen's d=0.696).
- Absolute PE (ERP): Significant differences at centro-parietal sites. Cluster 1 (252–292 ms): high > low absolute PE (p<0.003; d=−0.724). Cluster 2 (418–464 ms): low > high (p<0.003; d=−1.17).
- Multivariate discrimination: Robust heart-related component discriminated very high vs very low absolute PE in 100–300 ms post R-wave (Az above P=0.01 threshold). No discriminating component for signed PE despite ERP average differences, indicating lack of trial-by-trial covariation for signed PE. Discriminator outputs scaled parametrically with absolute PE on left-out intermediate bins (t31=7.303, p<0.001), confirming linear relationship.
- Temporal specificity: Only the first heartbeat after feedback carried discriminative information about absolute PE (100–300 ms window); second and third heartbeats did not.
- Cardiac phase modulation: Mean absPE-HEP was more negative when outcomes were presented at diastole vs systole (t31=2.846, p=0.007, 95% CI [0.0107, 0.065], d=0.55). Mixed-effects model showed main effect of absolute PE on STV (t12=3.539, p<0.001, 95% CI [0.11, 0.39], Partial Eta²=0.644) and main effect of cardiac cycle (t12=−2.336, p=0.021, 95% CI [−0.35, −0.03], Partial Eta²=0.05); interaction trend (t12=1.96, p=0.052).
- Behavior coupling: Participants more likely to switch after higher surprise; residual absPE-HEP at t (controlling for absolute PE) predicted switch at t+1. Interaction between cardiac phase and absolute PE significant (SysDias*AbsPE p=0.035), indicating stronger cardiac-phase impact as absPE decreased (near-threshold).
- Individual differences: Greater systole–diastole modulation of absPE-HEP correlated with higher learning rates (t30=2.176, p=0.037; r=0.391) and greater rewards earned (t30=2.74, p=0.01; r=0.366). Effects were specific to predictive blocks (learning rates: t30=2.17, p=0.038; r=0.372; rewards: t30=3.21, p=0.003; r=0.334) and absent in non-predictive blocks.
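The Az values reported for the multivariate discrimination conventionally denote the area under the ROC curve. As a hedged illustration of that metric only, the sketch below computes it from discriminator outputs for two trial classes using the pairwise (Mann-Whitney) formulation, with ties counted as half. The function name `az_score` is hypothetical, and the study's cross-validation and bootstrap significance threshold are not reproduced here.

```python
def az_score(pos_scores, neg_scores):
    """Area under the ROC curve from two lists of discriminator outputs.

    Equals the probability that a randomly drawn trial from the first
    class (e.g. very high absolute PE) scores higher than a randomly
    drawn trial from the second class (e.g. very low absolute PE).
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both classes need at least one trial")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(pos_scores) * len(neg_scores))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two classes. The pairwise loop is quadratic in the number of trials, which is acceptable for a sketch but would normally be replaced by a rank-based computation for large datasets.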
The results demonstrate a robust link between heartbeat-evoked potentials and internal learning signals: HEP magnitude parametrically tracks absolute prediction error (surprise/salience) at the first heartbeat following outcome, whereas it does not vary trial-by-trial with signed PE. Crucially, cardiac phase modulates this representation: outcomes occurring during diastole evoke stronger absPE-HEP, especially when absolute PE is small (near-threshold). This neural modulation relates to behavior, predicting subsequent switching and explaining inter-individual differences in learning rates and overall performance. These findings align with interoceptive and predictive coding accounts in which cardiac afferents influence neural gain and attentional allocation in salience networks (e.g., anterior insula, ACC), prioritizing processing during diastole when perceptual sensitivity is enhanced. The data suggest that natural bodily oscillations rhythmically gate the impact of outcome surprise on learning, linking heart-brain interactions to adaptive value updating beyond pure sensory processing.
This study shows that the timing of outcomes within the cardiac cycle shapes neural representations of absolute prediction error and, in turn, learning and performance. A machine learning analysis of EEG revealed an absPE-HEP component in the first heartbeat after feedback that is stronger during diastole, particularly for near-threshold absolute PEs. Individuals with greater diastole–systole modulation of this signal learned faster and earned more rewards. These findings extend heart-brain interaction research to internal learning signals and underscore a rhythmic, interoceptively driven gating of surprise-based updating. Future work should: (1) causally manipulate cardiac phase (e.g., phase-locked outcome delivery or biofeedback) to test directional effects on learning; (2) examine neural sources (e.g., source-resolved EEG/MEG or intracranial recordings) linking absPE-HEP to salience network nodes; (3) test clinical or trait variability (e.g., interoceptive sensitivity, anxiety, depression) in cardiac modulation of learning; (4) investigate interactions with pupil-linked arousal and neuromodulatory systems; and (5) delineate how cardiac-driven gain modulation integrates with signed PE pathways across tasks and sensory modalities.
Stimuli and outcomes were not experimentally locked to cardiac phase; instead, the study leveraged natural timing, which enhances ecological validity but limits causal inference about phase effects. The findings do not by themselves establish that cardiac deceleration (longer diastole) improves sensory intake; future work should directly link trial-by-trial HEP amplitude changes to sensory integration following feedback. The sample size (N=32 after exclusions) is typical but may limit detection of smaller effects or interactions. Gender was not analyzed as a factor. Although ICA was used to mitigate cardiac field and motion-related artefacts, residual physiological artefacts cannot be completely ruled out.