Psychology
The role of mPFC and MTL neurons in human choice under goal-conflict
T. Gazit, T. Gonen, et al.
This study shows how mPFC and MTL neurons interact to resolve approach-avoidance conflicts, with notable findings on how controllable punishment shapes later decisions. Conducted by Tomer Gazit and colleagues, it uses intracranial recordings to trace the dynamics of motivation and learning in the human brain.
~3 min • Beginner • English
Introduction
The study examines how human medial prefrontal cortex (mPFC) and medial temporal lobe (MTL; hippocampus and amygdala) neurons contribute to resolving approach-avoidance goal conflicts by encoding outcome valence (reward vs. punishment) and influencing subsequent choices. Prior animal and human work implicates striatum (prediction errors), mPFC, and MTL in reinforcement learning and outcome processing, but their distinct roles during goal conflict and in shaping future approach tendencies are unclear. Approach-avoidance conflicts, central to anxiety, require integrating potential gains and losses to guide behavior. The authors use intracranial recordings during a goal-conflict task to test whether and how mPFC and MTL differentially encode outcomes under controllable and uncontrollable conditions and whether such encoding predicts subsequent approach behavior, especially under high goal conflict.
Literature Review
Classical animal studies (elevated plus maze and avoidance paradigms) implicate the amygdala, hippocampus, and mPFC in avoidance behavior during goal conflict, with the ventral hippocampus involved in conflict resolution at decision time. Other work shows that the hippocampus and amygdala support learning from outcomes that affects future choice; developmental and lesion studies highlight MTL contributions to reinforcement learning signals and subsequent decisions. In psychopathology (depression, schizophrenia), altered outcome-learning signals in the hippocampus and anterior cingulate, together with amygdala-hippocampal dysfunction, point to the MTL's importance in adapting behavior. However, how these findings extend to approach-avoidance conflicts remains uncertain. Prior imaging with the current task showed differential ventral striatum activation under high conflict and individual differences in approach tendencies. Together, the literature supports roles for mPFC and MTL in valence processing and learning, but their distinct contributions to conflict-driven approach updating in humans require direct neuronal evidence.
Methodology
Participants: Fourteen patients (9 male; 35.2 ± 14.6 years) with pharmacologically intractable epilepsy undergoing depth-electrode monitoring (10 at Tel Aviv Sourasky Medical Center; 4 at UCLA) completed 20 sessions; one patient had two implantations. Implant sites were determined clinically. An additional 20 healthy participants performed the task behaviorally. Ethics approvals and informed consent were obtained.
Electrophysiology: Through clinical depth electrodes, 9 Pt/Ir microwires (8 active, 1 reference) were implanted. Signals were amplified and sampled at 30 kHz (Blackrock), band-pass filtered 300 Hz–3 kHz, spikes detected/sorted with wave_clus (v1.1) in MATLAB (2018a). Units classified by spike shape, variance, and refractory period. Localization via post-implant CT registered to pre-implant T1 MRI (SPM12). Recorded units: amygdala 79, hippocampus 61, dmPFC 63, cingulate cortex (CC) 107 (total 310).
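To make the spike-detection step concrete, here is a minimal Python sketch of a generic band-pass-and-threshold pipeline of the kind wave_clus implements; the filter order, 5 SD threshold, and MAD-based noise estimate are common defaults assumed for illustration, not details confirmed by this summary.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def detect_spikes(raw, fs=30000, low=300, high=3000, thresh_sd=5):
    """Band-pass filter and threshold-detect spike times (a generic sketch;
    the study used wave_clus for actual detection and sorting)."""
    b, a = butter(4, [low / (fs / 2), high / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, raw)
    # Robust noise estimate via median absolute deviation, a common choice
    # in wave_clus-style pipelines.
    sigma = np.median(np.abs(filtered)) / 0.6745
    crossings = np.where(filtered < -thresh_sd * sigma)[0]
    if crossings.size == 0:
        return crossings
    # Keep only the first sample of each threshold-crossing run
    # (~1 ms apart at 30 kHz sampling).
    keep = np.insert(np.diff(crossings) > 30, 0, True)
    return crossings[keep]
```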
Task (PRIMO game): Subjects controlled an on-screen avatar to catch coins (rewards) and avoid balls (punishments). Two conditions: Controlled (outcomes contingent on actions) and Uncontrolled (cues always hit the avatar regardless of action; colored to signal non-controllability). Each coin caught yielded +5 points; each ball hit cost 5 points. Difficulty (cue speed) adapted every 10 s to maintain engagement and balance outcome frequencies. The inter-stimulus interval for reward trials was jittered between 550 and 2050 ms. Goal-conflict manipulation: the number of balls between the avatar and the coin defined low (LGC: 0–1 balls) vs high (HGC: 2–6 balls) conflict. Three to four 6-min blocks were run per session.
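The summary does not specify the adaptive-difficulty controller; below is a minimal sketch, assuming a simple proportional rule that nudges cue speed toward a target success rate every 10 s (`target_rate`, `gain`, and the speed bounds are hypothetical parameters).

```python
def update_speed(speed, outcomes, target_rate=0.5, gain=0.2,
                 min_speed=0.5, max_speed=3.0):
    """Adjust cue speed after each 10 s window so the observed success
    rate tracks target_rate (all parameters are illustrative).

    outcomes: 1 per successful trial (coin caught / ball avoided),
              0 otherwise, for the last window.
    """
    if not outcomes:
        return speed
    observed = sum(outcomes) / len(outcomes)
    # Speed up when the player succeeds too often, slow down otherwise.
    speed *= 1.0 + gain * (observed - target_rate)
    return max(min_speed, min(max_speed, speed))
```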
Behavioral classification: Controlled reward trials were classified as approach vs avoidance using a machine-learning model based on movement patterns, as in prior work. Analyses focused on HGC trials because approach rates were at ceiling in LGC.
Neural analyses: Firing was rasterized in non-overlapping 200 ms bins. Peri-stimulus time histograms (PSTHs) were computed from −3 s to +5 s around the outcome (time 0 = coin or ball contacting the avatar). Responsiveness was assessed 200–800 ms post-outcome with a bootstrap procedure: null distributions were built from 1000 averages of randomly placed windows across the session, matched in trial count; a response was significantly positive if the observed FR was at or above the 99th percentile (p<0.01) and significantly negative if at or below the 1st percentile (p<0.01). A unit counted as responsive if any 200 ms window within 200–800 ms was significant. For controllability main effects, reward and punishment trials were pooled, and vice versa for valence effects. Chi-square tests compared response probabilities across regions.
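A minimal sketch of this responsiveness test in Python, following the windowing and percentile criteria above; the array layout and the random-window sampler are illustrative assumptions.

```python
import numpy as np

def is_responsive(binned_fr, outcome_bins, n_boot=1000, alpha=0.01,
                  rng=np.random.default_rng(0)):
    """Bootstrap test for outcome responsiveness (assumed implementation).

    binned_fr    : 1-D array of firing rates in non-overlapping 200 ms bins
                   across the whole session.
    outcome_bins : (trials x 3) indices of the bins covering 200-800 ms
                   post-outcome on each trial.
    """
    n_trials = outcome_bins.shape[0]
    # Observed mean firing rate per post-outcome window (three windows).
    observed = binned_fr[outcome_bins].mean(axis=0)  # shape (3,)

    # Null distribution: averages over randomly placed windows,
    # matched in trial count.
    null = np.empty((n_boot, 3))
    for b in range(n_boot):
        starts = rng.integers(0, len(binned_fr) - 3, size=n_trials)
        windows = np.stack([binned_fr[s:s + 3] for s in starts])
        null[b] = windows.mean(axis=0)

    lo, hi = np.percentile(null, [100 * alpha, 100 * (1 - alpha)], axis=0)
    positive = observed >= hi   # any window at/above the 99th percentile
    negative = observed <= lo   # any window at/below the 1st percentile
    return positive.any(), negative.any()
```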
Normalized firing rate (FR) was computed as the mean across the 200–400, 400–600, and 600–800 ms windows of (FR_window − FR_random)/σ_random, where FR_random and σ_random are the mean and SD of the random-window distribution. Positively and negatively responsive neurons were analyzed separately; mixed-sign neurons were excluded. Repeated-measures ANOVAs tested effects of region group (MTL vs mPFC), controllability, and valence on normalized FR. Time-course significance vs baseline was tested in successive 200 ms bins with signed-rank tests (FDR corrected).
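The normalization reduces to a per-window z-score against the bootstrap null, averaged over the three windows; a short sketch with illustrative names:

```python
import numpy as np

def normalized_fr(fr_window, fr_random, sigma_random):
    """Z-score each 200 ms window against its bootstrap null and average
    across the 200-400, 400-600, and 600-800 ms windows.

    fr_window, fr_random, sigma_random : arrays of shape (3,), one value
    per post-outcome window (illustrative names).
    """
    z = (np.asarray(fr_window) - np.asarray(fr_random)) / np.asarray(sigma_random)
    return z.mean()
```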
Movement control: Additional analyses balanced trials by key-press counts to match movement across contrasted conditions and reassessed selectivity and controllability effects.
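One simple way to balance trials by key-press counts is exact matching with subsampling, sketched below; the paper's precise matching scheme is not described in this summary, so treat this as an assumed illustration.

```python
import numpy as np

def balance_by_keypresses(presses_a, presses_b, rng=np.random.default_rng(0)):
    """Return index subsets giving matched key-press distributions across
    two conditions (an illustrative scheme; the study's may differ).

    presses_a, presses_b : 1-D int arrays of key presses per trial.
    """
    idx_a, idx_b = [], []
    for c in np.intersect1d(presses_a, presses_b):
        a = np.where(presses_a == c)[0]
        b = np.where(presses_b == c)[0]
        n = min(len(a), len(b))
        # Randomly subsample the larger condition at each key-press count.
        idx_a.extend(rng.choice(a, n, replace=False))
        idx_b.extend(rng.choice(b, n, replace=False))
    return np.array(idx_a), np.array(idx_b)
```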
Outcome-behavior link: For neurons with significant FR increases to a given outcome, outcome trials were split into those with vs without neural response (200–800 ms). Subsequent behavior was assessed for the next Controlled HGC coin trial only. Mann–Whitney tests compared approach probabilities after reward- vs punishment-evoked firing. Six GLMMs (binomial) modeled subsequent HGC behavior with predictors including whether a temporal/frontal neuron fired to prior punishment or reward, movement measures, trial outcomes, time lags, and number of ball hits; neuron identity was a grouping factor with random effects. Additional GLMMs tested interaction between average frontal and temporal firing to punishment (four sessions with both region responses) and to reward (three sessions), including movement and timing covariates.
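A hedged sketch of one such binomial GLMM using statsmodels' Bayesian mixed GLM, with a random intercept per neuron; the formula and column names (`approach`, `fired`, `key_presses`, `lag_s`, `neuron`) are illustrative, and the original analysis may have used different software and predictors.

```python
import pandas as pd
from statsmodels.genmod.bayes_mixed_glm import BinomialBayesMixedGLM

# df columns (illustrative names): approach (0/1 on the next Controlled
# HGC coin trial), fired (did the neuron respond to the prior outcome),
# key_presses (movement covariate), lag_s (time to next trial), neuron (id).
df = pd.read_csv("trials.csv")  # hypothetical source file

# Fixed effects for firing and covariates; random intercept per neuron.
model = BinomialBayesMixedGLM.from_formula(
    "approach ~ fired + key_presses + lag_s",
    vc_formulas={"neuron": "0 + C(neuron)"},
    data=df,
)
result = model.fit_vb()  # variational Bayes fit
print(result.summary())
```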
Statistics: McNemar’s exact tests, chi-square tests, repeated-measures ANOVAs (reporting η²), signed-rank tests with FDR correction, Mann–Whitney tests, and GLMMs with FDR correction were used. Source data and code availability provided.
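Because FDR correction recurs across these analyses, here is a minimal Benjamini-Hochberg example with statsmodels; the p-values are placeholders, not results from the study.

```python
from statsmodels.stats.multitest import multipletests

pvals = [0.0088, 0.07, 0.0004, 0.011, 0.049, 0.0005]  # placeholder p-values
reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
for p, padj, r in zip(pvals, p_adj, reject):
    print(f"raw={p:.4f}  adjusted={padj:.4f}  significant={r}")
```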
Key Findings
- Behavior: Participants approached reward cues on 89.4% of 3285 trials. Approach was reduced in HGC vs LGC (HGC 83.4 ± 10.4% vs LGC 94.6 ± 2.7% approach; t(14)=4.78, p=0.0003, mean difference=0.11, CI (0.06,0.16), Cohen’s d=4.9). Reaction times were faster in HGC (807.05 ± 151.2 ms) than LGC (902.08 ± 160.3 ms; t(14)=3.52, p=0.003, mean difference=95 ms, CI (37.2,152.9), d=0.92). Approach tendencies did not differ between MTL seizure onset zone (SOZ) and extra-MTL patients (HGC U=19.5, Z=0.61; LGC U=19, Z=0.07; p>0.05). Aggregate GLMMs found no direct behavioral effect of punishment on subsequent HGC trials without considering neural firing.
- Neuronal responsiveness (200–800 ms post-outcome): Units responsive to at least one outcome condition: amygdala 31/79 (39%), hippocampus 26/61 (43%), dmPFC 26/63 (41%), CC 46/107 (43%). Neurons were more likely to respond to Controlled than Uncontrolled outcomes in every region, though only at trend level in hippocampus (McNemar's exact: amygdala χ²=9.3, p=0.0088; hippocampus χ²=3.3, p=0.07; combined MTL χ²=13.8, p=0.0004; dmPFC χ²=7.7, p=0.011; CC χ²=4.3, p=0.049; combined mPFC χ²=12.25, p=0.0005; FDR corrected). There was no main effect of valence across regions.
- Valence selectivity under control: In Controlled trials, mPFC neurons showed a negative valence bias: dmPFC 17 vs 4 and CC 20 vs 10 neurons responsive to Controlled punishment vs Controlled reward. MTL showed no such bias (amygdala 12 vs 12; hippocampus 8 vs 10). Across regions χ²=7.2, p=0.065; MTL vs mPFC comparison χ²=6.04, p=0.014. No valence bias in Uncontrolled trials.
- Normalized FR analyses (positively responsive neurons): Three-way interaction region×controllability×valence F(1,56)=13.6, p=0.001, η²=0.2, driven by heightened mPFC responses to Controlled punishment. Valence×region F(1,56)=5.6, p=0.021, η²=0.09 (greater negative bias in mPFC). Valence×control F(1,56)=7.72, p=0.007, η²=0.12. Main effect of controllability F(1,56)=22.44, p<0.001, η²=0.29; Controlled mean=3.0 (CI 2.2,3.8) vs Uncontrolled mean=1.1 (CI 0.6,1.5). MTL: Controlled 3.2 (CI 1.9,4.6) > Uncontrolled 1.6 (CI 0.7,2.4), F(1,23)=6.3, p=0.02, η²=0.22. mPFC: Controlled 2.8 (CI 1.7,3.8) > Uncontrolled 0.6 (CI 0.2,1.1), F(1,33)=19.2, p=0.001, η²=0.37; Punishment 2.9 (CI 1.9,3.9) > Reward 0.4 (CI −0.7,1.5), F(1,33)=9.37, p=0.004, η²=0.22; interaction F(1,33)=18.8, p<0.001, η²=0.36; Controlled punishment > other three conditions (p<0.001). For FR decreases, only controllability was significant: F(1,51)=11.6, p=0.001, η²=0.19 (Controlled −2.2, CI −2.8,−1.6 vs Uncontrolled −0.8, CI −1.4,−0.2).
- Movement-controlled analyses: mPFC maintained higher FR to Controlled punishment vs Uncontrolled punishment (sign test Z=2.6, p=0.009, FDR) and vs Controlled reward (Z=3.36, p=0.003, FDR). Overall χ²=4.23, p=0.04 indicated mPFC sensitivity to punishment/controllability, while MTL showed relatively greater responses to Controlled reward; MTL controlled vs uncontrolled difference was not significant after movement balancing.
- Context of punishment: mPFC punishment selectivity held whether punishment occurred without reward present (21 punishment-only vs 8 reward-only neurons) or after failed approach (14 punishment vs 8 reward); MTL showed no such differences.
- Temporal dynamics: In Controlled conditions, mPFC responses preceded MTL. Punishment: mPFC significant at 0–200 ms (Z=346, p<0.05, FDR); MTL at 200–400 ms (Z=78, p<0.05). Reward: mPFC at 0–200 ms (Z=43, p<0.01); MTL at 400–600 ms (Z=102, p<0.01). Uncontrolled outcomes generally did not elicit significant responses in early windows (exception: mPFC 200–400 ms for Uncontrolled reward, Z=36, p=0.02).
- Brain–behavior link (subsequent Controlled HGC trial): For MTL neurons, firing after Controlled punishment predicted decreased likelihood of approaching next coin, whereas firing after Controlled reward predicted increased approach (Mann–Whitney U=13, Z=2.9, p=0.01, FDR). mPFC neurons showed no such predictive effect. Uncontrolled outcomes were not predictive in either region.
- GLMMs: Only MTL firing after punishment predicted subsequent HGC approach behavior (beta=1.1, t=4.22, p<0.0001, FDR), robust after excluding SOZ neurons (beta=1.2, t=4.3, p<0.0001), and specific to HGC (not LGC). By structure, hippocampus was significant (beta=1.25, t=3.2, p=0.006), whereas amygdala, dmPFC, and CC were not. Reward-evoked firing did not predict subsequent HGC behavior. An interaction between mPFC and MTL firing to Controlled punishment predicted subsequent HGC behavior (beta=12.19, t=3.14, p=0.0018) in sessions with responsive neurons in both regions.
Discussion
The findings delineate complementary roles of mPFC and MTL during outcome evaluation in goal-conflict situations. mPFC neurons (dmPFC and cingulate) preferentially encode negative outcomes, particularly when outcomes are controllable, and respond earlier than MTL, consistent with roles in pain/loss processing and action planning/inhibition. MTL neurons, while less valence-selective, translate punishment-related outcome information into behavioral updating: their firing after Controlled punishment decreases the likelihood of subsequent approach under high conflict, with hippocampal neurons driving this effect. The temporal precedence of mPFC responses and the significant interaction between mPFC and MTL firing suggest that mPFC may tag negative value that influences MTL encoding, which then modulates future approach tendencies. This extends reinforcement learning and anxiety models by highlighting hippocampal involvement not only in online conflict processing but also in using outcome information to update future behavior. The specificity to Controlled outcomes underscores the role of agency estimation in outcome valuation and motivated behavior. These results align with evidence of hippocampus–mPFC network dynamics in anxiety and avoidance and suggest context-dependent bidirectional influences within this circuit.
Conclusion
This study provides intracranial neuronal evidence for differential, time-ordered contributions of mPFC and MTL to resolving human approach-avoidance conflicts. mPFC preferentially encodes controllable negative outcomes, while hippocampus-dominated MTL activity following punishment predicts reduced subsequent approach under high conflict. The interaction between mPFC and MTL further modulates future choice. These insights emphasize the hippocampus’s role in learning from outcomes to guide future behavior and the importance of agency in outcome valuation. Future work should: (1) use designs enabling balanced cue-related analyses and anticipation phases, (2) examine decision-phase neural dynamics with reduced movement confounds, (3) parse MTL substructures (e.g., ventral vs dorsal hippocampus; basolateral vs central amygdala), (4) extend to noninvasive modalities in healthy cohorts, and (5) leverage these process-specific findings for computational models and targeted interventions in psychiatric disorders (e.g., addiction, PTSD, depression).
Limitations
- Patient cohort: Data were from epilepsy patients; although behavior resembled healthy controls and results were robust after excluding SOZ neurons, generalizability may be limited.
- Anticipation and decision phases: Movement and overlapping events precluded analysis of neural responses prior to outcomes (anticipation) and complicated assessment of decision-phase dynamics.
- Substructure aggregation: Neurons from different MTL subregions (ventral/dorsal hippocampus; basolateral/central amygdala) were pooled, potentially masking distinct roles.
- Behavioral imbalance: High overall approach rates yielded few avoidance trials, preventing dedicated analysis of neural responses during avoidance.
- Movement confounds: Although movement-balancing analyses were performed, some controllability effects in MTL diminished after movement control, indicating residual confounding is possible.
- Task design constraints: Unbalanced cue presentations limited direct evaluation of cue-evoked activity independent of saliency; LGC ceiling effects limited subsequent-behavior analyses to HGC.