Introduction
Dopamine's crucial role in cognitive and motivational processes, including reinforcement learning and decision-making, is well established. Phasic dopamine neuron responses encode reward prediction errors, driving learning through the activation of D1 and D2 receptors in the striatum's direct and indirect pathways, respectively. Neuroimaging studies consistently show reward prediction error coding in the striatum, and animal studies demonstrate a causal link between dopamine signaling and reinforcement learning.

However, the human evidence for dopamine's causal role in reinforcement learning is mixed, with pharmacological manipulations producing heterogeneous effects. Some studies suggest L-dopa improves go learning by enhancing reward prediction error responses, while others find no such effect or even observe increased punishment-related responses. The effects of D2 receptor antagonists are similarly inconsistent, complicated by dose-dependent effects on presynaptic autoreceptors: at lower doses, these drugs can increase rather than decrease striatal dopamine release. Beyond learning (value updating), dopamine's contribution to performance (action selection) is also debated. One perspective posits that increased striatal dopamine boosts go pathway activation and reduces no-go pathway activation, facilitating action initiation; this is linked to dopamine's role in regulating response vigor and decision thresholds, supported by some human and animal studies.

The current study, initially designed to replicate a prior study showing a beneficial effect of L-dopa on learning from rewards, uses a larger sample (n = 31), a within-subjects design, and higher drug dosages (150 mg L-dopa, 2 mg haloperidol). Leveraging recent advances in reinforcement learning drift-diffusion models (RLDDMs), it tests for dopamine effects on both learning and action selection.
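In the computational framing assumed throughout, learning is driven by a reward prediction error that updates the value of the chosen option. As a minimal sketch (a Rescorla-Wagner-style update; the study's winning model additionally allows separate learning rates for positive and negative errors):

$$\delta_t = r_t - Q_t(c_t), \qquad Q_{t+1}(c_t) = Q_t(c_t) + \alpha\,\delta_t,$$

where $r_t$ is the obtained outcome, $Q_t(c_t)$ is the current value estimate of the chosen option $c_t$, and $\alpha$ is the learning rate.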
Literature Review
Existing literature on dopamine's role in reinforcement learning and action selection presents conflicting findings. Studies of L-dopa (a dopamine precursor) report varied effects on learning: some find improved learning from rewards and enhanced striatal coding of prediction errors, while others find no effect or even blunted prediction error responses. The effects of D2 receptor antagonists are similarly inconsistent, sometimes impairing reinforcement learning, sometimes showing no effect, and sometimes affecting only post-learning decision-making. Dose-dependent effects of D2 antagonists on presynaptic autoreceptors further complicate interpretation, as lower doses may actually increase striatal dopamine release. Regarding action selection, increased dopamine availability has been associated with greater striatal go pathway activation and reduced no-go pathway activation. This has been linked to dopamine's role in regulating response vigor and decision thresholds, although direct causal evidence in humans is lacking. The present study aims to clarify these inconsistent findings through a robust methodological approach and advanced computational modeling.
Methodology
Thirty-one healthy male volunteers participated in a double-blind, counterbalanced, within-subjects study involving three fMRI sessions under three drug conditions: placebo, 150 mg L-dopa, and 2 mg haloperidol. During scanning, participants performed a stationary reinforcement learning task, choosing between two fractal images with different reinforcement probabilities (80% vs. 20%). Response times (RTs) were recorded, and choices of the suboptimal option were coded as negative RTs. Model-agnostic analyses examined accuracy, total rewards, and median RTs using Bayesian repeated-measures ANOVAs.

To examine dopamine effects on both learning and action selection, three computational models were compared: a null drift-diffusion model (DDM) without learning, and two reinforcement learning drift-diffusion models (RLDDMs) with either a single learning rate (RLDDM1) or dual learning rates for positive and negative prediction errors (RLDDM2). Model comparison used the negative estimated log pointwise predictive density (-elpd; lower values indicate better predictive fit), and posterior predictive checks evaluated model fit by comparing observed and simulated data. Drug effects on RLDDM2 parameters (boundary separation, non-decision time, value coefficient, and positive and negative learning rates) were analyzed in a combined model with placebo as the baseline and drug effects modeled as additive changes.

fMRI analyses focused on a priori regions of interest (ROIs) derived from meta-analyses of value effects (ventral striatum, ventromedial prefrontal cortex, posterior cingulate cortex), examining model-derived values, prediction errors, and drug effects using general linear models (GLMs). Additional analyses explored meta-learning effects across sessions, examined alternative modeling schemes with collapsing bounds, and tested for associations between drug effects on decision thresholds and individual differences in behavioral measures. Working memory capacity (WMC) and body weight were included as covariates in some analyses.
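To make the model structure concrete, below is a minimal generative sketch of the dual-learning-rate RLDDM (RLDDM2) and the signed-RT coding described above. All parameter names and values are illustrative assumptions, not the study's fitted estimates; the simulation uses a simple Euler-Maruyama approximation of the diffusion process.

```python
# Minimal generative sketch of the dual-learning-rate RLDDM (RLDDM2).
# Parameter names and values are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(0)

def simulate_rlddm2(n_trials=100, a=1.5, t0=0.3, v_coef=2.0,
                    eta_pos=0.3, eta_neg=0.1, p_reward=(0.8, 0.2),
                    dt=0.001, noise_sd=1.0):
    """Simulate choices/RTs from a two-option RL drift-diffusion model.

    The trial-wise drift rate is v_coef * (Q[0] - Q[1]); positive and
    negative prediction errors update Q with separate learning rates.
    Suboptimal (option 1) choices are coded as negative RTs, matching
    the coding scheme used in the behavioral analyses.
    """
    q = np.zeros(2)                  # value estimates for the two fractals
    data = []
    for _ in range(n_trials):
        v = v_coef * (q[0] - q[1])   # value difference drives the drift
        x, t = a / 2.0, 0.0          # unbiased starting point
        while 0.0 < x < a:           # Euler-Maruyama diffusion to a bound
            x += v * dt + noise_sd * np.sqrt(dt) * rng.standard_normal()
            t += dt
        choice = 0 if x >= a else 1  # upper bound = optimal option
        r = float(rng.random() < p_reward[choice])
        delta = r - q[choice]        # reward prediction error
        eta = eta_pos if delta > 0 else eta_neg
        q[choice] += eta * delta
        rt = t + t0
        data.append(rt if choice == 0 else -rt)
    return np.array(data)

rts = simulate_rlddm2()
print(f"accuracy: {np.mean(rts > 0):.2f}, median |RT|: {np.median(np.abs(rts)):.2f}s")
```

Coding suboptimal choices as negative RTs lets a single boundary-crossing distribution represent both choice and RT, which is the feature the RLDDM likelihood exploits.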
Key Findings
Model-agnostic analyses revealed no significant drug effects on accuracy, total rewards, or median RTs, and replication attempts of the previous finding that L-dopa improves learning from rewards yielded no credible evidence of a difference. Model comparison favored RLDDM2 (dual learning rates) over the other models in all drug conditions, and posterior predictive checks confirmed that RLDDM2 accurately captured learning-related changes in accuracy and RTs. Analysis of drug effects on RLDDM2 parameters revealed consistent reductions in decision thresholds (the boundary separation parameter) under both L-dopa and haloperidol relative to placebo. This effect was robust across modeling schemes and consistent with the numerical pattern of lower accuracy and faster median RTs under both drugs. Individual differences in drug-induced decision threshold reductions were significantly associated with corresponding individual differences in RTs between conditions, indicating that the threshold reduction was not an artifact of hierarchical parameter shrinkage.

fMRI analyses of the a priori ROIs in reward-related circuits replicated the effects of model-derived values and prediction errors across drug conditions but found no significant drug effects on these neural responses after correction for multiple comparisons. Exploratory analyses found greater value effects under L-dopa and haloperidol in the left anterior insula and dorsal anterior cingulate cortex/pre-SMA, suggesting a potential role for this circuit in drug-induced decision threshold modulation.
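The posterior predictive check logic can be sketched as follows, reusing the hypothetical simulate_rlddm2 from the Methodology sketch: simulate data from posterior parameter draws, bin trials across the learning curve, and ask whether the observed binned statistics fall within the simulated predictive intervals. The posterior_draws below are placeholders, not fitted values.

```python
# Sketch of the posterior predictive check: compare observed, trial-binned
# accuracy against data simulated from (here: placeholder) posterior draws.
# Assumes simulate_rlddm2 from the earlier sketch is in scope.
import numpy as np

def binned_stats(signed_rts, n_bins=10):
    """Per-bin accuracy and median absolute RT across the learning curve."""
    bins = np.array_split(np.asarray(signed_rts), n_bins)
    acc = np.array([np.mean(b > 0) for b in bins])
    med_rt = np.array([np.median(np.abs(b)) for b in bins])
    return acc, med_rt

# Placeholder posterior draws; in practice these come from the fitted model.
posterior_draws = [dict(a=1.5 + 0.1 * z, eta_pos=0.3, eta_neg=0.1)
                   for z in np.random.default_rng(1).standard_normal(200)]

sim_acc = np.stack([binned_stats(simulate_rlddm2(**d))[0] for d in posterior_draws])
lo, hi = np.percentile(sim_acc, [2.5, 97.5], axis=0)  # 95% predictive interval
# A well-fitting model places the observed binned accuracies inside [lo, hi].
```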
Discussion
The failure to replicate the previously reported L-dopa-induced improvement in learning from rewards is likely attributable to differences in experimental design, particularly the omission of loss trials in the current study. The consistent reduction of decision thresholds under both L-dopa and haloperidol supports a computational account in which dopamine shapes action selection by modulating decision thresholds. That the effect was similar under both drugs, despite their different mechanisms of action, suggests it is linked to increased dopamine availability, consistent with the presynaptic, release-enhancing action of lower-dose D2 antagonists noted above. The finding also accords with previous work suggesting that dopamine boosts action-specific activation contrasts between striatal go and no-go pathways, impacting action selection mechanisms in the ACC/pre-SMA. The observed threshold reduction potentially bridges circuit-level accounts of action selection with dopamine's role in regulating response vigor: lowering the threshold can increase reward rate by adjusting the trade-off between speed and accuracy. The exploratory fMRI findings implicating the anterior insula and ACC/pre-SMA warrant further investigation.
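To illustrate the reward-rate argument numerically, the following sketch uses the standard closed-form accuracy and mean decision time of a symmetric DDM (unit diffusion, unbiased starting point); all parameter values are hypothetical.

```python
# Numerical sketch of the speed-accuracy / reward-rate trade-off: for a
# symmetric DDM with unit diffusion, drift v, boundary separation a, and
# non-decision time t0, accuracy and mean decision time have closed forms,
# so reward rate can be evaluated as a function of the threshold.
import numpy as np

def reward_rate(a, v=1.0, t0=0.3, iti=2.0):
    """Expected rewards per second for a symmetric DDM with unit diffusion."""
    p_correct = 1.0 / (1.0 + np.exp(-v * a))          # closed-form accuracy
    mean_dt = (a / (2.0 * v)) * np.tanh(v * a / 2.0)  # closed-form mean DT
    return p_correct / (mean_dt + t0 + iti)

thresholds = np.linspace(0.2, 4.0, 100)
best = thresholds[np.argmax(reward_rate(thresholds))]
print(f"reward-rate-maximizing boundary separation: {best:.2f}")
# If the baseline threshold sits above this optimum, a drug-induced reduction
# in boundary separation moves the agent toward a higher reward rate.
```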
Conclusion
This study provides evidence for dopamine's role in regulating decision thresholds during reinforcement learning, contrasting with inconsistent findings on dopamine's influence on learning itself. The robust reduction in decision thresholds under both L-dopa and haloperidol supports a computational model of dopamine's role in action selection, potentially bridging circuit-level accounts with dopamine's role in response vigor. Future research should address the study's limitations, such as the focus on male participants and the relatively small sample size, and investigate the potential role of the identified brain regions in dopamine's effects on decision-making.
Limitations
The study's limitations include the exclusive use of male participants, restricting the generalizability of the findings. The sample size (n = 31), while larger than in some previous studies, is still relatively small and may not adequately capture non-linear, baseline-dependent drug effects. The timing of the task relative to peak L-dopa plasma levels might have contributed to the absence of L-dopa effects on learning and neural prediction error signaling, although this is considered less likely given the robust L-dopa effect on boundary separation. Further research with larger, more diverse samples and closer examination of task-drug interactions is needed for a more comprehensive understanding.