Introduction
The exploration-exploitation tradeoff, balancing the exploration of new options against the exploitation of known ones, is crucial for adaptive decision-making. Exploitation offers predictable immediate rewards, while exploration risks lower rewards but potentially higher long-term payoffs. Imbalances in this tradeoff are linked to psychiatric disorders like addiction and gambling disorder. Neural evidence suggests distinct brain systems for exploration and exploitation, with the ventromedial prefrontal cortex (vmPFC) implicated in exploitation and a frontopolar cortex-lateral PFC pathway linked to exploration. Dopamine's role in reward prediction and value signaling has been established, and its involvement in exploration, particularly 'directed exploration' (seeking information based on uncertainty), has also been suggested. Noradrenaline, on the other hand, has been linked to exploratory behavior, particularly 'random exploration' (value-independent exploration), potentially acting as a 'reset button' interrupting information processing to promote exploration of new options. The current study aimed to clarify the distinct contributions of dopamine and noradrenaline to the exploration-exploitation tradeoff in human decision-making by systematically manipulating reward features in a virtual patch-foraging task while pharmacologically modulating dopamine and noradrenaline levels.
Literature Review
Existing research suggests distinct neural circuits and neurotransmitter systems underpin exploration and exploitation in decision-making. The vmPFC is associated with exploitation, while exploration involves pathways from the frontopolar cortex to the lateral PFC. Dopamine's role in reward prediction and value signaling is well-established, but its contribution to exploration, particularly directed exploration driven by uncertainty, is debated. Studies have linked genes involved in striatal dopamine signaling to exploitation, yet other evidence points to a role for prefrontal dopamine function in promoting exploratory decisions based on uncertainty about potential better outcomes. This 'directed exploration' may involve a novelty bonus added to unknown alternatives. Conversely, noradrenaline has been associated with exploration, particularly 'random exploration'. Studies show that high noradrenaline levels increase strategy shifts, while low levels promote perseverative behavior. Unlike dopamine, noradrenaline doesn't seem to bias exploration towards uncertainty but rather increases stochasticity, potentially by acting as a 'reset button' that interrupts ongoing processing and inhibits the use of prior knowledge.
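The distinction between directed and random exploration drawn above can be illustrated with a small sketch. It is not taken from the study: the function names, the `bonus_weight` parameter, and the example values are all hypothetical, and `beta` is assumed to act as an inverse temperature (higher values make choices more value-driven). Directed exploration is modeled as a bonus proportional to an option's uncertainty, and random exploration as a flattening of the choice probabilities.

```python
import math

def softmax(values, beta):
    """Choice probabilities; beta is assumed to be an inverse temperature."""
    scaled = [beta * v for v in values]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def choice_probs(means, uncertainties, bonus_weight, beta):
    """Add an uncertainty bonus (directed exploration), then apply softmax."""
    augmented = [m + bonus_weight * u for m, u in zip(means, uncertainties)]
    return softmax(augmented, beta)

# Hypothetical example: option 0 is well known and slightly better,
# option 1 is less valuable on average but far more uncertain.
means = [1.0, 0.8]
uncertainties = [0.1, 1.0]

baseline = choice_probs(means, uncertainties, bonus_weight=0.0, beta=5.0)
directed = choice_probs(means, uncertainties, bonus_weight=0.5, beta=5.0)
noisy = choice_probs(means, uncertainties, bonus_weight=0.0, beta=0.5)
```

In this sketch the uncertainty bonus shifts preference toward the uncertain option, whereas lowering `beta` only moves both probabilities toward 0.5 without favoring the uncertain option.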
Methodology
Sixty-nine healthy participants (33 women, 36 men; mean age = 24.98 years) were randomly assigned to one of three groups: placebo, amisulpride (400 mg, D2/D3 receptor antagonist), or propranolol (40 mg, β-adrenergic receptor antagonist). Drugs were administered orally 90-120 minutes before a virtual patch-foraging task. In the task, participants harvested apples from virtual trees, with each successive harvest from the same tree yielding fewer apples. On each trial they chose either to harvest the current tree again or to switch to a new one; switching incurred a travel-time cost (6 s or 12 s). Reward values, depletion rates, and travel times were systematically varied to investigate their impact on exploration and exploitation. Physiological measures (blood pressure, heart rate, pupil diameter, blink rate) were taken to verify drug action. Choice behavior was analyzed using mixed-effects logistic regression with previous reward, travel time, depletion rate, number of previous stays, and group as predictors, including their interactions. The marginal value theorem (MVT) was used to analyze optimal foraging. A computational model based on the MVT was fitted to estimate learning rates, a temperature parameter (β, reflecting the value dependence of choices), and choice bias.
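A minimal sketch of an MVT-style learning model with a learning rate, a temperature-like parameter, and a choice bias is given below. It follows a common formulation from the foraging literature and is not confirmed to be the model fitted in this study. All function names, the harvest time, the starting reward, and the parameter values are hypothetical, and `beta` is again assumed to scale how strongly value differences drive choices.

```python
import math
import random

def p_stay(expected_reward, reward_rate, harvest_time, beta, bias):
    """Probability of harvesting again rather than leaving.

    MVT logic: stay while the expected next harvest exceeds the
    opportunity cost of the time it takes (rate * harvest_time).
    """
    opportunity_cost = reward_rate * harvest_time
    drive = bias + beta * (expected_reward - opportunity_cost)
    return 1.0 / (1.0 + math.exp(-drive))

def update_rate(reward_rate, reward, elapsed, alpha):
    """Delta-rule estimate of reward per second.

    Each elapsed second counts as one update step, so long events
    (e.g. travel) pull the estimate further than short ones.
    """
    decay = (1.0 - alpha) ** elapsed
    return decay * reward_rate + (1.0 - decay) * (reward / elapsed)

def simulate(alpha, beta, bias, travel_time, depletion=0.9,
             n_choices=200, seed=0):
    """Simulate one forager; returns (number of switches, total reward)."""
    rng = random.Random(seed)
    harvest_time = 3.0    # hypothetical seconds per harvest
    start_reward = 10.0   # hypothetical apples on a fresh tree
    rate = 1.0            # initial guess of apples per second
    next_reward = start_reward
    switches, total = 0, 0.0
    for _ in range(n_choices):
        if rng.random() < p_stay(next_reward, rate, harvest_time, beta, bias):
            total += next_reward
            rate = update_rate(rate, next_reward, harvest_time, alpha)
            next_reward *= depletion       # the tree depletes
        else:
            switches += 1
            rate = update_rate(rate, 0.0, travel_time, alpha)
            next_reward = start_reward     # arrive at a fresh tree
    return switches, total

switches_short, total_short = simulate(0.1, 1.0, 0.0, travel_time=6.0)
switches_long, total_long = simulate(0.1, 1.0, 0.0, travel_time=12.0)
```

Under these assumptions, a lower `alpha` makes the reward-rate estimate change more slowly, and a longer `travel_time` lowers the estimate more after each switch, which in turn favors staying.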
Key Findings
The manipulation check confirmed drug action through changes in physiological measures. Heart rate and blood pressure decreased significantly more in the propranolol group than in the other groups. Blink rate decreased in the propranolol group, while pupil diameter decreased most in the amisulpride group. Logistic regression showed that previous reward and travel time significantly influenced choices. The amisulpride group showed decreased switching after high rewards and long travel times, indicating increased sensitivity to these choice features. Surprisingly, the amisulpride group also showed a higher probability of switching when the depletion rate was higher. The propranolol group showed no significant effect of these choice features. In the first half of the task, the amisulpride group showed enhanced switching, suggesting initial directed exploration that then informed subsequent choices. The propranolol group showed decreased switching in the second half, indicating a shift toward exploitation. Total rewards did not differ significantly between groups, although the amisulpride group tended toward higher rewards. MVT analysis showed no group differences in exit thresholds. Computational modeling revealed a significantly lower learning rate in the amisulpride group than in the propranolol group and a trend toward a lower learning rate than in the placebo group; the temperature parameter and choice bias did not differ significantly across groups.
Discussion
The findings support functionally distinct roles for dopamine and noradrenaline in the exploration-exploitation tradeoff. Amisulpride's effect suggests that dopamine enhances sensitivity to decision-relevant information, supporting directed exploration. The initial increase in switching in the amisulpride group, followed by a reduction, suggests that this directed exploration informs later choices. The lower learning rate in the amisulpride group may reflect better integration of information over longer time spans. Propranolol's effect suggests that noradrenaline influences decision noise and the timing of switching away from the current option to explore new options at random. The heterogeneous findings on noradrenaline's effects might be due to differences between tonic and phasic noradrenaline activity, or to a potential inhibitory mechanism of β-adrenergic receptors affecting noise levels. The overall effect of propranolol on the exploration-exploitation tradeoff was less pronounced than that of amisulpride, possibly because noradrenaline exerts higher-order control, such as urgency signals, rather than directly driving specific decision components.
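The link between a lower learning rate and integration over longer time spans can be made concrete with a short sketch. It assumes a simple delta-rule learner, which may differ from the study's model, and the function names are hypothetical. In such a learner, an observation made k updates ago carries weight α(1 − α)^k in the current estimate, so a smaller α spreads the weight over more past observations.

```python
import math

def memory_weights(alpha, n):
    """Weight of the observation k updates ago, for k = 0 .. n-1."""
    return [alpha * (1.0 - alpha) ** k for k in range(n)]

def half_life(alpha):
    """Number of updates after which an observation's weight has halved."""
    return math.log(0.5) / math.log(1.0 - alpha)

# Hypothetical learning rates for comparison
slow = half_life(0.1)   # roughly 6.6 updates
fast = half_life(0.5)   # exactly 1 update
```

With α = 0.1 an observation loses half its influence only after about 6.6 updates, compared with a single update for α = 0.5.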
Conclusion
This study demonstrates functionally dissociable roles for dopamine and noradrenaline in human exploration-exploitation. Dopamine enhances sensitivity to decision-relevant information, promoting directed exploration, while noradrenaline influences decision noise and the timing of switching to randomly explore new options. Future research should investigate these effects further, considering tonic vs. phasic noradrenaline activity, controlling for confounding factors such as fatigue, and employing within-subject designs.
Limitations
The study used a between-subjects design, potentially confounding drug effects with individual differences. Confounding factors like fatigue or boredom were not directly measured. Baseline task performance was not assessed, and the limited range of reward values in the task may have restricted interpretation of the temperature parameter. Future studies should address these limitations.