Dopamine neurons encode trial-by-trial subjective reward value in an auction-like task

Psychology

D. F. Hill, R. W. Hickman, et al.

This study examines how dopamine neurons in monkeys encode subjective reward value on a trial-by-trial basis during a Becker-DeGroot-Marschak (BDM) auction-like task. Daniel F. Hill and colleagues report that dopamine responses to reward-predicting cues track the animals' bids, indicating a neuronal signal for instantaneous subjective value rather than an average over many choices.

Introduction
Reward value is subjective: it depends on individual experience and preference. The dopamine reward prediction error signal is known to reflect subjective value, but previous studies have mainly inferred that value from aggregate choices, which obscures potential trial-by-trial fluctuations. Such fluctuations are likely, given the stochastic nature of the brain processes involved in valuation. Common behavioral methods infer subjective reward value from a series of choices and therefore yield an average across many trials, yet the true subjective value of a reward probably varies from trial to trial, even for identical rewards. Neurophysiological studies of dopamine neurons and reward have likewise averaged reward value over multiple choices. Traditional trial-by-trial assessments of neuronal reward responses exist, but they have not fully explored how subjective reward value reflects intrinsic stochasticity across trials when the objective reward amount stays constant. To address this, the researchers employed the Becker-DeGroot-Marschak (BDM) mechanism, an auction-like task in which a participant's bid directly reflects the instantaneous subjective value of the reward. The BDM mechanism is incentive compatible: bidding above or below one's true value leads to suboptimal outcomes, which encourages truthful valuation. Human studies have already demonstrated the mechanism's effectiveness in assessing trial-by-trial fluctuations in subjective value. This study combined the BDM mechanism with single-cell neurophysiological recordings in monkeys to investigate the trial-by-trial neuronal coding of subjective reward value. Using highly experienced monkeys that had undergone several years of behavioral training minimizes the pitfalls associated with short-term BDM performance observed in humans.
The study focuses on dopamine neurons, whose reward prediction error signal is known to encode subjective reward value inferred from binary choices.
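The incentive-compatibility argument can be checked numerically. The sketch below is a minimal illustration, not the study's task code: it assumes the textbook BDM rule (the bidder wins when the bid is at least the computer's randomly drawn price and then pays that price), a uniform price distribution, and arbitrary integer value units. The names `bdm_payoff`, `expected_payoff`, `PRICES`, and `TRUE_VALUE` are hypothetical.

```python
# Illustrative BDM payoff model; the rule, price range, and units are assumptions.

def bdm_payoff(value, bid, price):
    """Payoff of one trial: win and pay the drawn price if bid >= price, else nothing."""
    return value - price if bid >= price else 0

def expected_payoff(value, bid, prices):
    """Average payoff over equally likely computer prices."""
    return sum(bdm_payoff(value, bid, p) for p in prices) / len(prices)

PRICES = range(0, 101)   # hypothetical computer prices, uniform on 0..100
TRUE_VALUE = 60          # hypothetical subjective value of the offered reward

truthful = expected_payoff(TRUE_VALUE, TRUE_VALUE, PRICES)
underbid = expected_payoff(TRUE_VALUE, 30, PRICES)
overbid = expected_payoff(TRUE_VALUE, 90, PRICES)

print(f"truthful={truthful:.2f} underbid={underbid:.2f} overbid={overbid:.2f}")
```

Under these assumptions, underbidding forfeits wins that would have been profitable, while overbidding produces wins at prices above the reward's value, so no bid earns more on average than the truthful one.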
Literature Review
Previous research has established that dopamine neurons encode subjective reward value. This is supported by findings showing changes in dopamine concentration related to satiety and hunger, and it has been quantified by choice indifference points, temporal discounting functions, and economic utility functions. However, these studies primarily averaged reward value across multiple trials, neglecting potential trial-to-trial variability in subjective reward value. Other studies using trial-by-trial assessments have shown that dopamine reward prediction error responses vary with behavioral measures such as movement reaction time, learning of reward-predicting stimuli, expected reward time, decision confidence, and licking responses to probabilistic rewards. None of them, however, directly addressed how subjective reward value might reflect trial-to-trial stochasticity when the objective reward amount is constant. The BDM mechanism used in this study offers a solution: participants directly state their personal value on each trial, which reveals the instantaneous fluctuations in subjective reward value. Human studies using BDM have successfully explored neural correlates of subjective value fluctuations for various rewards, including food and movie trailers.
Methodology
Two adult male rhesus monkeys, previously trained in various BDM tasks, participated. The monkeys bid for different volumes of juice reward in a BDM task against a computer opponent. The task involved a sequence of events: a trial start cue, presentation of one of three fractal images representing three distinct juice volumes, a bid space where monkeys adjusted their bids using a joystick, and finally, the computer's bid, followed by reward delivery based on the outcome of the auction. Extensive behavioral training ensured the monkeys understood the task and exhibited reliable bidding behavior closely approximating subjective reward values estimated by conventional methods. The monkeys' bids were analyzed using a lasso regression model to determine the factors influencing their bidding behavior, including reward magnitude, starting bid, total liquid consumed, previous computer bids, and previous bidding results (win/lose). Single-unit activity in the midbrain was recorded while the monkeys performed the BDM task. Putative dopamine neurons were identified based on their waveform characteristics and response properties. The analysis focused on the second, value-related component of the dopamine response to the reward-predicting cues, excluding the initial attentional component. Subjective value coding was assessed by analyzing the correlation between dopamine responses and the monkeys' bids, considering reward magnitude. To assess whether dopamine responses reflect subjective value independently of reward magnitude, the researchers compared responses for similar bids across different reward magnitudes. Support Vector Regression (SVR) was used to decode the monkeys' bids from the dopamine neuronal responses, evaluating the accuracy of decoding subjective value from neuronal activity.
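To illustrate how lasso regression selects predictors, the following sketch implements coordinate descent with soft thresholding in plain Python. It is a toy under stated assumptions rather than the authors' analysis: the two predictors, the data values, and the penalty `alpha` are invented, and the source does not say which software or penalty was used.

```python
# Toy lasso by coordinate descent; data, predictor names, and alpha are hypothetical.

def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; values within [-alpha, alpha] become exactly 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0

def lasso_coordinate_descent(columns, y, alpha, n_iter=100):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 for centred columns and y."""
    n = len(y)
    w = [0.0] * len(columns)
    for _ in range(n_iter):
        for j, xj in enumerate(columns):
            # Residual with predictor j's own contribution left out.
            partial = [
                y[i] - sum(w[k] * columns[k][i] for k in range(len(columns)) if k != j)
                for i in range(n)
            ]
            rho = sum(xj[i] * partial[i] for i in range(n)) / n
            z = sum(v * v for v in xj) / n
            w[j] = soft_threshold(rho, alpha) / z
    return w

# Centred toy predictors: one drives the bids strongly, the other only weakly.
strong_predictor = [-2.0, -1.0, 0.0, 1.0, 2.0]   # stands in for e.g. reward magnitude
weak_predictor = [2.0, -1.0, -2.0, -1.0, 2.0]    # hypothetical nuisance variable
bids = [2.0 * s + 0.1 * w for s, w in zip(strong_predictor, weak_predictor)]

weights = lasso_coordinate_descent([strong_predictor, weak_predictor], bids, alpha=0.5)
print(weights)
```

The L1 penalty shrinks the strong predictor's weight and sets the weak predictor's weight to exactly zero, which is the property that makes lasso useful for deciding which candidate factors to retain in a model of bidding.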
Key Findings
The monkeys' bids were consistently rank-ordered and correlated with juice volume, demonstrating reliable bidding behavior. The bids varied from trial to trial and from day to day, reflecting fluctuations in subjective reward value. Lasso regression identified significant predictors of bidding behavior, including reward magnitude, starting bid, total liquid consumed, previous computer bids, and previous results. Approximately half to two-thirds of dopamine neurons exhibited graded value responses to the reward cues, and a subset of these neurons showed significant correlations with the monkeys' bids. Dopamine responses increased with bids, independently of reward magnitude, even when the reward amount was constant. For similar bids across different reward magnitudes, dopamine responses were similar, further supporting the notion that they encoded subjective value rather than objective reward amount. Support Vector Regression (SVR) predicted the monkeys' bids from the dopamine responses with an accuracy of about 60% using 20 to 30 neurons. Decoding accuracy reached an asymptote with relatively few neurons, indicating that a small population of dopamine neurons carries the subjective value signal.
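One way to see what "independently of reward magnitude" means is to correlate responses with bids separately within each magnitude, so that the objective reward is constant inside every comparison. The sketch below does this with a hand-written Pearson correlation. All numbers are invented for illustration and are not data from the study, and the helper names are hypothetical.

```python
# Within-magnitude correlation between bids and responses; all values are made up.
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical trials: (juice magnitude label, bid, cue response in spikes/s).
trials = [
    ("small", 0.20, 4.1), ("small", 0.30, 4.9), ("small", 0.25, 4.6), ("small", 0.35, 5.4),
    ("medium", 0.45, 6.2), ("medium", 0.55, 7.1), ("medium", 0.50, 6.4), ("medium", 0.60, 7.6),
    ("large", 0.70, 8.3), ("large", 0.85, 9.8), ("large", 0.75, 8.9), ("large", 0.80, 9.1),
]

# Group bids and responses by magnitude so each correlation holds the reward fixed.
by_magnitude = {}
for magnitude, bid, response in trials:
    bids, responses = by_magnitude.setdefault(magnitude, ([], []))
    bids.append(bid)
    responses.append(response)

within_r = {m: pearson_r(b, r) for m, (b, r) in by_magnitude.items()}
for magnitude, r in within_r.items():
    print(f"{magnitude}: r = {r:.2f}")
```

A positive correlation inside each group cannot be explained by the juice volume, because the volume does not vary within a group. Only the bid, taken here as the trial's subjective value, changes.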
Discussion
The findings demonstrate that phasic dopamine signals encode subjective reward value on a trial-by-trial basis, rather than simply reflecting average subjective or objective reward values. The consistent and systematic bidding behavior of the monkeys, coupled with the monotonic increase in dopamine responses with reward amounts and bids, supports the conclusion that the dopamine responses reflect subjective valuation. The accuracy of bid prediction using SVR further validates the neuronal code for instantaneous subjective value. The study addressed concerns about the limitations of human BDM performance by training the monkeys extensively, which ensured reliable and meaningful bidding behavior. The results extend previous work showing that dopamine neurons code reward value as mathematically defined utility and temporal discounting functions: they demonstrate that dopamine also codes subjective value in a more general sense, namely the satisfaction and benefit derived from rewards. The small effect of previous bidding outcomes might reflect short-term adaptation and warrants further investigation. Differences between the monkeys in the effects of satiety and win/lose streaks highlight the subjective nature of BDM valuation. The lower accuracy of SVR compared to binary classifiers might be due to its continuous nature and the binning of bids. Nevertheless, the successful decoding of bids from dopamine responses supports the validity of the dopamine signal as a code for subjective reward value.
Conclusion
This study provides strong evidence that dopamine neurons encode subjective reward value on a trial-by-trial basis. The BDM auction provided a robust behavioral measure of subjective value, and the SVR decoding results confirm the precise neuronal code for instantaneous subjective reward value. Future research could explore whether dopamine neurons encode the value of affective or socially relevant stimuli in a similar manner, comparing them with physical nutrient rewards.
Limitations
The study used only two monkeys, which limits the generalizability of the findings. The use of only three reward magnitudes might have reduced the bidding nonlinearities that reflect risk attitudes. Although extensive training minimized task misperceptions, residual misperceptions or effects of previous bidding outcomes might still have influenced bidding. The lower decoding accuracy of the SVR compared to binary classifiers might be due to the continuous nature of the SVR and the binning of the bids.