Biology

Striatal serotonin release signals reward value

M. G. Spring and K. M. Nautiyal

Using GRAB-5-HT biosensors in mice, this study reveals that serotonin release in the dorsal medial striatum tracks reward anticipation, value, approach, and consumption, offering millisecond-scale insight into how serotonin shapes reward-related behavior. Research conducted by Mitchell G. Spring and Katherine M. Nautiyal.

Introduction

The study investigates how serotonin release in the dorsomedial striatum (DMS) relates to reward processing, including anticipation, consumption, and value encoding. While dopamine has been widely implicated in motivation and goal-directed behavior, substantial evidence also implicates serotonin in value-driven action selection and reward-related behaviors. Prior manipulations suggest serotonin promotes goal-directed behavior over habits and may contribute to encoding reward value, but findings are heterogeneous: serotonin depletion reduces neural responsivity to rewards, yet increased synaptic serotonin does not necessarily alter hedonic liking or incentive wanting. Multiple serotonin receptor subtypes (1A, 1B, 2A, 2C, 3) are implicated in reward-related behavior and valuation. However, monitoring somatic activity of dorsal raphe nucleus (DRN) neurons is insufficient to understand functional serotonin signaling due to heterogeneity in co-transmission, projection targets, presynaptic modulation, and volume transmission. The striatum, including the DMS, receives strong serotonin input and expresses many serotonin receptors, with in vitro evidence that serotonin modulates DMS circuitry (effects on glutamatergic inputs, interneurons, MSN collateral inhibition, and dopamine release). Historically, in vivo characterization of serotonin release has been limited by microdialysis' low temporal resolution. With the advent of GRAB-5-HT biosensors, the authors aim to measure trial-resolved serotonin release in the DMS during reward consumption across varying reward values, internal states (thirst, satiety-induced devaluation), and Pavlovian conditioning, to clarify the temporal characteristics and functional role of serotonin in reward processing.

Literature Review

Prior work shows serotonin's broad role in reward-related behaviors and decision making. Brain-wide manipulations indicate serotonin biases toward goal-directed over habitual responding in rodents and humans (Ohmura et al., Worbe et al.). Serotonin depletion reduces reward-related neural responses (Seymour et al.), but increasing serotonin does not reliably change hedonic liking or incentive wanting (Caras; Treit & Berridge; Nonkes). At the receptor level, 1A, 1B, 2A, 2C, and 3 receptors influence reward-related behavior and valuation (Desrochers et al.; Nautiyal & Hen; Hayes & Greenshaw). DRN studies suggest diverse reward-related activity: increases to unexpected rewards and ramping during expectation (Cohen; Li; Zhong), prediction-error-like signals (Matias), and responses to innate/conditioned rewards and prospective value (Bromberg-Martin; Nakamura; Zhou). DRN serotonin neurons are heterogeneous by projection target and co-release glutamate (Ren; Okaty; Wang), and serotonin can signal via volume transmission (Agnati), making somatic activity an incomplete proxy for serotonin function. In striatum, serotonin modulates DMS circuitry: induces LTD at corticostriatal synapses (Mathur), excites cholinergic and fast-spiking interneurons (Virk; Blomeley & Bracci; Muñoz-Manchado), inhibits MSN collateral inhibition (Burke & Alvarez; Pommer), and regulates dopamine release (Nair). Dopamine studies in striatum show phasic cue-related signaling distinct from primary reward responses (Brown), with internal state strongly modulating dopamine (Cone; Hsu). The availability of GRAB-5-HT enables measuring serotonin release with high temporal resolution (Wan; Deng) to assess its role in reward anticipation, value, and consumption directly in target circuitry (DMS).

Methodology

Animals: C57BL/6J mice bred at Dartmouth College, weaned at P21, housed 2–4 per cage until fiber optic implantation, maintained on 12:12 light-dark cycle with ad libitum food and water until testing at 15–21 weeks. During training/testing, food provided daily to maintain 80–90% free-feeding weight, water ad libitum except 24 h withholding prior to thirst-state testing. Cohorts: Gustometer experiments N=11 (5 male, 6 female); Pavlovian approach N=16 naive (10 male, 6 female); internal state experiments N=15 (9 male, 6 female), including N=7 from Pavlovian and N=8 additional (4 male, 4 female) used in satiety-induced devaluation along with an additional N=8 (5 male, 3 female). Post-implantation mice singly housed with enrichment.

Behavioral paradigms:

Gustometer (Davis Rig): Training with access to evaporated milk; sessions progressed to intermittent access (shutter-gated). Standard inter-trial and consumption windows defined. Testing included 2-bottle (water vs 100% milk) and 6-bottle (0–100% milk in 20% steps) paradigms with random presentation. Primary measure: number of licks per presentation in the 6-bottle task. Composite preference score for milk vs water computed from median licks, inter-lick interval (ILI), and latency metrics; mice with inadequate thirst-induced preference change initially excluded from signal analysis for internal-state comparisons.
Satiety-induced devaluation: Baseline consumption and GRAB-5-HT recorded for banana/chocolate-flavored 50% milk; next day, unrestricted 1 h access to one flavor (counterbalanced), then standard 2-bottle test; preference metric computed with respect to pre-fed (devalued) flavor.
Pavlovian conditioning (Bussey-Saksida touchscreen chambers): CS+ (8 s tone; white noise or pure 5 kHz, counterbalanced) predicted milk delivery (US); CS- unpaired tone interspersed; 15 CS+ and 15 CS- per session, one session/day for 10 days. Movement tracked via IR beams; approach quantified by time near reward receptacle (IR beam interruptions), binarized in 100 ms bins and normalized to 8 s pre-CS baseline. Approach magnitude averaged per mouse per session; used as covariate in ANCOVA of GRAB-5-HT.
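
The approach measure described for the Pavlovian task can be sketched in a few lines. This is a minimal illustration, not the authors' code. It assumes beam interruptions are available as timestamps in seconds, and that "normalized to baseline" means subtracting the fraction of occupied 100 ms bins in the 8 s pre-CS window from the fraction during the 8 s cue. The source does not state the exact normalization, and the function names are hypothetical.

```python
def approach_fraction(beam_breaks, start, stop, bin_s=0.1):
    """Fraction of bins in [start, stop) containing at least one beam break."""
    n_bins = round((stop - start) / bin_s)
    occupied = [False] * n_bins
    for t in beam_breaks:
        if start <= t < stop:
            # Clamp guards against floating-point edge cases at the last bin.
            idx = min(int((t - start) / bin_s), n_bins - 1)
            occupied[idx] = True
    return sum(occupied) / n_bins


def normalized_approach(beam_breaks, cs_onset, window_s=8.0):
    """Cue-period approach minus pre-CS baseline (assumed normalization)."""
    baseline = approach_fraction(beam_breaks, cs_onset - window_s, cs_onset)
    cue = approach_fraction(beam_breaks, cs_onset, cs_onset + window_s)
    return cue - baseline


# Made-up beam-break times (s) around a CS that starts at t = 10 s.
example_breaks = [3.05, 10.05, 10.15, 10.55, 12.05]
print(normalized_approach(example_breaks, cs_onset=10.0))
```

A positive value means the mouse spent more time at the receptacle during the cue than just before it. Averaging this value over trials gives one approach magnitude per mouse per session.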

GRAB-5-HT photometry:

Surgery: AAV9-hSyn-GRAB3.6 (WZBiosciences) injected into DMS (AP +0.8 mm; ML +1.6 mm; DV −3.0 mm; 0.5 µl over 10 min; titer 1.5×10^13 molecules/ml). Optic fiber (400 µm core, NA 0.48, flat tip) implanted at DV −2.8 mm at same AP/ML; affixed with dental cement and skull screws. Ketoprofen (100 mg/kg) administered on surgery day and for 4 days post-op.
Placement: Postmortem PFA fixation, cryosectioning at 35 µm, DAPI labeling; fiber placements localized (Fig 1A–B). One mouse in devaluation study lacked placement confirmation and was excluded from analysis.
Data collection: Recordings began 4 weeks post-surgery. Dual-wavelength excitation (465 nm to excite GRAB-5-HT; 405 nm as the isosbestic control) via Doric system; fluorescence detected by photoreceiver; sampling at 1017.2 Hz; lock-in demodulation; TTLs from behavioral apparatus recorded concurrently.

Signal preprocessing: 5 Hz Butterworth low-pass filter; downsample to 10 Hz; two-phase exponential decay detrending (emission and isosbestic separately) to correct photobleaching/autofluorescence; motion correction via regression of detrended isosbestic onto GRAB-5-HT; normalization to robust-median Z scores using session baseline for self-paced gustometer data and 8 s pre-trial baseline for Pavlovian trials.
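
Two of the preprocessing steps above, isosbestic regression and robust z-scoring, can be sketched as follows. This is an illustration rather than the authors' pipeline. It assumes the traces have already been filtered, downsampled, and detrended, fits the isosbestic channel to the GRAB-5-HT channel by ordinary least squares, and divides by the unscaled median absolute deviation (MAD). Whether a scaled MAD was used is not stated in the source, and all function names are hypothetical.

```python
from statistics import mean, median


def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def motion_correct(signal, isosbestic):
    """Subtract the part of the signal predicted by the isosbestic channel."""
    slope, intercept = fit_line(isosbestic, signal)
    return [s - (slope * i + intercept) for s, i in zip(signal, isosbestic)]


def robust_z(trace, baseline):
    """(value - baseline median) / baseline MAD; assumes a non-zero MAD."""
    med = median(baseline)
    mad = median(abs(b - med) for b in baseline)
    return [(v - med) / mad for v in trace]


# Made-up traces: the signal shares a drift with the isosbestic channel
# and carries one extra bump at index 3 that the regression should leave in.
iso = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
sig = [1.0, 3.0, 5.0, 9.0, 9.0, 11.0]
corrected = motion_correct(sig, iso)
print(robust_z(corrected, baseline=corrected))
```

For self-paced gustometer sessions the `baseline` argument would be the whole session. For Pavlovian trials it would be the 8 s before each trial, matching the two normalization schemes described above.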

Analysis:

Alignment: Gustometer TTL at first lick; shutter opening time inferred by subtracting latency to lick from TTL timestamp; one animal excluded for shutter-aligned analyses due to TTL mismatch. Pavlovian: TTLs at session start, CS onset, reward delivery, nosepokes. Average event-aligned signals computed per mouse; AUC via trapezoidal summation.
Peak/trough detection (shutter-aligned): Fifth-order smoothing; z-normalized first derivative thresholding to identify pre-consumption trough and peri-consumption peak; dynamic thresholding rules applied starting 1 s prior to lick.
Transient analysis: Events exceeding 0.5 MAD identified; start defined by derivative >1 MAD during rise; events within 0.5 s linked; events <0.5 s filtered; magnitude = mean signal during transient; random offsets (±20 min, 2000 repetitions) to assess chance overlap with licking.
Statistics: ANOVAs for omnibus; Holm-corrected post hocs; t-tests for simple comparisons; Wilcoxon rank-sum for non-normal transient metrics; mediation via structural equation modeling with bootstrap (2000 iterations) for 95% CIs; Pavlovian approach analyzed via mixed ANOVA (Epoch × CS-type × Training); GRAB-5-HT via ANCOVA with approach covariate. α=0.05; analyses in R.
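
The event alignment, AUC, and transient steps listed above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation. The threshold is taken as the trace median plus 0.5 MAD, the derivative-based rule for locating transient onset is omitted, and events are linked when the gap between them is at most 0.5 s. Function names and the synthetic trace are hypothetical.

```python
from statistics import mean, median


def event_window(trace, fs, event_s, pre_s, post_s):
    """Samples from event_s - pre_s to event_s + post_s, inclusive."""
    start = round((event_s - pre_s) * fs)
    stop = round((event_s + post_s) * fs)
    if start < 0 or stop >= len(trace):
        raise ValueError("window extends beyond the recording")
    return trace[start:stop + 1]


def trapezoid_auc(samples, fs):
    """Area under the curve by trapezoidal summation (signal units x s)."""
    dt = 1.0 / fs
    return sum((a + b) / 2.0 * dt for a, b in zip(samples, samples[1:]))


def detect_transients(trace, fs, k_mad=0.5, link_s=0.5, min_s=0.5):
    """Threshold crossings, linked across short gaps, short events dropped."""
    med = median(trace)
    mad = median(abs(v - med) for v in trace)
    above = [v > med + k_mad * mad for v in trace]

    # Collect contiguous supra-threshold runs as [start, stop) sample indices.
    runs, start = [], None
    for i, flag in enumerate(above):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append([start, i])
            start = None
    if start is not None:
        runs.append([start, len(trace)])

    # Link runs separated by a gap of at most link_s seconds.
    merged = []
    for run in runs:
        if merged and (run[0] - merged[-1][1]) / fs <= link_s:
            merged[-1][1] = run[1]
        else:
            merged.append(run)

    # Keep events lasting at least min_s; magnitude is the mean signal.
    events = []
    for s, e in merged:
        duration = (e - s) / fs
        if duration >= min_s:
            events.append({"start_s": s / fs, "duration_s": duration,
                           "magnitude": mean(trace[s:e])})
    return events


# Synthetic 10 Hz trace: a ramp, used only to exercise the functions.
fs = 10
ramp = [i / fs for i in range(101)]
lick_window = event_window(ramp, fs, event_s=5.0, pre_s=0.0, post_s=5.0)
print(trapezoid_auc(lick_window, fs), len(detect_transients(ramp, fs)))
```

In practice `event_s` would come from the recorded TTL timestamps (first lick, CS onset, reward delivery), and per-mouse averages would be taken over the windows returned for each trial.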

Key Findings

DMS serotonin release precedes and tracks reward consumption: GRAB-5-HT signal aligned to first lick revealed ramping beginning ~2 s before consumption (median rise-time from pre-lick trough to lick-associated peak 1.9 s), with levels elevated during the 5 s licking window [t(10)=4.29, p=0.0016]. An audible shutter opening elicited a small increase in serotonin [t(10)=2.586, p=0.029], stronger when lick latencies were long; significant period × trial latency interaction [F(1,9)=17.367, p=0.0024]. Latency to initiate licking was strongly correlated with latency to peak GRAB-5-HT (R²=0.81, p<0.0001), whereas signal rise-time was not correlated with latency to lick (R²=0.0015, p=0.421), indicating anticipatory ramp preceding consumption.
Sex differences: Males and females differed in lick rates [F(1,9)=12.64, p=0.0006], with males' rates declining sharply and females maintaining higher rates; no significant sex difference in GRAB-5-HT during the 5 s licking period [F(1,9)=1.90, p=0.201].
Reward value encoding: Increasing milk concentration increased licking [F(5,45)=10.55, p=0.001] and significantly increased GRAB-5-HT release [F(5,45)=18.06, p<0.001], with 60%, 80%, and 100% higher than water (0%) [Ψ(9)=3.87, Pholm=0.0038; Ψ(9)=5.32, Pholm=0.0005; Ψ(9)=5.45, Pholm=0.0004]. Latency to consume predicted latency to signal peak (BConsume=0.865, p<0.001); no significant effect of concentration on peak latency (BConcentration=−1.19, p=0.061; interaction p=0.886). Mediation analysis showed reward concentration affected both lick rate (95% CI 0.341–0.443) and GRAB-5-HT AUC (95% CI 0.066–0.129), but lick rate did not directly affect serotonin (95% CI −0.013–0.07) and did not mediate the concentration–serotonin relationship (95% CI −0.005–0.028). Water (0%) consumption did not increase serotonin versus baseline [t(9)=0.592, p=0.569].
Transient analysis: GRAB-5-HT transients were longer and larger during licking than outside consumption [duration: W=181430, p<0.001; magnitude: W=177645, p<0.001]; overlap of transients with licking exceeded chance [χ²(1)=247.46, p<0.0001]. In the 6-bottle task, transients during milk consumption were longer and of greater magnitude across concentrations (Kruskal–Wallis: HMagnitude(6, N=2073)=174.31, p<0.001; HDuration(6, N=2073)=122.44, p<0.001). Water consumption increased transient duration versus non-lick periods (W=640, Pholm=0.027) but not magnitude (W=981, Pholm=0.27).
Internal state: Under fluid restriction, mice increased water consumption; across all mice, water consumption correlated with water-elicited GRAB-5-HT AUC (R²=0.345, p=0.021). In satiety-induced devaluation, most mice shifted preference toward the non-devalued flavor; GRAB-5-HT was reduced during consumption of the devalued flavor versus valued [t(15)=2.66, p=0.018], with a positive correlation between decreased preference and decreased 5-HT (R²=0.411, p=0.0075).
Pavlovian conditioning: Approach to the reward port increased over training (main effect of training: F(1,15)=31.21, p<0.001), with greater approach during CS+ than CS− (F(1,15)=5.57, p=0.032) and a CS × training interaction (F(1,15)=5.29, p=0.036). Controlling for approach behavior, GRAB-5-HT was higher in CS+ than CS− trials (F(1,83)=14.95, p=0.0002), with a cue × epoch interaction (F(1,83)=5.10, p=0.026): greater 5-HT during the reward period than cue period for CS+ (F(1,38)=6.081, p=0.018), not for CS− (F(1,38)=0.06, p=0.802). Cue-period 5-HT trended higher for CS+ vs CS− (F(1,38)=2.80, p=0.10). Cue-period serotonin correlated with approach during the cue (R²=0.112, p=0.01), indicating encoding of reward anticipation.
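
The logic behind the mediation result reported in this section (concentration affects serotonin, but not by way of lick rate) can be illustrated with a small sketch. It swaps in two ordinary least-squares regressions and a percentile bootstrap in place of the structural equation model the authors used in R, so it shows the idea rather than reproducing their analysis. The data are synthetic, and every name below is hypothetical.

```python
import random
from statistics import mean


def slope(x, y):
    """OLS slope of y on x (path a: predictor -> mediator)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx


def partial_slopes(x, m, y):
    """OLS for y ~ x + m; returns (direct effect of x, effect of m)."""
    mx, mm, my = mean(x), mean(m), mean(y)
    dx = [a - mx for a in x]
    dm = [a - mm for a in m]
    dy = [a - my for a in y]
    sxx = sum(a * a for a in dx)
    smm = sum(a * a for a in dm)
    sxm = sum(a * b for a, b in zip(dx, dm))
    sxy = sum(a * b for a, b in zip(dx, dy))
    smy = sum(a * b for a, b in zip(dm, dy))
    det = sxx * smm - sxm * sxm  # zero only if x and m are collinear
    return (smm * sxy - sxm * smy) / det, (sxx * smy - sxm * sxy) / det


def bootstrap_indirect(x, m, y, n_boot=2000, seed=0):
    """Percentile 95% CI for the indirect effect a * b."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ms = [m[i] for i in idx]
        ys = [y[i] for i in idx]
        a = slope(xs, ms)
        _, b = partial_slopes(xs, ms, ys)
        estimates.append(a * b)
    estimates.sort()
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]


# Synthetic data: "concentration" drives both "lick rate" and "serotonin",
# but serotonin does not depend on lick rate once concentration is known.
conc = [float(i) for i in range(30)]
licks = [c + float((7 * i) % 5 - 2) for i, c in enumerate(conc)]
serotonin = [2.0 * c for c in conc]
print(bootstrap_indirect(conc, licks, serotonin, n_boot=500))
```

When the interval for the indirect effect spans zero, as it does for the reported 95% CI of −0.005 to 0.028, the mediator carries no detectable share of the effect. In the study, this is the basis for concluding that value-related serotonin release is not simply a by-product of licking.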

Discussion

The findings demonstrate that serotonin release in the DMS is temporally aligned with reward processing phases: it ramps in anticipation (~2 s prior), peaks around consumption onset, and remains elevated during consumption. Serotonin release is graded by external reward value independent of lick rate, and transient analysis shows preferential and quantitatively larger/longer events during reward consumption compared to non-consummatory periods or water. Internal state manipulations indicate that DMS serotonin tracks subjective value: fluid restriction increased water-related 5-HT in proportion to consumption, and satiety-induced devaluation reduced 5-HT for the devalued flavor commensurate with decreased preference. In Pavlovian conditioning, DMS serotonin increased during CS+ relative to CS− even when controlling for approach behavior, with stronger signals during reward delivery and a cue-period correlation with anticipatory approach, suggesting serotonin encodes expected outcome value and contributes to reward approach. Comparisons with dopamine suggest complementary roles: dopamine exhibits faster, phasic cue-onset signaling and is tightly modulated by homeostatic states, whereas serotonin shows prolonged signals during cues and robust responses to unexpected rewards, potentially supporting persistence toward anticipated beneficial outcomes. The longer 5-HT timescale is consistent with DRN somatic activity literature, though differences in biosensor kinetics complicate direct quantitative comparisons. The relatively modest modulation by internal thirst state may reflect insufficient change in subjective value under the 24 h fluid-deprivation protocol or an action-value representation in DMS that maintains preference hierarchies even in fixed-choice scenarios. 
The data align with theories that serotonin signals anticipated beneficial outcomes, persistence, and aspects of uncertainty/flexibility, and suggest that serotonergic modulation within DMS may shape goal-directed action selection and valuation.

Conclusion

This study advances understanding of striatal serotonin by directly measuring DMS serotonin release with GRAB-5-HT during discrete reward events. Serotonin release anticipates and accompanies reward consumption, scales with external reward value, and reflects subjective value changes due to internal state and devaluation. In Pavlovian conditioning, DMS serotonin encodes reward expectation and approach, with stronger signals for CS+ and during reward delivery. These findings position serotonin as a signal of anticipated outcome value in the DMS that can modulate reward-related behavior and approach. Future research should probe how environmental uncertainty and task contingencies jointly shape dopamine and serotonin dynamics in DMS, dissect receptor- and cell-type-specific mechanisms mediating value signals, and assess serotonin’s role across DMS-dependent action-outcome learning and flexible decision-making.

Limitations

Direct comparison of dopamine and serotonin signal kinetics is constrained by differences in biosensor properties, making precise timescale comparisons tentative. Internal state effects on serotonin were relatively modest under 24 h fluid deprivation, potentially limiting detection of strong subjective value modulation. One mouse lacked histological confirmation of fiber placement and was excluded. As an early release manuscript, final formatting and links to extended data may change. The heterogeneity of serotonergic projections and volume transmission may mean DMS signals reflect integrated inputs that are not directly attributable to specific DRN subpopulations.
