Introduction
Adolescence is characterized by significant behavioral changes, including increased risk-taking, impulsivity, and reward-seeking behavior. These behaviors are hypothesized to stem from neurobiological alterations leading to heightened reward learning. This period is also a high-risk time for the onset of mental disorders, particularly disruptive behavior disorders, which are strongly linked to impulsive behavior and reinforcement learning difficulties. Internalizing problems are also associated with reinforcement learning challenges. However, the interplay between reward and punishment learning, and response biases such as action initiation, during adolescence remains unclear. Existing research has limitations in terms of sample size and the ability to disentangle learning processes from response biases. This study addresses these gaps by employing computational modeling to separate learning processes (modifying behavior based on past experiences) from action initiation biases (impulsive actions regardless of consequences) in a large, diverse sample.
Literature Review
Previous research on adolescent learning has yielded inconsistent findings regarding reward and punishment learning. Some studies suggest an adolescent peak in reward learning, while others report better reward learning in adolescents compared to adults. Reversal learning tasks have shown varying results, with some indicating increased punishment learning in adolescents compared to adults and others finding a trough in punishment learning during mid-adolescence followed by an increase in early adulthood. These inconsistencies might stem from factors such as differing learning contexts, task demands, and small, non-diverse samples. One study examining both learning and action inhibition in children, adolescents, and adults found that adolescents showed attenuated 'go' and Pavlovian biases, but learning rate was not associated with age. Overall, the existing literature lacks a clear picture of how learning mechanisms and action initiation biases interact during adolescence, particularly in a large, diverse sample.
Methodology
This study used a large sample (N = 742) of 9- to 18-year-olds from across Europe. Participants completed a reward and punishment learning task in which they learned, through trial and error, to associate abstract 3D objects with either reward (earning points) or punishment (losing points). For each object, participants had to learn whether to respond ('go') or withhold a response ('no-go'). A hierarchical expectation-maximization approach was used to fit a series of reinforcement learning models to the data. The models differed in whether they included separate reward and punishment learning rates, an action initiation bias (constant or initial), and sensitivity to the magnitude of points gained or lost. Bayesian model comparison was used to determine the best-fitting model. After model fitting, generalized linear mixed models (GLMMs) tested the association between age, pubertal stage (measured with the Pubertal Development Scale), and the model parameters. Spearman's rank correlations examined the relationship between model parameters and task performance. Parameter recovery and model identifiability procedures were performed to validate the results. To assess robustness, additional analyses were run by age group (9–12, 13–15, and 16–18 years) and with the first stimulus presentation excluded.
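The kind of model described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the summary does not give the model equations, so the delta-rule update, the logistic choice rule, the inverse temperature `beta`, the two-stimulus task, and all function and parameter names are assumptions made for illustration. The sketch only shows how separate reward and punishment learning rates can sit alongside a constant action initiation bias.

```python
import math
import random


def p_go(q_go, bias, beta):
    """Probability of responding ('go').

    Assumed logistic choice rule: the learned value of responding is
    scaled by beta, and a constant bias term is added on every trial
    regardless of what has been learned.
    """
    return 1.0 / (1.0 + math.exp(-(beta * q_go + bias)))


def update(q, outcome, alpha_reward, alpha_punish):
    """Delta-rule update with outcome-specific learning rates (assumed form)."""
    alpha = alpha_reward if outcome > 0 else alpha_punish
    return q + alpha * (outcome - q)


def simulate(n_trials, alpha_reward, alpha_punish, bias, beta=5.0, seed=0):
    """Simulate one hypothetical participant and return the fraction of correct choices.

    Assumed task: responding to the 'reward' object earns +1, responding
    to the 'punish' object loses 1, and withholding yields no feedback.
    """
    rng = random.Random(seed)
    outcomes = {"reward": 1.0, "punish": -1.0}
    q_go = {"reward": 0.0, "punish": 0.0}
    n_correct = 0
    for _ in range(n_trials):
        stim = rng.choice(["reward", "punish"])
        went = rng.random() < p_go(q_go[stim], bias, beta)
        if went:
            # Values change only when a response produces an outcome.
            q_go[stim] = update(q_go[stim], outcomes[stim],
                                alpha_reward, alpha_punish)
        # Correct means 'go' to the reward object or 'no-go' to the punish object.
        n_correct += int(went == (stim == "reward"))
    return n_correct / n_trials


if __name__ == "__main__":
    for b in (0.0, 2.0):
        print(b, simulate(200, alpha_reward=0.3, alpha_punish=0.3, bias=b))
```

Running `simulate` with different `bias` values while holding both learning rates fixed gives a rough feel for how a tendency to respond can change behavior without any change in learning.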
Key Findings
The best-fitting computational model included separate reward and punishment learning rates and a constant action initiation bias. Punishment learning rates increased linearly with age (GLMM; B = 0.10 [0.05, 0.15], z = 4.12, p < 0.001, 2-tailed), while action initiation biases decreased linearly with age (GLMM; β = −0.20 [−0.28, −0.12], z = −4.91, p < 0.001, 2-tailed). Reward learning rates remained stable across adolescence (GLMM; B = 0.01 [−0.05, 0.07], z = 0.30, p = 0.77, 2-tailed). These findings held when chronological age was replaced with pubertal stage. Overall task performance was positively correlated with reward and punishment learning rates and negatively correlated with action initiation bias. The winning model accurately simulated the observed behavior.
Discussion
This study challenges the prevalent notion of heightened reward sensitivity in adolescence, demonstrating that reward learning rates remain stable while action initiation biases decline and punishment learning increases across adolescence. These findings highlight the importance of differentiating between learning processes and response biases. Apparent reward-oriented behavior might reflect action initiation biases rather than enhanced reward learning. These developmental changes in action initiation and punishment learning might be relevant to understanding the development of adolescent-onset psychopathologies such as conduct disorder, where disruptions in punishment learning and impulsivity are observed. A better understanding of these distinct processes could inform interventions aimed at reducing risky behavior in adolescents.
Conclusion
This large-scale study reveals developmental trajectories in punishment learning and action initiation, but not in reward learning. Across adolescence, punishment learning increases and action initiation bias decreases, indicating a shift towards greater behavioral inhibition and sensitivity to negative consequences. These findings call for a reconsideration of the role of reward sensitivity in adolescent behavior and highlight the importance of considering action biases in future research on adolescent brain development and psychopathology.
Limitations
This study's limitations include the lack of 'no-go to gain reward' and 'go to avoid punishment' conditions, which prevented the assessment of Pavlovian action biases. The deterministic nature of the outcomes differs from many previous studies and may have influenced the relationship between learning rates and performance. While the study separated action initiation biases from learning, other potential parameters, such as variable learning rates or choice stickiness, were not included. The cross-sectional design limits causal inferences. The findings did, however, remain robust when data from the first stimulus presentation were excluded.
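The point about deterministic outcomes can be made concrete with a short worked equation. It assumes a standard delta-rule update with a fixed learning rate; the summary does not state the model equations, so the symbols below ($V_t$ for the learned value, $\alpha$ for the learning rate, $r$ for the outcome) are illustrative rather than the authors' notation.

```latex
V_{t+1} = V_t + \alpha\,(r - V_t), \qquad V_0 = 0
\quad\Longrightarrow\quad
V_n = r\,\bigl[\,1 - (1-\alpha)^{n}\,\bigr], \qquad 0 < \alpha \le 1 .
```

When the same outcome $r$ follows every response, $V_n$ moves toward $r$ faster for larger $\alpha$, so under this assumption a higher learning rate can only help. With noisy feedback a high learning rate also tracks the noise, which is one plausible reason the relationship between learning rates and performance could differ in studies that use probabilistic outcomes.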