Psychology
Discovering cognitive strategies with tiny recurrent neural networks
L. Ji-An, M. K. Benna, et al.
We introduce a novel modelling approach that uses recurrent neural networks to discover the cognitive algorithms underlying biological decision-making. Networks with as few as 1–4 units often outperform classical cognitive models, match much larger networks, and, being interpretable via dynamical-systems analysis, reveal both the underlying mechanisms and the dimensionality of behaviour. Research conducted by Li Ji-An, Marcus K. Benna, and Marcelo G. Mattar.
~3 min • Beginner • English
Introduction
The study addresses how animals and humans learn from rewards to make decisions, and how to model these processes accurately and interpretably. Classical normative frameworks, including Bayesian inference and reinforcement learning, capture key principles of adaptive behaviour but can be too simple or require extensive hand-crafted additions (for example, forgetting, biases, perseveration), which introduces subjectivity and can mischaracterize behaviour. Neural networks offer flexibility and strong predictive performance but are often difficult to interpret. The authors propose fitting very small recurrent neural networks (RNNs; 1–4 units) to individual subjects across several reward-learning tasks to combine predictive flexibility with mechanistic interpretability via dynamical systems analysis. The research aims to discover concise, interpretable cognitive algorithms that govern choice behaviour, compare them with classical models, estimate behavioural dimensionality, and extend insights to artificial agents trained with meta-reinforcement learning.
Literature Review
The paper situates its contribution within a progression from early symbolic cognitive models to connectionist approaches, and highlights the influence of Bayesian inference and reinforcement learning as frameworks for learning and decision-making supported by prefrontal and striatal circuits. Prior work trained neural networks to optimize task performance in order to explain neural activity (vision, navigation, memory, planning) or behaviour, achieving predictive accuracy but sacrificing interpretability. Established cognitive models often employ small sets of parameters and can be augmented with components such as perseveration and state-transition learning, yet may still provide biased or incomplete characterizations of behaviour. The paper also references debates on the two-stage task (model-free versus model-based control), meta-RL interpretations, and broader efforts in extracting computational mechanisms, dimensionality reduction in neural data, and knowledge distillation for model compression.
Methodology
Design and datasets: The authors analyse six widely studied reward-learning tasks across eight datasets: animal tasks (reversal learning; two-stage; transition-reversal two-stage) and human tasks (three-armed reversal learning; four-armed drifting bandit; original two-stage). Animal datasets include 2 monkeys (Bartolo), 4 rats (Miller), and 10 and 17 mice (Akam, across two tasks), totalling hundreds of thousands of trials. Human datasets include 1,010 participants (three-armed reversal learning), 975 participants (four-armed drifting bandit; 57 participants were excluded for missing more than 10% of trials), and 1,961 participants (original two-stage). Task structures involve binary or multi-armed choices, probabilistic transitions to second-stage states, and reward probabilities that switch or drift over time.
Models: The core behavioural models are tiny recurrent neural networks using gated recurrent units (GRUs). Each network unit is treated as a dynamical variable summarizing past inputs (previous action, state, reward) and evolves via GRU equations; outputs are action probabilities via a softmax readout. Two architectures are used: vanilla GRU and switching GRU (input-dependent recurrent weights/biases). Switching GRU consistently outperforms vanilla GRU for d=1 in animal datasets, while vanilla GRU performs similarly for d≥2. A switching linear neural network (SLIN) is also introduced for interpretability.
Inputs and outputs: Inputs comprise the previous action a_{t-1}, the second-stage state s_{t-1}, and the reward r_{t-1} (with adjustments for the reversal-learning task). Outputs are two or more action logits, produced by a fully connected or diagonal readout layer and converted to action probabilities via a softmax.
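To make the architecture concrete, here is a minimal PyTorch sketch of such a tiny network. The class name, default sizes, and input encoding are illustrative assumptions; only the GRU-plus-softmax-readout structure is taken from the paper.

```python
import torch
import torch.nn as nn

class TinyGRU(nn.Module):
    """Minimal sketch of a tiny recurrent behavioural model: a GRU with d
    hidden units (the dynamical variables) reads the previous action, state,
    and reward, and a linear readout converts the hidden state to action
    log-probabilities. Names and defaults are illustrative assumptions."""

    def __init__(self, n_inputs: int = 3, d: int = 2, n_actions: int = 2):
        super().__init__()
        self.gru = nn.GRU(input_size=n_inputs, hidden_size=d, batch_first=True)
        self.readout = nn.Linear(d, n_actions)  # fully connected readout

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, trials, n_inputs) encoding a_{t-1}, s_{t-1}, r_{t-1}
        h, _ = self.gru(x)                        # (batch, trials, d)
        logits = self.readout(h)                  # per-trial action logits
        return torch.log_softmax(logits, dim=-1)  # log action probabilities
```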
Training and validation: RNNs and classical cognitive models are trained via maximum likelihood (cross-entropy), with Adam (learning rate 0.005), L1 regularization on recurrent weights, and early stopping. Due to higher parameter counts in RNNs, nested cross-validation is employed: an outer loop for held-out test folds, and an inner loop to select hyperparameters on validation folds. Performance is reported as cross-validated trial-averaged negative log-likelihood (lower is better). The authors argue AIC/BIC are unsuitable for singular models like neural networks and prefer cross-validation for unbiased generalization estimates.
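Below is a compact sketch of the fitting loop under those choices (cross-entropy likelihood, Adam at learning rate 0.005, an L1 penalty on the recurrent weights, early stopping on validation loss), assuming the TinyGRU class above. The penalty strength, patience, and tensor shapes are illustrative, and the nested cross-validation loops that wrap this procedure are omitted.

```python
import copy
import torch

def fit(model, x_tr, a_tr, x_val, a_val, l1=1e-4, patience=10, max_epochs=10_000):
    """Maximum-likelihood fitting sketch: next-action cross-entropy, Adam
    (lr=0.005), L1 on recurrent weights, early stopping on validation NLL."""
    opt = torch.optim.Adam(model.parameters(), lr=0.005)
    best, wait = float("inf"), 0
    best_state = copy.deepcopy(model.state_dict())
    for _ in range(max_epochs):
        model.train()
        log_p = model(x_tr)                                 # (1, T, n_actions)
        nll = -log_p.gather(-1, a_tr.unsqueeze(-1)).mean()  # trial-averaged NLL
        reg = l1 * model.gru.weight_hh_l0.abs().sum()       # L1 on recurrent weights
        opt.zero_grad()
        (nll + reg).backward()
        opt.step()

        model.eval()
        with torch.no_grad():
            val_nll = -model(x_val).gather(-1, a_val.unsqueeze(-1)).mean().item()
        if val_nll < best:                                  # early-stopping bookkeeping
            best, best_state, wait = val_nll, copy.deepcopy(model.state_dict()), 0
        else:
            wait += 1
            if wait >= patience:
                break
    model.load_state_dict(best_state)
    return best  # validation trial-averaged negative log-likelihood
```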
Classical cognitive models: Over 30 cognitive models are implemented across tasks, spanning model-free RL, model-based RL, Bayesian inference, and hybrids, with variants accounting for forgetting, perseveration, biases, eligibility traces, and learned transition probabilities. For human tasks, additional model-free variants include value forgetting, action perseveration, and RNN-inspired mechanisms (for example, unchosen value updating, reward utilities, reference points).
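For contrast with the RNNs, here is a minimal sketch of one such classical model: a delta-rule (model-free RL) learner with value forgetting and a perseveration bonus. The parameter values are placeholders; in the paper they are fitted per subject by maximum likelihood.

```python
import numpy as np

def q_update(q, choice, reward, alpha=0.3, phi=0.1, q0=0.5):
    """One trial of a model-free RL model with value forgetting: all values
    decay toward the baseline q0 at rate phi, then the chosen value moves
    toward the reward by the delta rule. Parameter values are illustrative."""
    q = q + phi * (q0 - q)                     # forgetting toward baseline
    q[choice] += alpha * (reward - q[choice])  # delta-rule update
    return q

def choice_probs(q, prev_choice, beta=5.0, kappa=0.2):
    """Softmax policy with a perseveration bonus kappa for repeating the
    previous choice, one of the hand-crafted augmentations mentioned above."""
    logits = beta * q.copy()
    logits[prev_choice] += kappa
    z = np.exp(logits - logits.max())          # numerically stable softmax
    return z / z.sum()
```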
Behavioural dimensionality: Dimensionality (d*) is estimated as the minimal number of dynamical variables needed to optimize predictive performance, using statistical criteria over cross-validation folds.
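One plausible implementation of this criterion is sketched below: given per-fold cross-validated NLLs for each candidate d, pick the smallest d whose performance is not significantly worse than the best candidate. The use of a paired t-test here is our assumption; the paper only states that statistical criteria over cross-validation folds are used.

```python
import numpy as np
from scipy import stats

def estimate_dimensionality(fold_nll, alpha=0.05):
    """fold_nll: dict mapping candidate d -> array of per-fold NLLs (same
    folds for every d). Returns d*, the smallest d whose cross-validated
    NLL is statistically indistinguishable from the best candidate."""
    best_d = min(fold_nll, key=lambda d: np.mean(fold_nll[d]))
    for d in sorted(fold_nll):
        if d == best_d:
            return d
        _, p = stats.ttest_rel(fold_nll[d], fold_nll[best_d])
        if p > alpha:       # not significantly worse than the best model
            return d
    return best_d
```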
Interpretability: The authors develop a dynamical systems framework for interpretation. For d=1, phase portraits of policy logit L(t) and logit change ΔL(t) reveal fixed points, learning rates, and qualitative signatures (for example, RL vs Bayesian). Preference setpoints summarize input effects across tasks. For d>1, two-dimensional vector fields visualize trial-by-trial state updates and attractors; dynamical regression linearly approximates state-update dynamics and yields input-dependent state-transition matrices. Symbolic regression (PySR) is used to extract concise update rules learned by RNNs (for example, drift-to-the-other).
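As an illustration of the d=1 analysis, the sketch below sweeps the policy logit over a grid and computes the one-trial logit change ΔL for each input condition (for example, each action-reward combination). It assumes the single-unit TinyGRU above, whose logit is an affine function of the hidden state; this simplification and the grid range are our assumptions, not the authors' exact procedure.

```python
import numpy as np
import torch

@torch.no_grad()
def phase_portrait_1d(model, input_conditions, grid=np.linspace(-5, 5, 101)):
    """For a 1-unit model, map each candidate logit L to a hidden state,
    apply one GRU step per input condition, and return dL(L). Zero
    crossings of dL are the fixed points of the learning dynamics."""
    w = (model.readout.weight[1, 0] - model.readout.weight[0, 0]).item()
    b = (model.readout.bias[1] - model.readout.bias[0]).item()
    curves = {}
    for name, x in input_conditions.items():   # x: (n_inputs,) float tensor
        h0 = torch.tensor((grid - b) / w, dtype=torch.float32).view(1, -1, 1)
        x_rep = x.view(1, 1, -1).expand(h0.shape[1], 1, -1).contiguous()
        _, h1 = model.gru(x_rep, h0)           # one GRU step per grid point
        L1 = w * h1.squeeze().numpy() + b
        curves[name] = L1 - grid               # logit change dL as a function of L
    return curves
```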
Knowledge distillation: To reduce data requirements, a teacher-student framework is introduced. A 20-unit teacher RNN is trained across subjects with per-subject embeddings as input; tiny student RNNs are then trained to match the teacher's output probabilities for single subjects. An interspersed split protocol ensures identical train/validation/test distributions within single blocks of human trials, and cross-subject split protocols test zero-shot generalization.
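A sketch of the two stages follows, reusing the task inputs from before. The 20-unit teacher and the matching of output probabilities come from the paper; the embedding size, subject count, and the KL-to-soft-targets loss are our assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TeacherRNN(nn.Module):
    """Multi-subject teacher sketch: a 20-unit GRU that receives the task
    inputs concatenated with a learned per-subject embedding (embedding
    size and subject count are illustrative assumptions)."""
    def __init__(self, n_inputs=3, n_subjects=100, emb=4, hidden=20, n_actions=2):
        super().__init__()
        self.embed = nn.Embedding(n_subjects, emb)
        self.gru = nn.GRU(n_inputs + emb, hidden, batch_first=True)
        self.readout = nn.Linear(hidden, n_actions)

    def forward(self, x, subject):
        # subject: (batch,) long tensor of subject IDs, broadcast over trials
        e = self.embed(subject).unsqueeze(1).expand(-1, x.shape[1], -1)
        h, _ = self.gru(torch.cat([x, e], dim=-1))
        return torch.log_softmax(self.readout(h), dim=-1)

def distill_step(student, teacher_log_p, x, opt):
    """One distillation step: the tiny student is trained to match the
    teacher's per-trial action probabilities (soft targets) rather than
    the subject's hard choices. teacher_log_p is precomputed and detached."""
    loss = F.kl_div(student(x), teacher_log_p, log_target=True,
                    reduction="batchmean")
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```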
Meta-RL agents: The authors train LSTM-based meta-RL agents in a two-stage task using Advantage Actor-Critic to near-optimal performance and analyse their learned strategies via phase portraits and comparisons to cognitive models.
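For reference, here is a sketch of the Advantage Actor-Critic objective for one episode of collected rollouts; the discount factor and loss coefficients are illustrative rather than the paper's hyperparameters, and the LSTM forward pass that produces the per-trial quantities is omitted.

```python
import torch
import torch.nn.functional as F

def a2c_loss(log_probs, values, rewards, entropies,
             gamma=0.9, beta_v=0.5, beta_e=0.01):
    """log_probs, values, entropies: lists of scalar tensors from the LSTM
    agent, one per trial; rewards: list of floats. Combines the advantage-
    weighted policy gradient, value regression, and an entropy bonus."""
    returns, R = [], torch.tensor(0.0)
    for r in reversed(rewards):               # discounted return, computed backwards
        R = r + gamma * R
        returns.append(R)
    returns = torch.stack(returns[::-1])
    values = torch.stack(values)
    adv = returns - values.detach()           # advantage estimate
    policy_loss = -(torch.stack(log_probs) * adv).mean()
    value_loss = F.mse_loss(values, returns)
    entropy = torch.stack(entropies).mean()
    return policy_loss + beta_v * value_loss - beta_e * entropy
```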
Key Findings
Across all animal datasets and tasks, tiny RNNs (often 1–4 units) outperform classical cognitive models of equal dimensionality in predicting choices, and perform comparably to significantly larger RNNs. Two-unit RNNs are typically optimal for reversal learning and two-stage tasks; four-unit RNNs for the transition-reversal two-stage task. RNNs also outperform ideal Bayesian observer models, indicating suboptimal animal behaviour.
RNNs reproduce canonical behavioural metrics (for example, choice probability around reversals; stay probabilities conditioned on reward and transition type) while delivering lower trial-averaged negative log-likelihood. The best-performing RNN for a subject often has low dimensionality (d*=1–2 for reversal learning and two-stage; d*=1–4 for transition-reversal), suggesting low-dimensional behavioural dynamics in these tasks.
Data requirements: RNN predictive performance improves with data and can exceed cognitive models once training data surpasses about 500–3,000 trials per subject; with scarce data, simpler models can outperform due to data efficiency.
Knowledge distillation: Student RNNs trained to match a multi-subject teacher RNN’s probabilities outperform the best matched-dimensional cognitive models with far fewer trials. In a representative mouse transition-reversal dataset, student RNNs trained via distillation surpass model-free RL with just ~350 trials per subject, versus ~3,000 trials required by solo RNNs.
Interpretability and novel signatures: Phase portraits of tiny RNNs reveal signatures beyond classical models, including state-dependent learning rates, state-dependent perseveration, reward-dependent choice biases, and reward-induced indifference (rewards following rare transitions yield indifference rather than preference). In multiple rats and mice, reward-induced indifference strength correlates with better task performance (ρ=0.62, P=0.008). Two-dimensional vector fields and dynamical regression reveal “drift-to-the-other” forgetting (unchosen values drifting toward the chosen alternative’s value) and cross-action influences in multi-armed tasks; augmenting cognitive models with RNN-derived mechanisms improves fit.
Model recovery and synthetic evaluations show tiny RNNs match the predictive performance of ground-truth cognitive agents, indicating RNNs form a superset of classical models and that cognitive strategies are identifiable without overfitting.
Meta-RL analysis: Task-optimized meta-RL agents exhibit dynamics closer to a one-dimensional Bayesian inference strategy than model-based RL, with a history-dependent distortion (parallel logit-change curves), diverging from animal patterns in the same task.
Discussion
The findings demonstrate that very small RNNs can serve as mechanistically transparent models of decision-making across species and tasks, surpassing classical cognitive models and aligning with larger networks while requiring few dynamical variables. The dynamical systems framework for interpretation—phase portraits, preference setpoints, vector fields, and dynamical regression—enables unified, model-agnostic comparison and reveals previously overlooked behavioural mechanisms (for example, variable learning rates, reward-induced indifference, drift-to-the-other forgetting, state-dependent perseveration and biases). Knowledge distillation leverages group-level data to achieve high performance with typical human trial counts, extending applicability to cognitive neuroscience and computational psychiatry contexts.
The approach also provides a principled estimate of behavioural dimensionality via the minimal number of dynamical variables required to optimize predictions, highlighting low-dimensional dynamics in standard reward-learning tasks. The distinction between dynamical dimensionality and model expressivity underscores the complementary roles of variable count, nonlinearity, and parameterization in capturing behaviour. The framework can analyse and compare biological strategies with task-optimized artificial agents, revealing strategy-level similarities and differences (for example, Bayesian-like meta-RL dynamics with history effects) and offering a route to bridging computational and neurobiological levels.
Conclusion
Tiny RNNs (often 1–4 units) provide flexible, accurate, and interpretable models of adaptive decision-making across animal and human tasks, outperforming classical cognitive models of matched dimensionality and often matching larger neural networks. By applying dynamical systems analyses, the authors discover generalizable cognitive signatures and propose targeted augmentations to classical models that improve performance. Knowledge distillation enables effective modelling in low-data human settings. The framework offers a foundation for studying individual differences and computational psychiatry, and can be extended to more complex tasks, perceptual decision-making, memory, and neural prediction. Future work should explore alternative architectures for richer dynamics, improved training and regularization schemes, and enhanced interpretability techniques to scale to more complex, higher-dimensional behaviours.
Limitations
Tiny RNNs, while flexible, require substantially more data than simpler cognitive models; with scarce training data, RNNs can underperform due to inadequate parameter constraints, necessitating techniques like knowledge distillation. The GRU updating equation may limit the complexity of dynamics captured by very small networks, potentially requiring alternative architectures for complex behaviours. Phase portrait interpretation becomes challenging when tasks have many or continuous inputs, motivating summarized metrics (for example, preference setpoints) and alternative analyses. Dimensionality estimates can be biased by limited data or very slow latent variables (underestimation), or by embedding nonlinear dynamics (overestimation). Choices in training, regularization, and architecture reflect trade-offs between flexibility and generalization and may face scaling challenges in more complex tasks.