
A simple model for Behavioral Time Scale Synaptic Plasticity (BTSP) provides content addressable memory with binary synapses and one-shot learning
Y. Wu and W. Maass
Recent experiments reveal a one-shot learning rule, Behavioral Time Scale Synaptic Plasticity (BTSP), that instantaneously forms memory traces in hippocampal area CA1. We present a transparent model and theory showing how BTSP creates content-addressable memory, reproduces memory-trace repulsion, and suggests an energy-efficient implementation in memristor crossbar arrays.
Introduction
The study addresses how the brain can create content-addressable memory (CAM) via one-shot learning for episodic and conjunctive memories. Recent experiments in awake, behaving animals revealed BTSP in hippocampal area CA1 as a key mechanism. Unlike Hebbian and STDP rules, BTSP is gated by plateau potentials originating from entorhinal cortex input, acts on a seconds-long time scale, operates effectively in single trials, and depends on the prior synaptic weight value rather than on postsynaptic firing. The work asks whether core features of experimentally observed BTSP can be captured by a simple stochastic rule with binary synaptic weights, and whether such a rule can support high-capacity CAM with robust recall from partial cues, handle overlapping memories, and account for human-like repulsion of similar memories. It also explores implications for neuromorphic implementations with memristors.
Literature Review
Prior models of BTSP (e.g., Milstein et al., 2021) used continuous-time differential equations with continuous weights and no noise, focusing on place field induction. Experimental data show substantial trial-to-trial variability, motivating stochastic formulations and even binary-weight models. Classical CAMs such as Hopfield networks (HFNs) achieve high capacity with many distinct weight values; however, capacity drops substantially when quantized to binary weights, and online learning rules for binary HFN CAMs are lacking. Sparse coding in the brain (≈0.5% activity) is beneficial for energy efficiency and desirable for neuromorphic systems. Random projection strategies (including the fly algorithm) use fixed random binary weights and often require large expansion and global competition, which are biologically implausible for CA1 scale. Previous neuromorphic CAM implementations based on HFNs required off-chip training and multi-level synapses. The present work situates BTSP as a biologically grounded, stochastic, one-shot plasticity rule capable of building CAM with binary synapses, potentially overcoming HFN and RP limitations and reproducing human memory phenomena such as repulsion between similar items.
Methodology
- BTSP rule (binary, stochastic): Within a 10 s plasticity window around a stochastically generated plateau potential in a CA1 neuron, each active presynaptic input bit x_i=1 toggles the corresponding binary weight with probability 0.5: LTP if w_i=0 (w_i → 1), LTD if w_i=1 (w_i → 0). Outside the window, nothing changes. Plateau potentials occur independently per memory neuron with probability f_q per presented pattern. A variant that absorbs the 0.5 factor into f_q yields similar results. (A minimal code sketch of this rule appears after this list.)
- Network model: Two-layer feedforward network mapping input (CA3) to memory neurons (CA1), modeled as disconnected McCulloch-Pitts units with thresholding. No lateral CA1 recurrence. Default parameters motivated by biology: m=25,000 input neurons; n=39,000 memory neurons; connection probability f_w=0.6; input sparsity f_p=0.005. Each memory item is presented once (one-shot learning). Thresholds are set by grid search to optimize recall from partial cues.
- Learning phase: For each input pattern x (binary, sparse), a random subset of memory neurons draws plateau potentials with probability f_q, opening seconds-long plasticity windows for BTSP weight updates from active inputs.
- Recall phase: Present either the original x or a masked x' (a fraction of 1s set to 0; two-sided noise was also tested). Measure similarity between memory traces z(x) and z(x') by Hamming distance (HD) and by the relative dissimilarity HD(z(x),z(x'))/HD(z(x_a),z(x_b)), where x_a and x_b are two unrelated stored items.
- Theory: Develops an analytical framework using binomial distributions and a parity lemma to handle stochastic LTP/LTD toggling (see the parity identity sketched after this list). Derives probabilities that synapses are active after M items, neuron firing probabilities conditioned on plateau presence, expected HDs for original vs masked cues and for different items, and extensions to overlapping items by modeling common 1s and the parity of update counts. Predicts trace sizes, active weight fractions, and recall performance as functions of M and f_q.
- Comparisons:
• Random projections (RPs) with the same overall density of 1-weights as BTSP networks after learning; thresholds optimized for masked-cue recall.
• Hopfield networks (HFNs): classical formulation with continuous weights trained in one shot on sparse inputs and partial connectivity f_w=0.6; iterative retrieval (100 steps); also examined binarized variants.
- Feedback for input reconstruction: Adds random feedback connections (density 0.6) from memory to input neurons, trained by a one-shot Hebbian-like rule with binary saturation. During recall, the masked cue x' is presented and the inputs are then overwritten with the reconstruction r(x') produced by the feedback at the next step. Performance is evaluated by the scaled reconstruction error HD(x,r(x'))/HD(x,x'). (A sketch of this feedback pathway appears after this list.)
- Place-field benchmarks: Simulations reproduce instantaneous place field induction and speed-dependent spatial extent consistent with seconds-long BTSP window.
- Overlapping memories and repulsion: Design pairs with controlled overlap (e.g., 40% common 1s) among otherwise unrelated items to quantify repulsion in memory traces; the repulsion index is defined as overlap(unrelated)/overlap(similar), so values above 1 indicate repulsion (see the helper after this list). Explore how repulsion depends on f_q and the LTD probability.
- Data/code: Public repositories provided for reproducibility.
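To make the learning and recall phases concrete, here is a minimal NumPy sketch of the rule as described above. It is not the authors' released code: the network sizes are downscaled from the paper's defaults so the example runs quickly, the threshold theta is an illustrative value rather than the grid-searched optimum, and the function names (make_pattern, btsp_learn, recall) are introduced here purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Downscaled toy sizes (the paper uses m = 25,000 and n = 39,000)
m, n = 5000, 8000             # input (CA3) and memory (CA1) neurons
f_w, f_p, f_q = 0.6, 0.005, 0.005

conn = rng.random((m, n)) < f_w           # fixed random connectivity mask
W = np.zeros((m, n), dtype=np.uint8)      # binary synaptic weights, all 0 initially

def make_pattern():
    """Sparse binary input pattern with expected density f_p."""
    return (rng.random(m) < f_p).astype(np.uint8)

def btsp_learn(x):
    """One-shot BTSP update for a single presented pattern x."""
    plateau_neurons = np.flatnonzero(rng.random(n) < f_q)   # CA1 cells with a plateau
    active_inputs = np.flatnonzero(x)                        # presynaptic 1-bits
    for j in plateau_neurons:
        # each active, connected synapse is toggled with probability 0.5
        flip = conn[active_inputs, j] & (rng.random(active_inputs.size) < 0.5)
        W[active_inputs[flip], j] ^= 1                       # 0->1 is LTP, 1->0 is LTD

def recall(x, theta):
    """Memory trace z(x): memory neurons whose summed binary input reaches theta."""
    return (x.astype(np.int32) @ W >= theta).astype(np.uint8)

def hamming(a, b):
    return int(np.sum(a != b))

# Store 100 patterns one-shot, then recall pattern 0 from a masked cue.
patterns = [make_pattern() for _ in range(100)]
for x in patterns:
    btsp_learn(x)

x = patterns[0]
ones = np.flatnonzero(x)
x_masked = x.copy()
x_masked[rng.choice(ones, size=len(ones) // 3, replace=False)] = 0   # mask ~33% of 1s

theta = 4                                 # illustrative; the paper grid-searches this
z, z_masked = recall(x, theta), recall(x_masked, theta)
rel_dissim = hamming(z, z_masked) / max(1, hamming(recall(patterns[1], theta),
                                                   recall(patterns[2], theta)))
print(f"trace size {z.sum()}, HD(z(x), z(x')) = {hamming(z, z_masked)}, "
      f"relative dissimilarity = {rel_dissim:.2f}")
```

The relative dissimilarity compares how far the masked-cue trace drifts from the original trace against how far the traces of two unrelated items lie apart, matching the measure defined in the recall item above.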
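The parity lemma mentioned in the theory item admits a compact closed form under the stated independence assumptions. The identity below is a standard result, written here only to illustrate the flavor of the analysis; the paper's actual derivation conditions on plateau events and covers further cases (masked cues, overlapping items). Writing p for the per-item probability that a given connected synapse is toggled (active presynaptic bit, plateau in the postsynaptic neuron, and the 0.5 coin), the synapse holds weight 1 after M one-shot presentations exactly when it has been toggled an odd number of times:

```latex
P\big(w_{ij} = 1 \text{ after } M \text{ items}\big)
  \;=\; \sum_{k\ \mathrm{odd}} \binom{M}{k}\, p^{k} (1-p)^{M-k}
  \;=\; \frac{1 - (1 - 2p)^{M}}{2},
  \qquad p = \tfrac{1}{2}\, f_p\, f_q .
```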
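The feedback pathway for input reconstruction can be sketched in the same style, continuing the code above (it reuses rng, m, n, x, x_masked, patterns, theta, recall, and hamming). The one-shot Hebbian-like rule with binary saturation is interpreted here as setting a feedback weight to 1 whenever a trace bit and an input bit are both 1 during learning; the feedback threshold theta_fb and the exact update are assumptions for illustration, not the authors' implementation.

```python
F_conn = rng.random((n, m)) < 0.6          # random feedback connectivity (memory -> input)
F = np.zeros((n, m), dtype=np.uint8)       # binary feedback weights

def feedback_learn(x, z):
    """One-shot Hebbian-like update: saturate at 1 where trace bit and input bit co-occur."""
    idx = np.ix_(np.flatnonzero(z), np.flatnonzero(x))
    F[idx] |= F_conn[idx]                  # only existing feedback synapses can be set

def reconstruct(z, theta_fb):
    """Reconstructed input r: input neurons whose summed feedback reaches theta_fb."""
    return (z.astype(np.int32) @ F >= theta_fb).astype(np.uint8)

# Train feedback one-shot on the stored traces (recomputed here via recall),
# then reconstruct the input from the trace of the masked cue.
for p_ in patterns:
    feedback_learn(p_, recall(p_, theta))

r = reconstruct(recall(x_masked, theta), theta_fb=2)     # theta_fb is illustrative
scaled_error = hamming(x, r) / max(1, hamming(x, x_masked))
print(f"scaled reconstruction error HD(x, r(x')) / HD(x, x') = {scaled_error:.2f}")
```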
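Finally, the repulsion index used in the overlapping-memories analysis can be computed directly from traces. The helper below, continuing the same sketch, is a straightforward reading of the definition overlap(unrelated)/overlap(similar); taking overlap to be the number of shared 1s between two traces is an assumption about the exact overlap measure.

```python
def overlap(z1, z2):
    """Number of memory neurons active in both traces."""
    return int(np.sum((z1 == 1) & (z2 == 1)))

def repulsion_index(z_sim_a, z_sim_b, z_unrel_a, z_unrel_b):
    """overlap(unrelated) / overlap(similar); values above 1 indicate repulsion."""
    return overlap(z_unrel_a, z_unrel_b) / max(1, overlap(z_sim_a, z_sim_b))
```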
Key Findings
- The simple binary BTSP rule yields a high-capacity CAM via one-shot learning, requiring only binary synapses and stochastic plateau gating.
- Robust recall from partial cues: With biologically suggested f_q≈0.005, memory traces can be recalled reliably with up to ≈33% of 1s masked in the cue. Larger f_q (e.g., 0.01) increases trace overlap and degrades masked-cue recall; smaller f_q can further improve recall but weakens repulsion.
- Theory–simulation agreement: Analytical predictions for trace size, fraction of active weights, and relative dissimilarity closely match simulations across M up to 30,000.
- Advantage over Random Projections: BTSP induces a bimodal distribution of input sums per neuron (plateau vs no plateau groups), enabling thresholds that are robust to masking. RPs lack this separation and fail under even modest masking.
- Comparable to or better than continuous-weight HFNs in sparse, partially connected regimes: Despite using binary weights, BTSP CAMs perform close to, and sometimes better than, HFNs with continuous weights on masked-cue retrieval and downstream linear classification robustness, especially as M grows.
- Overlapping items: BTSP stores and separates items with up to ≈30% common 1s while maintaining good recall from 33% masked cues.
- Input reconstruction (CAM property): Adding simple Hebbian-trained random feedback enables immediate input completion; scaled reconstruction error HD(x,r(x'))/HD(x,x') is substantially reduced and comparable to HFNs with continuous weights.
- Repulsion effect: BTSP reproduces human-like repulsion where traces for highly similar items (e.g., 40% overlap) become less overlapping than traces for unrelated items; strength depends critically on LTD and on f_q (values below ≈0.005 do not produce repulsion; increasing LTD probability strengthens repulsion).
- Time scale and gating probability linkage: The seconds-long plasticity window combined with a plateau rate of ≈0.0005/s yields an effective probability of ≈0.005 that an input falls into a plasticity window, near the optimal regime for trading off trace robustness against overlap (worked out after this list).
- Scalability: Theory predicts excellent recall (even with up to ≈2/3 masking) for up to ≈800,000 items in scaled-up settings (human-CA3-sized inputs), though such large simulations were not executed.
- Neuromorphic relevance: BTSP CAMs with binary memristors (two resistance states) can match or exceed continuous-weight HFN CAMs while enabling on-chip, one-shot learning and instant recall.
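The effective gating probability cited above follows from simple arithmetic. Assuming a plasticity window of roughly 10 s (as in the rule definition) and the stated per-neuron plateau rate, the probability that a presented item falls inside an open window of a given memory neuron is approximately

```latex
f_q \;\approx\; 0.0005\ \mathrm{s^{-1}} \times 10\ \mathrm{s} \;=\; 0.005 .
```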
Discussion
The work shows that BTSP, a biologically validated plasticity mechanism in CA1, provides an effective solution to content-addressable memory formation via one-shot learning. By exploiting stochastic gating via plateau potentials rather than postsynaptic firing, BTSP spreads memory allocation across neurons and avoids overuse of specific units, thus preserving earlier traces during continual learning. The resulting memory traces exhibit attractor-like robustness: masked cues map close to the original traces, enabling reliable recall and downstream classification. The bi-modality of synaptic input sums (plateau vs no plateau) provides a mechanistic basis for threshold separability that RPs lack. Compared with HFNs, BTSP achieves comparable CAM quality without requiring high-precision synaptic weights, iterative retrieval, or offline training, and remains effective for sparse, partially connected networks with overlapping items. The LTD component, often treated as a constraint, here delivers a functional benefit: a repulsion effect that separates traces of similar items, aligning with human hippocampal observations and supporting differential downstream processing. The seconds-long BTSP window, combined with the observed plateau rate, naturally sets f_q near a sweet spot (~0.005) balancing trace size, overlap, and recall robustness. These findings link synaptic plasticity mechanisms in CA1 to systems-level memory functions and suggest practical implementations using binary synapses in neuromorphic hardware.
Conclusion
This study introduces a simple, analytically tractable BTSP-based model that, with binary synapses and stochastic gating, creates a high-capacity content-addressable memory via one-shot learning. It achieves robust recall from partial cues, supports overlapping memories, matches or surpasses classical HFNs with continuous weights in sparse regimes, reproduces the human-like repulsion of similar memories, and enables immediate input reconstruction when combined with simple feedback learning. The seconds-long plasticity window and the resulting effective gating probability near 0.005 are critical for optimal performance. The results offer a biologically grounded, hardware-friendly alternative to HFN-based CAM, enabling on-chip learning with two-state memristors and instant recall. Future directions include exploring biologically detailed variants (e.g., multiple release sites vs binary weights), refining feedback pathways and learning rules for reconstruction, scaling hardware prototypes, and investigating broader regimes of f_q/LTD settings and network connectivity for optimized capacity and repulsion.
Limitations
- Simplified binary synapse model: Real synapses may have multiple discrete or continuous efficacy levels; the number of release sites per CA1 synapse is unknown.
- Independence assumptions: Arrival times of inputs and plateau potentials are assumed statistically independent; theoretical analysis assumes independent per-synapse updates (probability 0.5) to simplify parity handling.
- Threshold selection and architecture simplifications: Neurons are McCulloch-Pitts units without lateral CA1 connectivity; firing thresholds are optimized via grid search; biological thresholds for CA1 pyramidal cells are not well constrained.
- Scaling constraints: Simulations use downscaled network sizes; predictions for very large networks (e.g., up to ~800,000 items and 2/3 masking) are theoretical; full-scale simulations were not performed.
- Feedback modeling: Input reconstruction uses random feedback connectivity with a simple one-shot Hebbian-like rule; biological details of backprojections and their plasticity remain unclear.
- Parameter sensitivity: Performance, including repulsion strength, depends on plateau probability f_q and LTD probability; overly large f_q increases overlap and degrades recall, while too small f_q weakens repulsion.
- External comparisons: While HFNs with continuous weights are considered, binary HFN alternatives perform poorly; broader classes of modern associative memories requiring high-precision or higher-order interactions are not directly comparable within the same binary, online-learning constraints.