Computer Science
Spike-based dynamic computing with asynchronous sensing-computing neuromorphic chip
M. Yao, O. Richter, et al.
Experience Speck — an asynchronous sensing–computing SoC with an always-on design (processor resting power 0.42 mW, total power down to 0.70 mW) that leverages an attention-based dynamic framework to correct the "dynamic imbalance" of SNNs and enable input-dependent, ultra-low-latency, energy-adaptive operation.
Introduction
Resource and energy constraints limit the deployment of conventional AI on edge platforms. Neuromorphic computing offers low-power advantages by abstracting brain computation at the neuron and synapse levels, as demonstrated in platforms such as BrainScaleS, SpiNNaker, Neurogrid, TrueNorth, Darwin, Loihi, and Tianjic. Beyond these low-level abstractions, the brain dynamically allocates resources via attention, implying two properties for dynamic computing: minimal energy consumption in the absence of input, and energy consumption that varies strongly with the input. The paper investigates how to incorporate such high-level dynamic mechanisms into neuromorphic systems. It identifies a dynamic imbalance in vanilla SNNs — near-constant network spiking firing rates over time and similarly sized activated subnetworks across inputs — which undermines the expected energy variance. The work's goal is to realize dynamic computing via co-design: an asynchronous Speck SoC with event-driven sensing and computing, and an attention-based dynamic SNN framework that modulates spiking based on input importance.
Literature Review
The paper contrasts neuromorphic and traditional AI from a dynamic computing perspective. SNNs naturally use dynamic computational graphs with sparse spike-based activation, whereas ANNs on GPUs typically execute static graphs regardless of zero inputs. Dynamic ANNs have been proposed to adapt computation to inputs, but practical efficiency lags due to hardware mismatches on general-purpose chips. Existing neuromorphic platforms demonstrate energy-efficient spiking computation, yet high-level brain mechanisms (e.g., attention) remain underexplored. The authors discuss resting power vs. running power, emphasizing that high resting power can negate algorithmic energy savings. They also note SNNs’ spatio-temporal invariance (shared weights across space and time) as a cause of dynamic imbalance, referencing prior attention-SNN works that improve performance and sparsity but lack a unified framework for deployment on neuromorphic chips.
Methodology
Hardware (Speck SoC): Speck integrates a 128×128 Dynamic Vision Sensor (DVS) with a fully asynchronous, spike-based neuromorphic processor on a single die (6.1 mm × 4.9 mm). It comprises a pixel-parallel event sensor, event pre-processing core, a Network-on-Chip (NoC) router, nine SNN cores (328k neurons; up to 11,000 neurons/mm²), and a readout core. Asynchronous digital logic uses 4-phase handshakes and QDI dual-rail encoding, eliminating global clocks and minimizing idle power. The event-driven pipeline processes spikes with per-layer latency in the 120 ns–7 μs range and end-to-end minimum delay of 3.36 μs per spike. Event-driven convolution maps each incoming event (address and polarity) to affected output neurons via address mapping, synaptic kernel memory read (8-bit weights), compute-in-memory IF neurons (16-bit states), bias/leak updates, pooling, and routing. The DVS features per-pixel handshake buffers, arbitration trees, and AER outputs; typical active data rates are 50–100 Mb/s on-chip.
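The event-driven convolution pipeline described above can be sketched in a few lines: a single incoming event touches only the output neurons whose kernel windows cover its pixel, which are accumulated and thresholded in place. This is an illustrative NumPy model, not Speck's actual address mapping or memory layout; `process_event` and its argument layout are assumptions.

```python
import numpy as np

def process_event(x, y, polarity, kernels, membrane, threshold=1.0):
    """Sketch of event-driven convolution: one DVS event at (x, y) with a
    given polarity updates only the output neurons whose receptive fields
    contain that pixel. `kernels` has shape (out_channels, polarities, kH, kW);
    `membrane` holds IF neuron states of shape (out_channels, H, W).
    Illustrative only, not Speck's actual memory map."""
    out_c, _, kH, kW = kernels.shape
    _, H, W = membrane.shape
    spikes = []
    for oc in range(out_c):
        # Address mapping: every output position whose kernel window covers (x, y).
        for dy in range(kH):
            for dx in range(kW):
                oy, ox = y - dy + kH // 2, x - dx + kW // 2
                if 0 <= oy < H and 0 <= ox < W:
                    # Synaptic kernel memory read + in-memory accumulation.
                    membrane[oc, oy, ox] += kernels[oc, polarity, dy, dx]
                    # IF threshold check: fire, then subtract-reset.
                    if membrane[oc, oy, ox] >= threshold:
                        spikes.append((oc, oy, ox))
                        membrane[oc, oy, ox] -= threshold
    return spikes
```

Because the loop runs only when an event arrives, no work (and, on asynchronous hardware, essentially no power) is spent when the input is silent.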
Software toolchain: The Sinabs framework (PyTorch-based) supports SNN training and model mapping to Speck (quantization and configuration), with plugins like Sinabs-Speck and EXODUS; Tonic for dataset management; Samna for device APIs, runtime, and event stream filtering with JIT support.
Spiking neuron models: LIF and IF neurons are supported on Speck. An M-IF (Multi-spike IF) model is introduced in software to mitigate the accuracy loss caused by training synchronously but deploying asynchronously, by allowing multiple spikes per timestep when the membrane potential exceeds the threshold. Training uses frame-based representations aggregated from event streams with a small time bin dt and a larger number of timesteps T.
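A minimal sketch of a multi-spike IF neuron along these lines, assuming a floor-based spike count and a soft (subtractive) reset; the paper's exact M-IF formulation may differ:

```python
import torch

class MultiSpikeIF(torch.nn.Module):
    """Illustrative Multi-spike IF (M-IF) neuron: unlike a standard IF
    neuron that emits at most one spike per timestep, it emits
    floor(U / threshold) spikes when the membrane potential U exceeds the
    threshold, better matching asynchronous hardware where several spikes
    can be processed within one synchronous training timestep.
    (Sketch; not the paper's exact formulation.)"""

    def __init__(self, threshold=1.0):
        super().__init__()
        self.threshold = threshold
        self.u = None  # membrane potential state

    def forward(self, x):  # x: input current for one timestep
        if self.u is None:
            self.u = torch.zeros_like(x)
        self.u = self.u + x
        # Multiple spikes in one timestep when U far exceeds the threshold.
        spikes = torch.clamp(torch.floor(self.u / self.threshold), min=0.0)
        self.u = self.u - spikes * self.threshold  # soft reset
        return spikes
```

With a large input current (e.g., U = 2.5 at threshold 1.0) the neuron emits 2 spikes in a single step rather than saturating at one, which is the mismatch the M-IF model addresses.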
Attention-based dynamic framework: A general plug-and-play framework optimizes membrane potential U via attention-based refinement and masking: U ← U − W_{τ(θ,U)} ⊙ U, where W_{τ(θ,U)} are input-dependent factors from policy τ. Temporal-wise and channel-wise attention capture global information via pooling, model long-range dependencies with shared two-layer FC networks (sigmoid outputs), and either refine (scale) features or mask (winner-take-all top-K). Implementations broadcast attention vectors across dimensions. This optimization is mathematically equivalent to input-dependent weight modulation, mitigating spatio-temporal invariance.
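The channel-wise branch of this framework can be sketched as follows, assuming global average pooling for the statistics, a shared two-layer FC with sigmoid output for the factors W, and the update U ← U − W ⊙ U for both the refine and the top-K mask variants. The `ChannelAttention` module and its layer sizes are illustrative, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Channel-wise attention for a dynamic SNN (sketch). Global average
    pooling captures per-channel statistics; a shared two-layer FC with
    sigmoid produces input-dependent factors W in [0, 1]; the membrane
    potential is then optimized as U <- U - W * U (refine), or hard-masked
    by keeping only the top-K channels (winner-take-all mask)."""

    def __init__(self, channels, reduction=4, top_k=None):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )
        self.top_k = top_k  # None => refine; int => winner-take-all mask

    def forward(self, u):  # u: (batch, channels, H, W) membrane potential
        stats = u.mean(dim=(2, 3))          # global average pooling
        w = self.fc(stats)                  # input-dependent factors
        if self.top_k is not None:
            # Mask: keep the top-K channels intact, suppress the rest fully.
            idx = w.topk(self.top_k, dim=1).indices
            mask = torch.zeros_like(w).scatter_(1, idx, 1.0)
            w = 1.0 - mask                  # W = 1 for suppressed channels
        w = w.view(*w.shape, 1, 1)          # broadcast across H and W
        return u - w * u                    # U <- U - W (.) U
```

Refining leaves every channel active but attenuated in proportion to its estimated importance; masking zeroes all but K channels, trading some accuracy for higher sparsity, consistent with the ablation behavior reported below.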
Evaluation setup: Four event-based benchmarks—DVS128 Gesture, DVS128 Gait-day, DVS128 Gait-night, and HAR-DVS—are used. Baseline networks include lightweight 3-layer and 5-layer Conv-based LIF-SNNs and Res-SNN-18 for HAR-DVS. Ablations vary attention dimensions (temporal/channel) and masking ratios (α, β) and compare vanilla vs. dynamic SNNs. Power and latency on Speck are measured via real-time on-chip monitors (RAM and Logic rails). GPU baselines use RTX 3090 with batch size 1 and matched time windows (540 ms). On Speck deployment, temporal-wise attention masking is applied at the input layer (mask ratio 0.5) to enable online operation.
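The frame-based training representation (binning an event stream into T frames of width dt) can be sketched as follows; `events_to_frames` is an illustrative stand-in for the Tonic-based preprocessing, not its actual API:

```python
import numpy as np

def events_to_frames(events, dt, T, H=128, W=128, polarities=2):
    """Bin a DVS event stream into T frames of temporal width dt (same
    units as the event timestamps). `events` is an array of (t, x, y, p)
    rows. Illustrative stand-in for the Tonic-style preprocessing the
    paper describes; events beyond the T-th bin are dropped."""
    frames = np.zeros((T, polarities, H, W), dtype=np.float32)
    t0 = events[:, 0].min() if len(events) else 0
    for t, x, y, p in events:
        bin_idx = int((t - t0) // dt)
        if bin_idx < T:
            frames[bin_idx, int(p), int(y), int(x)] += 1
    return frames
```

Choosing a small dt with a larger T preserves temporal detail at the cost of more timesteps, the trade-off noted above for bridging synchronous training and asynchronous deployment.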
Key Findings
Hardware: Speck’s processor resting power is 0.42 mW; end-to-end minimum spike latency is 3.36 μs; system demonstrates real-time total power as low as 0.70 mW in dynamic operation.
Dynamic imbalance: Vanilla SNNs exhibit a near-constant network average spiking firing rate (NASFR) across timesteps, implying similarly sized activated subnetworks regardless of input. The attention-based framework significantly reduces NASFR and restores input-dependent variation in spiking responses.
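The metric behind this observation is simple: the fraction of neurons that spike at each timestep, where a near-flat curve across timesteps signals dynamic imbalance. A minimal sketch, using an illustrative definition (the paper's exact NASFR normalization may differ):

```python
import torch

def firing_rate_per_timestep(spikes):
    """Per-timestep network firing rate: the fraction of neurons spiking
    at each timestep. `spikes`: binary tensor of shape (T, N) for T
    timesteps and N neurons. A near-constant curve across timesteps is
    the 'dynamic imbalance' identified in vanilla SNNs; an attention-
    modulated SNN should show lower, input-dependent rates."""
    return spikes.float().mean(dim=1)  # shape (T,)
```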
Accuracy and sparsity gains (Ablations): On Gesture/Gait-day/Gait-night, attention-based refine (AR) improves accuracy and reduces spiking; combined temporal+channel AR (TC-AR) yields best accuracy. Example reported approximations: Gesture accuracy rises from ~88% (vanilla) to ~94% (TC-AR), NASFR drops from ~0.24 to ~0.06; Gait-day accuracy ~79%→~84%, NASFR ~0.08→~0.03; Gait-night accuracy ~93%→~97%, NASFR ~0.20→~0.12. Masking increases sparsity but can reduce accuracy depending on α, β.
On-chip deployment (Table 1): Compared to GPU (resting power ~30 W dominating total power), Speck achieves mW-level total power and <0.1 ms per-sample latency. On Speck, dynamic SNNs reduce total power and spike counts substantially while improving accuracy: Gesture accuracy 81.0%→90.0% (+9.0%), total power 9.5 mW→3.8 mW (−60.0%), spikes 1.0×10³→0.4×10³ (−60.0%); Gait-day accuracy 86.0%→90.0% (+4.0%), power 16.1 mW→7.3 mW (−54.7%), spikes 2.9×10³→1.2×10³ (−58.6%); Gait-night accuracy 86.0%→91.0% (+5.0%), power 46.8 mW→12.3 mW (−73.7%), spikes 3.3×10³→1.5×10³ (−54.5%). Lowest observed sample total power is 0.70 mW. On GPU, despite accuracy gains (e.g., Gesture 82.3%→92.0%), total power remains ~30,079 mW due to high resting power.
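The percentage reductions reported above follow directly from the raw power figures; a quick arithmetic check:

```python
def reduction(before, after):
    """Percentage reduction from `before` to `after`."""
    return 100.0 * (before - after) / before

# Speck total-power figures from Table 1 (mW): vanilla -> dynamic SNN.
print(round(reduction(9.5, 3.8), 1))    # Gesture:    60.0
print(round(reduction(16.1, 7.3), 1))   # Gait-day:   54.7
print(round(reduction(46.8, 12.3), 1))  # Gait-night: 73.7
```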
Real-time power: Speck’s power traces show near-zero resting power and clear fluctuation with event activity; dynamic input masking produces lower peak and total power than vanilla operation.
Large-scale dataset: On HAR-DVS with Res-SNN-18, dynamic SNN achieves 46.7% accuracy, approaching ANN results, demonstrating scalability of the framework.
Discussion
The study demonstrates that realizing dynamic computing’s energy-efficiency requires co-design across algorithms, software, and hardware. High resting power can negate dynamic algorithm gains; Speck’s asynchronous design satisfies the hardware prerequisite of minimal idle power. At the algorithmic level, attention mitigates SNNs’ dynamic imbalance rooted in spatio-temporal invariance, enabling discriminative, input-dependent spiking and energy variance. Structurally and functionally, dynamic SNNs correspond to brain attention mechanisms across circuit, area, neuron, and synaptic levels, focusing on salient information while suppressing noise spikes to simultaneously improve accuracy and reduce energy. The integrated sensing–computing architecture triggers computation only when events occur, delivering mW-level power and ms-level latency suitable for edge scenarios. The work suggests neuromorphic computing should incorporate higher-level brain abstractions beyond neuron-level models to control network responses effectively.
Conclusion
The paper introduces Speck, an eye–brain integrated, fully asynchronous sensing–computing neuromorphic SoC, and a unified attention-based dynamic SNN framework. Together they demonstrate practical dynamic computing with low resting power, input-dependent energy, and improved accuracy at ultra-low latency. Deployments on public event datasets validate 3× energy reductions and significant accuracy gains on Speck. Future directions include expanding supported network types, mixed-precision computing, improving modeling flexibility and precision under energy constraints, and integrating more sophisticated high-level brain mechanisms into neuromorphic hardware for broader real-world applications.
Limitations
Reported Speck power measurements exclude the DVS camera power in benchmark tables, focusing on the processor. The medium-scale architecture prioritizes energy efficiency over modeling flexibility and precision, which may limit supported network types or on-chip complexity. Synchronous training vs. asynchronous deployment can cause accuracy discrepancies; mitigation via M-IF and careful dt/T settings still requires trade-offs. Attention-based masking reduces spikes but can degrade accuracy depending on masking ratios. Large deep-network performance (e.g., HAR-DVS) remains below top ANN results. Dynamic algorithm benefits depend on low resting power; platforms with high idle power will not realize practical energy savings.