Networks of spiking neurons: the third generation of neural network models. Neural Networks, 10(9):1659–1671
9 Pith papers cite this work. Polarity classification is still indexing.
Citation summary
Years: 2026 (9)
Verdicts: UNVERDICTED (9)
Roles: background (2)
Polarities: background (2)
Representative citing papers
- Closing the Theory-Practice Gap in Spiking Transformers via Effective Dimension
  Spiking attention is a universal approximator of permutation-equivariant functions with ε-approximation requiring Ω(L_f² nd / ε²) spikes, but low effective dimensions (47-89) allow T=4 timesteps in practice.
- Criticality-Constrained Iterative Pruning for Energy-Efficient Spiking Neural Networks via Combined Importance Scoring
  CQP fuses magnitude and criticality into an importance metric for iterative SNN pruning, delivering 95.6% MNIST accuracy at 90% sparsity and 73% energy reduction at 70% sparsity.
- Spatial Partial Functionalization of Neural Networks based on Noise Fields
  A crossing activation function combined with virtual noise fields allows one neural network to learn multiple functions assigned to different noise locations, with capacity rising when noise arrangement matches function proximity.
- A Risk Decomposition Framework for Pre-Hoc Fine-Tuning Prediction
  Formulates pre-hoc fine-tuning prediction as stochastic estimation, proves lower bound on optimization variance decay rate, and introduces a three-regime predictability phase diagram.
- SupraSNN: Exploiting Synapse-Level Parallelism in Spiking Neural Network Accelerators through Co-Optimized Mapping and Scheduling
  SupraSNN introduces a superscalar-inspired SNN accelerator with decoupled synapse and neuron units, multi-cast/merge trees, and partitioning/scheduling that reports 47.6% lower latency and 5.6x better energy efficiency than prior FPGA SNN designs on MNIST and SHD tasks.
- SymbolicLight V1: Spike-Gated Dual-Path Language Modeling with High Activation Sparsity and Sub-Billion-Scale Pre-Training Evidence
  A 194M-parameter spiking dual-path model trained on 3B Chinese-English tokens achieves held-out PPL 8.88-8.93 at >89% per-element sparsity, trailing GPT-2 201M by 7.7% while showing that LIF temporal integration outperforms simple top-k masking at matched sparsity.
- Dual-Timescale Memory in a Spiking Neuron-Astrocyte Network for Efficient Navigation
  A neuron-astrocyte network with dual-timescale memory reduces median path lengths up to sixfold in partially observable grid-world navigation tasks.
- TuneAhead: Predicting Fine-tuning Performance Before Full Training Begins
  TUNEAHEAD predicts fine-tuning performance from meta-features and short probes, reporting RMSE 1.47 and 95.1% of predictions within 3 points on 370 held-out runs of Qwen2.5-7B.
- Joint sparse coding and temporal dynamics support context reconfiguration
  Joint sparse coding and temporal dynamics in mPFC and computational networks reduce cross-context interference and enhance separability, enabling better retention in lifelong learning without extra heuristics.