pith. machine review for the scientific record.

arxiv: 2604.15997 · v1 · submitted 2026-04-17 · 💻 cs.NE

Recognition: unknown

Combining Convolution and Delay Learning in Recurrent Spiking Neural Networks

Eleonora Cicciarella, Lúcio Folly Sanches Zebendo, Michele Rossi

Pith reviewed 2026-05-10 07:07 UTC · model grok-4.3

classification 💻 cs.NE
keywords spiking neural networks · recurrent connections · convolutional layers · axonal delay learning · audio classification · parameter efficiency · edge systems · temporal modeling

The pith

Convolutional recurrent connections paired with learned axonal delays cut recurrent parameters by about 99% and speed up inference 52x in spiking networks while retaining accuracy on audio classification.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper extends a recurrent spiking neural network approach that learns axonal delays by replacing fully connected recurrent layers with convolutional ones. Tests on audio classification show the change preserves accuracy while cutting recurrent parameters dramatically and accelerating inference. A sympathetic reader cares because spiking networks target resource-limited edge devices, and this makes recurrent versions practical for handling temporal signals without the full cost of dense connections. The work demonstrates that local connectivity patterns suffice to keep the benefits of delay learning intact in this setting.

Core claim

Integrating convolutional recurrent connections with the delay learning mechanism produces a streamlined recurrent spiking neural network architecture that matches the accuracy of the fully connected baseline on audio classification tasks but requires far fewer recurrent parameters and runs substantially faster at inference time.

What carries the argument

Convolutional recurrent connections that replace dense recurrent layers, combined with runtime learning of axonal delays, enable parameter sharing for temporal modeling in spiking networks.
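
As a concreteness aid, the sketch below shows one way such a cell could look: a leaky integrate-and-fire layer whose recurrent input is a shared 1D convolution over spikes fetched from a delay buffer. This is a minimal sketch, not the authors' implementation; the class name, the per-channel integer delays, and all hyperparameters are assumptions, and practical delay learning would need a differentiable parameterization rather than the hard rounding used here.

```python
import torch
import torch.nn as nn


class ConvDelayRecurrentCell(nn.Module):
    """LIF cell whose recurrent path is a shared 1D convolution applied to
    spikes read from a per-channel delayed history buffer (illustrative)."""

    def __init__(self, channels: int, kernel_size: int = 3,
                 max_delay: int = 8, tau: float = 0.9, threshold: float = 1.0):
        super().__init__()
        self.max_delay, self.tau, self.threshold = max_delay, tau, threshold
        # Shared recurrent kernel: channels * channels * kernel_size weights,
        # versus N * N for a dense recurrent layer over N = channels * length.
        self.rec_conv = nn.Conv1d(channels, channels, kernel_size,
                                  padding=kernel_size // 2, bias=False)
        # One learnable delay per presynaptic channel, shared across all
        # spatial positions (the sharing that yields the parameter savings).
        self.delays = nn.Parameter(torch.rand(channels) * max_delay)

    def forward(self, ff_current, spike_history, membrane):
        # ff_current:    (B, C, L) feedforward input current at this time step
        # spike_history: (B, max_delay + 1, C, L), index 0 = most recent step
        # membrane:      (B, C, L) leaky integrate-and-fire potentials
        B, C, L = ff_current.shape
        # Hard rounding is non-differentiable; actual delay learning needs a
        # differentiable scheme (e.g., learnable-spacing dilated convolutions
        # as in ref. [10]).
        d = self.delays.round().clamp(0, self.max_delay).long()
        idx = d.view(1, 1, C, 1).expand(B, 1, C, L)
        delayed = spike_history.gather(1, idx).squeeze(1)        # (B, C, L)
        membrane = self.tau * membrane + ff_current + self.rec_conv(delayed)
        spikes = (membrane >= self.threshold).float()
        membrane = membrane - spikes * self.threshold            # soft reset
        spike_history = torch.cat([spikes.unsqueeze(1),
                                   spike_history[:, :-1]], dim=1)
        return spikes, spike_history, membrane


# Roll the cell over a synthetic input (shapes only, no training).
cell = ConvDelayRecurrentCell(channels=32)
B, L = 4, 64
hist = torch.zeros(B, cell.max_delay + 1, 32, L)
mem = torch.zeros(B, 32, L)
for t in range(100):
    out, hist, mem = cell(torch.randn(B, 32, L), hist, mem)
```

Under this reading, each time step costs one gather plus one small convolution instead of a dense matrix product, which is plausibly where the reported inference speedup comes from.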

If this is right

  • Recurrent spiking networks become viable for memory-constrained edge hardware due to the sharp drop in recurrent parameters.
  • Inference becomes fast enough for real-time audio processing while keeping the advantages of learned delays.
  • The modeling benefits of delay learning survive the switch to convolutional connectivity on temporal classification tasks.
  • Overall network design simplifies without loss of performance for the tested audio scenario.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same convolutional replacement could be tested on other sequential data types where local structure is present.
  • This points toward hybrid architectures that mix convolution and delay learning for broader sensor applications.
  • Efficiency gains may compound if the approach is combined with other spiking network optimizations on hardware.

Load-bearing premise

Convolutional recurrent connections can capture the same temporal dependencies as fully connected recurrent layers when axonal delays are also learned.

What would settle it

A side-by-side accuracy comparison on the audio classification task where the convolutional version falls short of the fully connected delay-learning baseline by more than a small margin.
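
If one wanted to run that comparison, the protocol matters as much as the numbers. Below is a hedged sketch of such a harness; `make_network`, `train_model`, and `evaluate` are hypothetical placeholders standing in for the authors' released training code, and the margin and hyperparameters are arbitrary.

```python
# Hypothetical settling protocol: two networks identical in neuron count,
# spike encoding, and training regime, differing only in recurrent
# connectivity. make_network, train_model, and evaluate are placeholder
# names, not functions from the authors' repository.
import torch

def settle(dataset, seed: int = 0, margin: float = 0.01):
    accuracy = {}
    for connectivity in ("fully_connected", "convolutional"):
        torch.manual_seed(seed)                     # identical initialization draws
        net = make_network(recurrent=connectivity)  # only this argument varies
        train_model(net, dataset["train"], epochs=100, lr=1e-3)
        accuracy[connectivity] = evaluate(net, dataset["test"])
    gap = accuracy["fully_connected"] - accuracy["convolutional"]
    # The claim fails only if the convolutional variant trails the dense
    # baseline by more than a small margin.
    return accuracy, gap > margin
```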

Figures

Figures reproduced from arXiv: 2604.15997 by Eleonora Cicciarella, Lúcio Folly Sanches Zebendo, Michele Rossi.

Figure 1. Overview of the complete network architecture.
Figure 2. Design of the convolutional recurrent delay unit.
Original abstract

Spiking neural networks (SNNs) are rapidly gaining momentum as an alternative to conventional artificial neural networks in resource constrained edge systems. In this work, we continue a recent research line on recurrent SNNs where axonal delays are learned at runtime along with the other network parameters. The first proposed approach, dubbed DelRec, demonstrated the benefit of recurrent delay learning in SNNs. Here, we extend it by advocating the use of convolutional recurrent connections in conjunction with the DelRec delay learning mechanism. According to our tests on an audio classification task, this leads to a streamlined architecture with smaller memory footprint (around 99% savings in terms of number of recurrent parameters) and a much faster (52x) inference time, while retaining DelRec's accuracy. Our code is available at: https://github.com/luciozebendo/delrec_snn/tree/conv_delays

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript extends the DelRec recurrent spiking neural network, which learns axonal delays alongside weights, by replacing fully-connected recurrent layers with convolutional recurrent connections. On an audio classification task, the resulting ConvDelRec architecture is reported to achieve a ~99% reduction in recurrent parameters and 52x faster inference while retaining DelRec's accuracy. The code is released at a public GitHub repository.

Significance. If the empirical results hold under rigorous validation, the work would be significant for neuromorphic and edge-computing applications of SNNs, as it shows how convolutional parameter sharing can be combined with delay learning to obtain large efficiency gains without apparent accuracy loss. The public code release is a clear strength that supports reproducibility and future extensions.

major comments (2)
  1. [§4 and Table 1] §4 (Experiments) and Table 1: the headline claim of retained accuracy with 99% recurrent-parameter reduction rests on a direct comparison to DelRec, yet no ablation is presented that holds neuron count, spike encoding, and training protocol fixed while swapping only the recurrent connectivity type. Without this, it is impossible to determine whether convolutional recurrence preserves the per-synapse delay-learning expressivity of the original fully-connected DelRec or whether the audio task simply does not expose any loss of modeling power.
  2. [§3.2] §3.2 (Convolutional recurrent layer with delays): the description does not specify whether axonal delays are learned independently per spatial position or are shared across the convolutional kernel. If delays are shared, the claimed parameter savings are achieved only by reducing the very expressivity that DelRec was designed to provide; if delays remain per-position, most of the 99% savings disappear. This architectural detail is load-bearing for the central efficiency claim.
minor comments (2)
  1. [Abstract and §4] The abstract and §4 do not name the specific audio dataset, its size, or the exact train/validation/test splits, making it difficult to contextualize the reported accuracy numbers or to reproduce the experiments from the provided code link alone.
  2. [§4] No mention is made of the number of independent runs, standard deviations, or statistical tests supporting the accuracy-retention claim; adding these would strengthen the empirical section without altering the core contribution.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed review. The comments identify two important areas where the manuscript can be strengthened: the need for a controlled ablation of recurrent connectivity and explicit clarification of how delays are handled in the convolutional layers. We address both points below and will incorporate revisions to improve the rigor and clarity of the presentation.

Point-by-point responses
  1. Referee: [§4 and Table 1] §4 (Experiments) and Table 1: the headline claim of retained accuracy with 99% recurrent-parameter reduction rests on a direct comparison to DelRec, yet no ablation is presented that holds neuron count, spike encoding, and training protocol fixed while swapping only the recurrent connectivity type. Without this, it is impossible to determine whether convolutional recurrence preserves the per-synapse delay-learning expressivity of the original fully-connected DelRec or whether the audio task simply does not expose any loss of modeling power.

    Authors: We agree that a controlled ablation isolating only the change from fully-connected to convolutional recurrent connectivity would provide stronger evidence. Our current experiments compare the complete ConvDelRec model against the published DelRec baseline on the same audio task, using comparable neuron counts and the same spike encoding and training protocol. However, this does not fully decouple the connectivity change from other implementation details. In the revised manuscript we will add a new ablation experiment that starts from the DelRec architecture and replaces only its recurrent layers with the convolutional-delay version while keeping neuron count, spike encoding, and training protocol identical. The results of this ablation will be reported in an updated Table 1 and discussed in §4. revision: yes

  2. Referee: [§3.2] §3.2 (Convolutional recurrent layer with delays): the description does not specify whether axonal delays are learned independently per spatial position or are shared across the convolutional kernel. If delays are shared, the claimed parameter savings are achieved only by reducing the very expressivity that DelRec was designed to provide; if delays remain per-position, most of the 99% savings disappear. This architectural detail is load-bearing for the central efficiency claim.

    Authors: We thank the referee for pointing out this ambiguity. In the ConvDelRec architecture the axonal delays are learned once per convolutional kernel and are therefore shared across all spatial positions to which the kernel is applied. This sharing is the mechanism that produces the reported ~99% reduction in recurrent parameters. We acknowledge that the design trades per-position delay expressivity for parameter efficiency, and that the audio task may not require the full per-position diversity present in fully-connected DelRec. We will revise §3.2 to state this sharing rule explicitly, include the corresponding parameter-count formulas (an illustrative count appears in the sketch below), and add a short discussion of the expressivity trade-off and the conditions under which per-position delays might be preferable. revision: yes
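
To see how a sharing rule of this kind lands in the ~99% regime, a back-of-the-envelope count helps. The sizes below (1024 recurrent units for the dense baseline; 64 channels with a width-3 kernel, and one delay per kernel element, for the convolutional variant) are illustrative assumptions, not values from the paper.

```python
# Back-of-the-envelope count under the per-kernel delay-sharing rule
# described above. All sizes are illustrative, not taken from the paper.
N = 1024          # recurrent units in the fully connected baseline
C, k = 64, 3      # channels and kernel width in the convolutional variant

fc   = N * N + N * N          # per-synapse weights plus per-synapse delays
conv = C * C * k + C * C * k  # shared kernel weights plus shared delays

print(f"dense: {fc}, conv: {conv}, savings: {1 - conv / fc:.4f}")
# dense: 2097152, conv: 24576, savings: 0.9883 -- the ~99% regime
```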

Circularity Check

0 steps flagged

No circularity: empirical architecture extension with no derivations

full rationale

The paper proposes combining convolutional recurrent connections with the existing DelRec delay-learning mechanism and validates the resulting architecture solely through experiments on an audio classification task. No mathematical derivations, first-principles predictions, or equations appear in the provided text; claims of 99% recurrent-parameter reduction, 52x faster inference, and retained accuracy are presented as direct outcomes of the reported tests rather than reductions to fitted inputs or self-citations. The central modeling assumption (that convolutional recurrence preserves DelRec's temporal integration power) is treated as an empirical question to be checked, not as a definitional or self-referential premise. No load-bearing self-citation chains or ansatz smuggling are present.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is an empirical machine-learning study focused on architectural modification and benchmarking. No free parameters, axioms, or invented entities are described in the abstract.

pith-pipeline@v0.9.0 · 5457 in / 1096 out tokens · 76089 ms · 2026-05-10T07:07:10.240968+00:00 · methodology


Reference graph

Works this paper leans on

11 extracted references

  1. [1]

    DelRec: learning delays in recurrent spiking neural networks,

    A. Queant, U. Rançon, B. R. Cottereau, and T. Masquelier, “DelRec: learning delays in recurrent spiking neural networks,” 2025

  2. [2]

    Surrogate gradient learning in spiking neural networks,

    E. O. Neftci, H. Mostafa, and F. Zenke, “Surrogate gradient learning in spiking neural networks,” IEEE Signal Process. Mag., vol. 36, 2019

  3. [3]

    A surrogate gradient spiking baseline for speech command recognition,

    A. Bittar and P. N. Garner, “A surrogate gradient spiking baseline for speech command recognition,” Frontiers in Neuroscience, vol. 16, 2022

  4. [4]

    Advancing spatio-temporal processing through adaptation in spiking neural networks,

    M. Baronig, R. Ferrand, S. Sabathiel, and R. Legenstein, “Advancing spatio-temporal processing through adaptation in spiking neural networks,” Nature Communications, vol. 16, 2025

  5. [5]

    Co-learning synaptic delays, weights and adaptation in spiking neural networks,

    L. Deckers, L. Van Damme, W. Van Leekwijck, I. J. Tsang, and S. Latré, “Co-learning synaptic delays, weights and adaptation in spiking neural networks,” Frontiers in Neuroscience, vol. 18, 2024

  6. [6]

    DelGrad: Exact event-based gradients for training delays and weights on spiking neuromorphic hardware,

    J. Göltz et al., “DelGrad: Exact event-based gradients for training delays and weights on spiking neuromorphic hardware,” Nature Communications, vol. 16, 2025

  7. [7]

    ASRC-SNN: Adaptive skip recurrent connection spiking neural network,

    S. Xu, J. Zhang, Z. Wang, R. Jiang, R. Yan, and H. Tang, “ASRC-SNN: Adaptive skip recurrent connection spiking neural network,” 2025

  8. [8]

    Training spiking neural networks using lessons from deep learning,

    J. K. Eshraghian, M. Ward, E. O. Neftci, X. Wang, G. Lenz, and G. Dwivedi, “Training spiking neural networks using lessons from deep learning,” Proceedings of the IEEE, vol. 111, 2023

  9. [9]

    Neuromorphic silicon neurons and large-scale neural networks: challenges and opportunities,

    C.-S. Poon and K. Zhou, “Neuromorphic silicon neurons and large-scale neural networks: challenges and opportunities,” Frontiers in Neuroscience, vol. 5, 2011

  10. [10]

    Learning delays in spiking neural networks using dilated convolutions with learnable spacings,

    I. Hammouamri, I. Khalfaoui-Hassani, and T. Masquelier, “Learning delays in spiking neural networks using dilated convolutions with learnable spacings,” in Int. Conf. Learn. Represent. (ICLR), 2024

  11. [11]

    The Heidelberg spiking data sets for the systematic evaluation of spiking neural networks,

    B. Cramer, Y. Stradmann, J. Schemmel, and F. Zenke, “The Heidelberg spiking data sets for the systematic evaluation of spiking neural networks,” IEEE Transactions on Neural Networks and Learning Systems, vol. 33, 2022