Recognition: unknown
NeuroRing: Scaling Spiking Neural Networks via Multi-FPGA Bidirectional Ring Topologies and Stream-Dataflow Architectures
Pith reviewed 2026-05-07 06:13 UTC · model grok-4.3
The pith
A bidirectional ring topology and stream-dataflow architecture on FPGAs enables scalable faster-than-real-time execution of large spiking neural networks while preserving activity statistics.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
NeuroRing implements a stream-dataflow architecture over a bidirectional ring topology on programmable FPGAs to accelerate spiking neural networks. The approach supports modular single- and multi-FPGA deployment and preserves the activity statistics of the reference cortical microcircuit model while achieving a real-time factor of 0.83 on two devices, along with strong and weak scaling and competitive energy efficiency.
What carries the argument
Bidirectional ring topology for inter-device communication combined with stream-dataflow architecture for efficient handling of sparse spike events.
If this is right
- The modular design allows adding more FPGAs to handle larger networks while keeping the same performance characteristics.
- Integration with existing simulation tools enables direct use in neuroscience workflows without major code changes.
- The achieved real-time factor of 0.83 on two devices demonstrates that reconfigurable hardware can outperform real-time biological timescales for full-scale models.
- Competitive energy efficiency positions the approach as viable for sustained large-scale event-driven computations.
Where Pith is reading between the lines
- The ring topology's benefits for sparse communication could extend to other event-driven hardware if the dataflow principles prove platform-independent.
- Testing with varied sparsity patterns would reveal whether the architecture needs tuning for different application domains beyond neural circuits.
- Hybrid systems combining this FPGA setup with conventional processors might handle mixed workloads more flexibly than either alone.
Load-bearing premise
The bidirectional ring topology and stream-dataflow architecture continue to avoid communication bottlenecks and synchronization overhead when scaled beyond two FPGAs or applied to networks with different spike sparsity patterns.
What would settle it
Running the cortical microcircuit on three or more FPGAs or on a network with denser spikes and observing a real-time factor above 1.0 or clear deviations in activity statistics would show the design fails to scale as claimed.
Figures
read the original abstract
Spiking neural networks (SNNs) are a promising paradigm for energy-efficient event-driven computation, but large-scale SNN execution remains challenging because sparse spike communication and synchronization can dominate runtime. Existing solutions across CPU, GPU, ASIC, and FPGA platforms offer different trade-offs between programmability, efficiency, and scalability. To address this gap, we present NeuroRing, a modular and scalable SNN accelerator based on a stream-dataflow architecture and a bidirectional ring topology, implemented in High-Level Synthesis (HLS) on programmable FPGAs. NeuroRing supports modular single- and multi-FPGA deployment and is compatible with existing SNN workflows through integration with the NEST simulator. We evaluate NeuroRing on the cortical microcircuit benchmark and a Sudoku constraint-satisfaction workload. Results show that NeuroRing preserves the key activity statistics of the NEST reference model, achieves faster-than-real-time execution of the full-scale cortical microcircuit with a real-time factor (RTF) of 0.83, exhibits meaningful strong and weak scaling, and provides competitive energy efficiency on two programmable FPGAs. These results position NeuroRing as a flexible and scalable platform for both neuroscience simulation and broader event-driven applications.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents NeuroRing, a modular SNN accelerator based on a bidirectional ring topology and stream-dataflow architecture implemented in HLS on programmable FPGAs. It integrates with the NEST simulator and is evaluated on the cortical microcircuit benchmark and a Sudoku workload, claiming preservation of key activity statistics from the NEST reference, faster-than-real-time execution with RTF of 0.83, meaningful strong and weak scaling, and competitive energy efficiency on a two-FPGA deployment.
Significance. If the multi-FPGA scaling claims hold, NeuroRing would offer a flexible, programmable platform that combines the workflow compatibility of software simulators like NEST with hardware acceleration for event-driven SNNs, potentially advancing large-scale neuroscience modeling and broader event-driven applications. The HLS implementation and modular design are strengths that support reproducibility and adaptability.
major comments (1)
- [§5 (Evaluation)] §5 (Evaluation): All reported quantitative results (RTF of 0.83, energy efficiency, and claims of 'meaningful strong and weak scaling' plus 'modular single- and multi-FPGA deployment') derive exclusively from a two-FPGA configuration. No analytic bound on communication volume, measured per-FPGA hop latency, or weak-scaling experiments that vary FPGA count while holding neurons per FPGA constant are provided. This is load-bearing for the central claim, as a bidirectional ring has O(n) worst-case hop count and cortical microcircuit spike traffic is sparse but temporally correlated, risking accumulated delays that could alter RTF or spike timing for n>2.
minor comments (2)
- [Abstract] Abstract: The claim that NeuroRing 'preserves the key activity statistics' should explicitly name the compared measures (e.g., firing rates, burst statistics, or pairwise correlations) rather than leaving them implicit.
- [§3 (Architecture)] §3 (Architecture): The stream-dataflow fabric description would benefit from a diagram or pseudocode clarifying how spike packets are serialized and routed to avoid ambiguity in the bidirectional ring implementation.
Simulated Author's Rebuttal
We thank the referee for the careful reading of the manuscript and the focused comment on the evaluation of scaling behavior. We address the points raised below and indicate the revisions we will make to strengthen the presentation of results.
read point-by-point responses
-
Referee: [§5 (Evaluation)] §5 (Evaluation): All reported quantitative results (RTF of 0.83, energy efficiency, and claims of 'meaningful strong and weak scaling' plus 'modular single- and multi-FPGA deployment') derive exclusively from a two-FPGA configuration. No analytic bound on communication volume, measured per-FPGA hop latency, or weak-scaling experiments that vary FPGA count while holding neurons per FPGA constant are provided. This is load-bearing for the central claim, as a bidirectional ring has O(n) worst-case hop count and cortical microcircuit spike traffic is sparse but temporally correlated, risking accumulated delays that could alter RTF or spike timing for n>2.
Authors: We agree that the primary quantitative results, including the RTF of 0.83 and energy-efficiency measurements, are reported for the two-FPGA configuration; this was the largest setup available in our laboratory. The manuscript does demonstrate modularity by directly comparing single-FPGA and two-FPGA executions of the identical cortical microcircuit, which constitutes a strong-scaling result for a fixed total neuron count. For weak scaling, the Sudoku workload experiments vary problem size on the available hardware, but we did not perform additional runs that increase FPGA count while holding neurons per FPGA constant beyond the two-device case. We will revise §5 to explicitly state the scope of the evaluated configurations and to qualify the scaling claims accordingly. We will also add an analytic bound on communication volume under the bidirectional ring and stream-dataflow model, together with estimated per-FPGA hop latencies derived from the topology and the pipelined spike transfer mechanism. On the risk of accumulated delays for n>2, the architecture pipelines transfers in both directions of the ring; because cortical connectivity is sparse and predominantly local, the average hop distance remains small even as ring size grows. We will include a short discussion of these factors and their implications for spike timing in the revised manuscript. revision: partial
Circularity Check
No circularity: empirical hardware evaluation with direct measurements
full rationale
The paper describes a concrete FPGA implementation of a stream-dataflow SNN accelerator using a bidirectional ring topology, evaluated empirically on the cortical microcircuit and Sudoku workloads. All quantitative claims (RTF of 0.83, preservation of NEST activity statistics, strong/weak scaling, energy efficiency) are presented as measured outcomes from the two-FPGA prototype rather than as predictions derived from equations, fitted parameters, or self-citations. No derivation chain, ansatz, uniqueness theorem, or renaming of known results appears; the central results are self-contained experimental data and do not reduce to their own inputs by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption NEST cortical microcircuit model produces biologically plausible activity statistics that serve as ground truth
invented entities (1)
-
NeuroRing architecture
no independent evidence
Reference graph
Works this paper leans on
-
[1]
In: Borah, S., Gandhi, T.K., Piuri, V
Agarwal, R., Ghosal, P., Murmu, N., Nandi, D.: Spiking Neural Network in Com- puter Vision: Techniques, Tools and Trends. In: Borah, S., Gandhi, T.K., Piuri, V. (eds.) Advanced Computational and Communication Paradigms. pp. 201–209. Springer Nature, Singapore (2023)
2023
-
[2]
Nest (neural simulation tool),
Gewaltig, M.O., Diesmann, M.: NEST (neural simulation tool). Scholarpedia2(4), 1430 (2007). https://doi.org/10.4249/scholarpedia.1430
-
[3]
International Journal of Neural Systems19(04), 295–308 (Aug 2009)
Ghosh-Dastidar, S., Adeli, H.: Spiking neural networks. International Journal of Neural Systems19(04), 295–308 (Aug 2009)
2009
-
[4]
Frontiers in Computational Neuroscience15(Feb 2021)
Golosio, B., Tiddia, G., De Luca, C., Pastorelli, E., Simula, F., Paolucci, P.S.: Fast Simulations of Highly-Connected Spiking Cortical Models Using GPUs. Frontiers in Computational Neuroscience15(Feb 2021)
2021
-
[5]
Frontiers in Neuroscience15 (Jan 2022)
Heittmann, A., Psychou, G., Trensch, G., Cox, C.E., Wilcke, W.W., Diesmann, M., Noll, T.G.: Simulating the Cortical Microcircuit Significantly Faster Than Real Time on the IBM INC-3000 Neural Supercomputer. Frontiers in Neuroscience15 (Jan 2022)
2022
-
[6]
J Gille, S Furber, A Rowley: Script controlling the simulation of a sin- gle game of Sudoku — NEST Simulator Documentation, https://nest- simulator.readthedocs.io/en/latest/auto_examples/sudoku/sudoku_solver.html
-
[7]
Kauth, K., Stadtmann, T., Sobhani, V., Gemmeke, T.: neuroAIx-Framework: de- sign of future neuroscience simulation systems exhibiting execution of the cortical microcircuitmodel20×fasterthanbiologicalreal-time.FrontiersinComputational Neuroscience17(Apr 2023)
2023
-
[8]
Neuromorphic Computing and Engineering2(2), 021001 (Mar 2022)
Kurth, A.C., Senk, J., Terhorst, D., Finnerty, J., Diesmann, M.: Sub-realtime sim- ulation of a neuronal network of natural density. Neuromorphic Computing and Engineering2(2), 021001 (Mar 2022)
2022
-
[9]
IEEE Access12, 150334–150353 (2024)
Lindqvist, B.A., Podobas, A.: Algorithms for Fast Spiking Neural Network Simu- lation on FPGAs. IEEE Access12, 150334–150353 (2024)
2024
-
[10]
IEEE Transactions on Artificial Intelligence6(8), 2061–2072 (Aug 2025)
Pignari, R., Fra, V., Macii, E., Urgese, G.: Efficient Solution Validation of Con- straint Satisfaction Problems on Neuromorphic Hardware: The Case of Sudoku Puzzles. IEEE Transactions on Artificial Intelligence6(8), 2061–2072 (Aug 2025)
2061
-
[11]
Cerebral Cortex (New York, NY)24(3), 785–806 (Mar 2014)
Potjans, T.C., Diesmann, M.: The Cell-Type Specific Cortical Microcircuit: Relat- ing Structure and Activity in a Full-Scale Spiking Network Model. Cerebral Cortex (New York, NY)24(3), 785–806 (Mar 2014)
2014
-
[12]
Philosoph- ical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences378(2164), 20190160 (Dec 2019)
Rhodes, O., Peres, L., Rowley, A.G.D., Gait, A., Plana, L.A., Brenninkmeijer, C., Furber, S.B.: Real-time cortical simulation on neuromorphic hardware. Philosoph- ical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences378(2164), 20190160 (Dec 2019)
2019
- [13]
-
[14]
Vreeken, J.: Spiking neural networks, an introduction (2023), https://webdoc.sub.gwdg.de/ebook/serien/ah/UU-CS/2003-008.pdf
2023
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.