pith. sign in

arxiv: 2605.15355 · v1 · pith:A7NOOQKBnew · submitted 2026-05-14 · 💻 cs.LG

Federated Learning of Spiking Neural Networks under Heterogeneous Temporal Resolutions

Pith reviewed 2026-05-19 16:50 UTC · model grok-4.3

classification 💻 cs.LG
keywords federated learningspiking neural networkstemporal resolutionheterogeneous datamodel aggregationedge devicestime series
0
0 comments X p. Extension
pith:A7NOOQKB Add to your LaTeX paper What is a Pith Number?
\usepackage{pith}
\pithnumber{A7NOOQKB}

Prints a linked pith:A7NOOQKB badge after your title and writes the identifier into PDF metadata. Compiles on arXiv with no extra files. Learn more

The pith

Adaptation methods allow federated spiking neural networks to recover accuracy lost when clients sample data at different temporal resolutions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a framework for federated learning of spiking neural networks where edge devices collect time-series data at varying temporal resolutions due to hardware constraints. It shows that standard federated averaging of parameters performs poorly under such mismatch, yet specific adaptation techniques for integrating neuron parameters and states can substantially restore the lost accuracy. A sympathetic reader cares because this setup lets resource-limited devices keep their efficient local sampling rates, collaborate without sharing raw data, and retain the sparse spiking that makes SNNs energy-efficient. The evaluation covers two standard SNN datasets across multiple heterogeneity scenarios to demonstrate the recovery.

Core claim

The central claim is that by designing adaptation methods to integrate neuron parameters learned from data at different temporal resolutions during model aggregation, the federated process can remain effective for SNNs, enabling each client to train at its local resolution while producing a compatible global model that recovers most accuracy otherwise lost to the temporal mismatch.

What carries the argument

Adaptation rules for rescaling and combining stateful neuron parameters during federated averaging to account for differences in temporal resolution.

If this is right

  • Each client can train using only its native temporal resolution without forcing data resampling or uniform hardware.
  • The resulting global model remains compatible and effective for inference regardless of the resolutions used by individual clients.
  • The sparse spike communication and associated energy savings of SNNs are preserved under the adapted aggregation.
  • Performance approaches that of homogeneous-resolution federated training on the tested audio and gesture datasets.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same adaptation idea could apply to federated learning of other stateful models such as certain recurrent networks.
  • Clients might deliberately select lower resolutions to conserve energy with only small effects on global performance if the rules are applied.
  • The framework could be extended to cases where resolution changes dynamically during the training process.

Load-bearing premise

Neuron parameters learned at one temporal resolution can be meaningfully integrated with those from another through the proposed aggregation rules without requiring changes to the underlying SNN architecture or loss of spike sparsity benefits.

What would settle it

An experiment showing that the adaptation methods produce no meaningful accuracy gain over naive averaging when resolution differences are large on the SHD or DVS-Gesture datasets would falsify the central claim.

Figures

Figures reproduced from arXiv: 2605.15355 by Ay\c{c}a \"Oz\c{c}elikkale, Sanja Karilanova, Subhrakanti Dey.

Figure 1
Figure 1. Figure 1: Overview of the federated learning setup, illustrated with a smartwatch example. [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Illustrative Network Architecture clients back. Local models are naturally tied to temporal resolution: a model trained in Tk may not generalize directly to data sampled in Tj ̸= Tk. Naive parameter averaging therefore conflates models operating at different temporal resolutions, which can degrade the global model. This article focuses on the question: How should models, which are trained using data with d… view at source ↗
read the original abstract

Spiking neural networks (SNNs) are biologically inspired energy-efficient models that use sparse binary spike-based communication between neurons, making them attractive for resource-constrained edge devices. Federated learning enables such devices to train collaboratively without sharing raw data. In time-series applications, edge devices often collect data at different time resolutions due to hardware and energy constraints. This temporal heterogeneity poses a fundamental challenge for federated learning: parameters learned at one temporal resolution do not necessarily transfer directly to another, which might result in the naive federated averaging being ineffective. Targeting SNNs and, more broadly, deep networks with stateful neurons, we propose a federated learning framework that addresses this temporal resolution mismatch. We investigate how neuron parameters learned from data at different temporal resolutions and model aggregation should be integrated. We evaluate the proposed framework across two SNN-native benchmark datasets (SHD and DVS-Gesture) under a range of resolution heterogeneity scenarios. Our results show that the proposed adaptation methods can substantially recover accuracy lost due to temporal mismatch, hence enabling each client to train at their local temporal resolution while remaining compatible with the global model.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes a federated learning framework for spiking neural networks (SNNs) that addresses temporal resolution heterogeneity across clients in time-series applications. It investigates methods for integrating neuron parameters (weights, thresholds, and stateful dynamics) learned at different temporal resolutions and evaluates adaptation rules during model aggregation. Experiments on SHD and DVS-Gesture benchmarks under varied heterogeneity scenarios claim that the proposed adaptations substantially recover accuracy lost from naive federated averaging, allowing clients to train locally at native resolutions while maintaining global model compatibility.

Significance. If the central claim holds with rigorous validation, the result would be significant for practical deployment of energy-efficient SNNs on heterogeneous edge devices, where temporal sampling rates vary due to hardware constraints. The focus on stateful neuron dynamics in FL is timely, and use of public SNN benchmarks strengthens reproducibility. However, the absence of explicit handling for discretization effects in the provided description limits immediate impact assessment.

major comments (2)
  1. [Methods (adaptation and aggregation)] Methods section on adaptation rules: The description of how neuron parameters from mismatched temporal resolutions are integrated during aggregation does not specify whether leak factors, time constants, or thresholds are rescaled to account for different Δt. Since SNN membrane dynamics follow discretized differential equations (e.g., leak factor α = exp(-Δt/τ)), simple averaging or adaptation without explicit rescaling risks invalidating the dynamics and undermining the compatibility claim. Please provide the exact mathematical formulation of the adaptation rules and any invariance arguments.
  2. [Experiments and results] Experimental results (likely §5 or tables): The abstract asserts that adaptation methods 'substantially recover accuracy' but provides no quantitative metrics, error bars, baseline comparisons (e.g., vs. naive FedAvg or resolution-normalized variants), or heterogeneity schedule details. This makes it impossible to evaluate whether recovered accuracy is general or an artifact of specific setups. Include full tables with per-scenario accuracies, standard deviations over runs, and ablation on the adaptation components.
minor comments (2)
  1. [Introduction and methods] Clarify notation for temporal resolution (e.g., consistent use of Δt vs. sampling rate) throughout to avoid ambiguity in the heterogeneity scenarios.
  2. [Figures] Ensure all figures plotting accuracy vs. heterogeneity levels include error bars and label the exact adaptation variants compared.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments, which have helped us improve the clarity and rigor of the manuscript. We address each major comment point by point below and have made corresponding revisions to strengthen the presentation of the adaptation rules and experimental results.

read point-by-point responses
  1. Referee: [Methods (adaptation and aggregation)] Methods section on adaptation rules: The description of how neuron parameters from mismatched temporal resolutions are integrated during aggregation does not specify whether leak factors, time constants, or thresholds are rescaled to account for different Δt. Since SNN membrane dynamics follow discretized differential equations (e.g., leak factor α = exp(-Δt/τ)), simple averaging or adaptation without explicit rescaling risks invalidating the dynamics and undermining the compatibility claim. Please provide the exact mathematical formulation of the adaptation rules and any invariance arguments.

    Authors: We appreciate the referee's emphasis on preserving the underlying dynamics. The original manuscript described the adaptation at a conceptual level in Section 4, focusing on parameter integration strategies without fully expanding the rescaling mechanics for temporal parameters. We agree this requires explicit treatment. In the revised manuscript, we now include the precise formulations: for a client operating at timestep Δt_i, the leak factor is rescaled as α_i = exp(-Δt_i / τ) with the time constant τ held invariant; thresholds are adjusted proportionally to the effective integration window, and membrane states are normalized by the ratio of time constants. We also add a brief invariance argument demonstrating that these rescalings maintain equivalent spike-timing behavior under linear time transformations, ensuring aggregation produces a compatible global model. These additions directly address the discretization concern. revision: yes

  2. Referee: [Experiments and results] Experimental results (likely §5 or tables): The abstract asserts that adaptation methods 'substantially recover accuracy' but provides no quantitative metrics, error bars, baseline comparisons (e.g., vs. naive FedAvg or resolution-normalized variants), or heterogeneity schedule details. This makes it impossible to evaluate whether recovered accuracy is general or an artifact of specific setups. Include full tables with per-scenario accuracies, standard deviations over runs, and ablation on the adaptation components.

    Authors: We acknowledge that the abstract and results section were presented concisely, which limited the ability to assess the quantitative strength of the claims. In the revised manuscript, we have expanded Section 5 with full tables reporting mean accuracy and standard deviation (over 5 random seeds) for each heterogeneity scenario on both SHD and DVS-Gesture. These tables now include direct comparisons to naive FedAvg, resolution-normalized baselines, and per-client local training. We also detail the exact heterogeneity schedules (specific Δt values assigned to clients) and provide an ablation study isolating the contribution of each adaptation component (weights, thresholds, and stateful dynamics). This allows readers to evaluate the generality of the accuracy recovery. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical evaluation on public benchmarks

full rationale

The paper proposes adaptation methods for federated SNN training under temporal resolution heterogeneity and evaluates them directly on SHD and DVS-Gesture datasets. No derivation chain, first-principles result, or prediction is shown to reduce by construction to fitted inputs or self-citations. The central claims rest on experimental recovery of accuracy rather than any self-definitional, fitted-input, or uniqueness-imported step. This is a standard empirical study that remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are described. The central claim rests on the unstated assumption that temporal mismatch can be mitigated by parameter adaptation without further architectural constraints.

pith-pipeline@v0.9.0 · 5737 in / 1098 out tokens · 38913 ms · 2026-05-19T16:50:40.580981+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

47 extracted references · 47 canonical work pages

  1. [1]

    Brendan McMahan

    Peter Kairouz and H. Brendan McMahan. Advances and open problems in federated learning. Foundations and Trends in Machine Learning, 14(1-2):1–210, 06 2021. 9

  2. [2]

    Parizi, and Fahad Saeed

    Mohammed Aledhari, Rehma Razzak, Reza M. Parizi, and Fahad Saeed. Federated learning: A survey on enabling technologies, protocols, and applications.IEEE Access, 8:140699–140725, 2020

  3. [3]

    Data centers on wheels: Emissions from computing onboard autonomous vehicles.IEEE Micro, 43(1):29–39, 2023

    Soumya Sudhakar, Vivienne Sze, and Sertac Karaman. Data centers on wheels: Emissions from computing onboard autonomous vehicles.IEEE Micro, 43(1):29–39, 2023

  4. [4]

    Bipin Rajendran, Abu Sebastian, Michael Schmuker, Narayan Srinivasa, and Evangelos Eleft- heriou. Low-power neuromorphic hardware for signal processing applications: A review of architectural and system-level design approaches.IEEE Signal Processing Magazine, 36(6):97– 110, 2019

  5. [5]

    Fonseca Guerra, Prasad Joshi, Philipp Plank, and Sumedh R

    Mike Davies, Andreas Wild, Garrick Orchard, Yulia Sandamirskaya, Gabriel A. Fonseca Guerra, Prasad Joshi, Philipp Plank, and Sumedh R. Risbud. Advancing neuromorphic computing with Loihi: A survey of results and outlook.Proceedings of the IEEE, 109(5):911–934, 2021

  6. [6]

    Jason Yik and et. al. Neurobench: A framework for benchmarking neuromorphic computing algorithms and systems.Nature Communications, 16, 2025

  7. [7]

    Dieter, S

    W.R. Dieter, S. Datta, and Wong Key Kai. Power reduction by varying sampling rate. InProc. of the Inter. Symp. on Low Power Electronics and Design, pages 227–232, 2005

  8. [8]

    Big data reduction methods: a survey.Data Science and Engineering, 1:265–284, 2016

    Muhammad Habib ur Rehman, Chee Sun Liew, Assad Abbas, Prem Prakash Jayaraman, Teh Ying Wah, and Samee U Khan. Big data reduction methods: a survey.Data Science and Engineering, 1:265–284, 2016

  9. [9]

    Distributed learning in wireless networks: Recent progress and future challenges.IEEE Journal on Selected Areas in Communications, 39(12):3579–3605, 2021

    Mingzhe Chen, Deniz Gündüz, Kaibin Huang, Walid Saad, Mehdi Bennis, Aneta Vulgarakis Feljan, and H Vincent Poor. Distributed learning in wireless networks: Recent progress and future challenges.IEEE Journal on Selected Areas in Communications, 39(12):3579–3605, 2021

  10. [10]

    Communication-efficient and distributed learning over wireless networks: Principles and applications.Proceedings of the IEEE, 109(5):796–819, 2021

    Jihong Park, Sumudu Samarakoon, Anis Elgabli, Joongheon Kim, Mehdi Bennis, Seong-Lyun Kim, and Mérouane Debbah. Communication-efficient and distributed learning over wireless networks: Principles and applications.Proceedings of the IEEE, 109(5):796–819, 2021

  11. [11]

    State space models for event cameras

    Nikola Zubic, Mathias Gehrig, and Davide Scaramuzza. State space models for event cameras. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 5819–5828, June 2024

  12. [12]

    Low-power event-based face detection with asynchronous neuromorphic hardware.Inter

    Caterina Caccavella, Federico Paredes-Vall’es, Marco Cannici, and Lyes Khacef. Low-power event-based face detection with asynchronous neuromorphic hardware.Inter. Joint Conf. on Neural Networks (IJCNN), pages 1–10, 2024

  13. [13]

    Zero-shot temporal resolution domain adaptation for spiking neural networks.Neural Networks, 199:108483, 2026

    Sanja Karilanova, Maxime Fabre, Emre Neftci, and Ayça Özçelikkale. Zero-shot temporal resolution domain adaptation for spiking neural networks.Neural Networks, 199:108483, 2026

  14. [14]

    Communication-efficient learning of deep networks from decentralized data

    Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Aguera y Arcas. Communication-efficient learning of deep networks from decentralized data. InArtificial intelligence and statistics, pages 1273–1282. PMLR, 2017

  15. [15]

    Navigating data heterogeneity in federated learning: a semi-supervised federated object detection

    Taehyeon Kim, Eric Lin, Junu Lee, Christian Lau, and Vaikkunth Mugunthan. Navigating data heterogeneity in federated learning: a semi-supervised federated object detection. InProc. of the 37th International Conference on Neural Information Processing Systems, 2023

  16. [16]

    Soft-consensual federated learning for data heterogeneity via multiple paths

    Sheng Huang, Lele Fu, Fanghua Ye, Tianchi Liao, Bowen Deng, Chuanfu Zhang, and Chuan Chen. Soft-consensual federated learning for data heterogeneity via multiple paths. InThe Thirty-ninth Annual Conf,on Neural Information Processing Systems, 2026

  17. [17]

    One-pass distri- bution sketch for measuring data heterogeneity in federated learning

    Zichang Liu, Zhaozhuo Xu, Benjamin Coleman, and Anshumali Shrivastava. One-pass distri- bution sketch for measuring data heterogeneity in federated learning. InAdvances in Neural Information Processing Systems, volume 36, 2023

  18. [18]

    A review of federated learning methods in heterogeneous scenarios.IEEE Transactions on Consumer Electronics, 70(3):5983–5999, 2024

    Jiaming Pei, Wenxuan Liu, Jinhai Li, Lukun Wang, and Chao Liu. A review of federated learning methods in heterogeneous scenarios.IEEE Transactions on Consumer Electronics, 70(3):5983–5999, 2024. 10

  19. [19]

    Efficiently modeling long sequences with structured state spaces.ICLR, 2022

    Albert Gu, Karan Goel, and Christopher Ré. Efficiently modeling long sequences with structured state spaces.ICLR, 2022

  20. [20]

    On the parameterization and initialization of diagonal state space models.Advances in Neural Information Processing Systems (NeurIPS), 35:35971–35983, 2022

    Albert Gu, Karan Goel, Ankit Gupta, and Christopher Ré. On the parameterization and initialization of diagonal state space models.Advances in Neural Information Processing Systems (NeurIPS), 35:35971–35983, 2022

  21. [21]

    Smith, Andrew Warrington, and Scott Linderman

    Jimmy T.H. Smith, Andrew Warrington, and Scott Linderman. Simplified state space layers for sequence modeling. InThe Eleventh Inter. Conf. on Learning Representations (ICLR), 2023

  22. [22]

    Hippo: Recurrent memory with optimal polynomial projections.Advances in neural information processing systems, 33:1474–1487, 2020

    Albert Gu, Tri Dao, Stefano Ermon, Atri Rudra, and Christopher Ré. Hippo: Recurrent memory with optimal polynomial projections.Advances in neural information processing systems, 33:1474–1487, 2020

  23. [23]

    Scalable event-by-event processing of neuromorphic sensory signals with deep state-space models

    Mark Schöne, Neeraj Mohan Sushma, Jingyue Zhuge, Christian Mayr, Anand Subramoney, and David Kappel. Scalable event-by-event processing of neuromorphic sensory signals with deep state-space models. In2024 International Conference on Neuromorphic Systems (ICONS), pages 124–131, 2024

  24. [24]

    Xiaoyu Zhang, Mingtao Hu, Sen Lu, Soohyeon Kim, Eric Yeu-Jer Lee, Yuyang Liu, and Wei D. Lu. Compute-in-memory implementation of state space models for event sequence processing. Nature Communications, 17, 2026

  25. [25]

    A diagonal structured state space model on loihi 2 for efficient streaming sequence processing

    Svea Marie Meyer, Philipp Weidel, Philipp Plank, Leobardo Campos-Macias, Sumit Bam Shreshta, Philipp Stratmann, Jonathan Timcheck, and Mathis Richter. A diagonal structured state space model on loihi 2 for efficient streaming sequence processing. In2025 Neuro Inspired Computational Elements (NICE), pages 1–9, 2025

  26. [26]

    Learning long sequences in spiking neural networks

    Matei-Ioan Stan and Oliver Rhodes. Learning long sequences in spiking neural networks. Scientific Reports, 14(1):21957, 2024

  27. [27]

    State-space model inspired multiple- input multiple-output spiking neurons

    Sanja Karilanova, Subhrakanti Dey, and Ayça Özçelikkale. State-space model inspired multiple- input multiple-output spiking neurons. In2025 Neuro Inspired Computational Elements (NICE), pages 1–9, 2025

  28. [28]

    Federated learning with spiking neural networks.IEEE Transactions on Signal Processing, 69:6183–6194, 2021

    Yeshwanth Venkatesha, Youngeun Kim, Leandros Tassiulas, and Priyadarshini Panda. Federated learning with spiking neural networks.IEEE Transactions on Signal Processing, 69:6183–6194, 2021

  29. [29]

    Efficient asynchronous federated neuromorphic learning of spiking neural networks.Neurocomputing, 557:126686, 2023

    Yuan Wang, Shukai Duan, and Feng Chen. Efficient asynchronous federated neuromorphic learning of spiking neural networks.Neurocomputing, 557:126686, 2023

  30. [30]

    Federal snn distil- lation: A low-communication-cost federated learning framework for spiking neural networks

    Zhetong Liu, Qiugang Zhan, Xiurui Xie, Bingchao Wang, and Guisong Liu. Federal snn distil- lation: A low-communication-cost federated learning framework for spiking neural networks. Journal of Physics: Conference Series, 2216(1):012078, mar 2022

  31. [31]

    Comparing snns and rnns on neuromorphic vision datasets: Similarities and differences.Neural Networks, 132:108–120, 2020

    Weihua He, YuJie Wu, Lei Deng, Guoqi Li, Haoyu Wang, Yang Tian, Wei Ding, Wenhui Wang, and Yuan Xie. Comparing snns and rnns on neuromorphic vision datasets: Similarities and differences.Neural Networks, 132:108–120, 2020

  32. [32]

    A surrogate gradient spiking baseline for speech command recognition.Frontiers in Neuroscience, 16, 2022

    Alexandre Bittar and Philip N Garner. A surrogate gradient spiking baseline for speech command recognition.Frontiers in Neuroscience, 16, 2022

  33. [33]

    Silif: Structured state space model dynamics and parametrization for spiking neural networks, 2026

    Maxime Fabre, Lyubov Dudchenko, and Emre Neftci. Silif: Structured state space model dynamics and parametrization for spiking neural networks, 2026

  34. [34]

    Prentice Hall PTR, 1987

    Lennart Ljung.System Identification: Theory for the user. Prentice Hall PTR, 1987

  35. [35]

    The heidelberg spiking data sets for the systematic evaluation of spiking neural networks.IEEE Trans

    Benjamin Cramer, Yannik Stradmann, Johannes Schemmel, and Friedemann Zenke. The heidelberg spiking data sets for the systematic evaluation of spiking neural networks.IEEE Trans. on Neural Networks and Learning Systems, 33(7):2744–2757, 2022. 11

  36. [36]

    A low power, fully event-based gesture recognition system

    Arnon Amir, Brian Taba, David Berg, Timothy Melano, Jeffrey McKinstry, Carmelo Di Nolfo, Tapan Nayak, Alexander Andreopoulos, Guillaume Garreau, Marcela Mendoza, Jeff Kusnitz, Michael Debole, Steve Esser, Tobi Delbruck, Myron Flickner, and Dharmendra Modha. A low power, fully event-based gesture recognition system. InProceedings of the IEEE Conference on ...

  37. [37]

    Decoupled weight decay regularization

    Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. InICLR, 2019

  38. [38]

    Batch normalization: accelerating deep network training by reducing internal covariate shift

    Sergey Ioffe and Christian Szegedy. Batch normalization: accelerating deep network training by reducing internal covariate shift. InICML, volume 37, 2015

  39. [39]

    Learning delays in spiking neural networks using dilated convolutions with learnable spacings, 2023

    Ilyass Hammouamri, Ismail Khalfaoui-Hassani, and Timothée Masquelier. Learning delays in spiking neural networks using dilated convolutions with learnable spacings, 2023

  40. [40]

    Spike- driven transformer.Advances in neural information processing systems, 36:64043–64058, 2023

    Man Yao, Jiakui Hu, Zhaokun Zhou, Li Yuan, Yonghong Tian, Bo Xu, and Guoqi Li. Spike- driven transformer.Advances in neural information processing systems, 36:64043–64058, 2023

  41. [41]

    Tonic: event-based datasets and transformations., 2021

    Gregor Lenz, Kenneth Chaney, Sumit Bam Shrestha, Omar Oubari, Serge Picaud, and Guido Zarrella. Tonic: event-based datasets and transformations., 2021. https://tonic.readthedocs.io

  42. [42]

    Co-learning synaptic delays, weights and adaptation in spiking neural networks.Frontiers in Neuroscience, 18, 2024

    Lucas Deckers, Laurens Van Damme, Werner Van Leekwijck, Ing Jyh Tsang, and Steven Latré. Co-learning synaptic delays, weights and adaptation in spiking neural networks.Frontiers in Neuroscience, 18, 2024

  43. [43]

    Horowitz

    M. Horowitz. 1.1 computing’s energy problem (and what we can do about it). InIEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), pages 10–14, 2014

  44. [44]

    Sed- former: Event-synchronous spiking transformers for irregular telemetry time series forecasting, 2026

    Ziyu Zhou, Yuchen Fang, Weilin Ruan, Shiyu Wang, James Kwok, and Yuxuan Liang. Sed- former: Event-synchronous spiking transformers for irregular telemetry time series forecasting, 2026

  45. [45]

    An analytical estimation of spiking neural networks energy efficiency

    Edgar Lemaire, Loïc Cordone, Andrea Castagnetti, Pierre-Emmanuel Novac, Jonathan Courtois, and Benoît Miramond. An analytical estimation of spiking neural networks energy efficiency. InNeural Information Processing: 29th International Conference, ICONIP 2022, Virtual Event, November 22–26, 2022, Proceedings, Part I, page 574–587, Berlin, Heidelberg, 2022....

  46. [46]

    Dithered backprop: A sparse and quantized backpropagation algorithm for more efficient deep neural network training

    Simon Wiedemann, Temesgen Mehari, Kevin Kepp, and Wojciech Samek. Dithered backprop: A sparse and quantized backpropagation algorithm for more efficient deep neural network training. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pages 720–721, 2020

  47. [47]

    Efficient large-scale language model training on gpu clusters using megatron-lm

    Deepak Narayanan, Mohammad Shoeybi, Jared Casper, Patrick LeGresley, Mostofa Patwary, Vijay Korthikanti, Dmitri Vainbrand, Prethvi Kashinkunti, Julie Bernauer, Bryan Catanzaro, et al. Efficient large-scale language model training on gpu clusters using megatron-lm. In Proceedings of the international conference for high performance computing, networking, s...