QKFormer: Hierarchical Spiking Transformer using Q-K Attention

Chenlin Zhou; Han Zhang; Huihui Zhou; Liutao Yu; Liwei Huang; Li Yuan; Xiaopeng Fan; Yonghong Tian; Zhaokun Zhou; Zhengyu Ma

arxiv: 2403.16552 · v2 · pith:WXDS3T2Vnew · submitted 2024-03-25 · 💻 cs.NE · cs.AI · cs.CV


Pith reviewed 2026-05-24 03:42 UTC · model grok-4.3

classification 💻 cs.NE · cs.AI · cs.CV

keywords spiking neural networks · transformer · Q-K attention · ImageNet classification · hierarchical architecture · direct training · energy efficient models


The pith

QKFormer reaches 85.65 percent top-1 accuracy on ImageNet-1K using a hierarchical spiking transformer with Q-K attention.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents QKFormer, which combines spiking neural networks with transformer architectures using a new Q-K attention. This attention uses binary spike vectors to model token and channel importance with linear complexity. The model also uses a hierarchical structure for multi-scale features and a patch embedding module with a deformed shortcut. With these changes, it achieves 85.65 percent top-1 accuracy on ImageNet-1K using 64.96 million parameters, more than 10 percentage points above the previous best directly trained spiking model of similar size. A sympathetic reader would care because spiking networks promise lower energy use while now reaching high accuracy on large-scale image tasks.

Core claim

We introduce a spike-form Q-K attention mechanism tailored for SNNs that efficiently models the importance of token or channel dimensions through binary vectors with linear complexity. We incorporate the hierarchical structure into spiking transformers to obtain multi-scale spiking representation and design a versatile patch embedding module with a deformed shortcut. Together these form QKFormer, a hierarchical spiking transformer based on Q-K attention with direct training, which achieves 85.65 percent top-1 accuracy on ImageNet-1K with 64.96 million parameters, outperforming Spikformer by 10.84 percentage points and marking the first time directly trained SNNs exceed 85 percent on ImageNet-1K.

What carries the argument

The spike-form Q-K attention mechanism, which models importance of token or channel dimensions through binary vectors with linear complexity.
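
One way to picture this mechanism is the minimal NumPy sketch below. It is an illustration, not the authors' implementation: the names `heaviside_spike` and `qk_token_attention`, the fixed threshold, and the stateless threshold standing in for a spiking neuron are all assumptions. The sketch collapses a binary Q matrix into one binary importance value per token and uses that vector to mask a binary K matrix, so the work grows linearly with the number of tokens.

```python
import numpy as np


def heaviside_spike(x, threshold=1.0):
    """Stand-in for a spiking neuron: emit 1 where the input reaches the threshold."""
    return (x >= threshold).astype(np.uint8)


def qk_token_attention(q_spikes, k_spikes, threshold=1.0):
    """Mask K with a binary per-token importance vector derived from Q.

    q_spikes, k_spikes: binary arrays of shape (N, D), N tokens by D channels.
    Returns the (N, 1) binary attention vector and the masked (N, D) output.
    """
    # Count the spikes of each token across its D channels: O(N * D).
    token_activity = q_spikes.sum(axis=1, keepdims=True)
    # Threshold the counts into a binary token-importance vector.
    attn = heaviside_spike(token_activity, threshold)
    # Broadcast element-wise mask: O(N * D), and no N x N matrix is formed.
    return attn, attn * k_spikes


# Hypothetical toy input: 3 tokens, 4 channels.
q = np.array([[1, 0, 1, 0],
              [0, 0, 0, 0],
              [0, 1, 0, 0]], dtype=np.uint8)
k = np.array([[1, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 1, 0]], dtype=np.uint8)

attn, out = qk_token_attention(q, k, threshold=1.0)
```

In this toy input the second token has no Q spikes, so its row of K is zeroed out while the other rows pass through unchanged. Summing over `axis=0` instead would give a per-channel vector, which is one plausible reading of the "channel" variant mentioned in the text.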

If this is right

QKFormer outperforms existing state-of-the-art SNN models on various mainstream datasets.
The hierarchical structure provides multi-scale spiking representations that improve performance.
The deformed shortcut in the patch embedding module supports better performance in spiking transformers.
Direct training of SNNs can now exceed 85 percent top-1 accuracy on ImageNet-1K.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The linear-complexity Q-K attention could be tested in non-spiking transformers to check whether binary importance vectors reduce compute without accuracy loss.
Running QKFormer on actual neuromorphic chips would test whether the reported accuracy gains produce measurable energy savings compared with standard transformers.
The same hierarchical spiking design with Q-K attention could be applied to video or detection tasks to check whether multi-scale spiking features transfer.

Load-bearing premise

The large accuracy gains are caused by the Q-K attention, hierarchical design, and deformed shortcut rather than differences in training schedule, data augmentation, optimizer settings, or other experimental details.

What would settle it

Re-training the same QKFormer architecture without the Q-K attention or hierarchy and still reaching 85 percent or higher on ImageNet-1K would show the gains do not depend on these elements.

Figures

Figures reproduced from arXiv: 2403.16552 by Chenlin Zhou, Han Zhang, Huihui Zhou, Liutao Yu, Liwei Huang, Li Yuan, Xiaopeng Fan, Yonghong Tian, Zhaokun Zhou, Zhengyu Ma.

**Figure 2.** The overview of QKFormer, a hierarchical spiking transformer with Q-K attention. Overall Hierarchical Architecture. The overview of QKFormer is presented in Figure 2. The input form can be formulated as (T0 × H × W × n). In static RGB image datasets, T0 = 1 and n = 3. In temporal neuromorphic datasets, the input T0 = T, while n = 2. In our implementation, we use a patch size of 4 × 4 and thus the input featu…

**Figure 3.** The visualization and memory consumption of QKTA. (a) is the visualization of Q-K token…

**Figure 4.** (a) shows the variance and expectation of SSA, (b) shows the variance and expectation of…

**Figure 5.** (a) Spiking Patch Splitting (SPS) module in Spikformer. (b) Spiking Patch Embedding with…

**Figure 6.** Training loss, test loss, top-1 and top-5 test accuracy of QKFormer on ImageNet-1K. The…
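
The input layout described in the Figure 2 caption can be sketched as follows. This is an illustration under assumptions: the helper name, the 224 × 224 and 128 × 128 resolutions, the T = 16 example, and the treatment of the 4 × 4 patches as non-overlapping are hypothetical. Only the (T0, H, W, n) layout, the values of T0 and n, and the 4 × 4 patch size come from the caption.

```python
def input_shape_and_tokens(height, width, channels, timesteps=1, patch=4):
    """Return the (T0, H, W, n) input shape and the token count after patching."""
    if height % patch or width % patch:
        raise ValueError("height and width must be divisible by the patch size")
    shape = (timesteps, height, width, channels)
    # Non-overlapping patches (an assumption) give one token per patch.
    tokens = (height // patch) * (width // patch)
    return shape, tokens


# Static RGB image: T0 = 1 and n = 3 (the resolution is an assumed example).
static_shape, static_tokens = input_shape_and_tokens(224, 224, channels=3)

# Neuromorphic event data: T0 = T and n = 2 (T = 16 and 128 x 128 are assumed).
event_shape, event_tokens = input_shape_and_tokens(128, 128, channels=2, timesteps=16)
```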

read the original abstract

Spiking Transformers, which integrate Spiking Neural Networks (SNNs) with Transformer architectures, have attracted significant attention due to their potential for energy efficiency and high performance. However, existing models in this domain still suffer from suboptimal performance. We introduce several innovations to improve the performance: i) We propose a novel spike-form Q-K attention mechanism, tailored for SNNs, which efficiently models the importance of token or channel dimensions through binary vectors with linear complexity. ii) We incorporate the hierarchical structure, which significantly benefits the performance of both the brain and artificial neural networks, into spiking transformers to obtain multi-scale spiking representation. iii) We design a versatile and powerful patch embedding module with a deformed shortcut specifically for spiking transformers. Together, we develop QKFormer, a hierarchical spiking transformer based on Q-K attention with direct training. QKFormer shows significantly superior performance over existing state-of-the-art SNN models on various mainstream datasets. Notably, with comparable size to Spikformer (66.34 M, 74.81%), QKFormer (64.96 M) achieves a groundbreaking top-1 accuracy of 85.65% on ImageNet-1k, substantially outperforming Spikformer by 10.84%. To our best knowledge, this is the first time that directly training SNNs have exceeded 85% accuracy on ImageNet-1K. The code and models are publicly available at https://github.com/zhouchenlin2096/QKFormer

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The 85.65% ImageNet claim is the headline number, but it requires proof that training details match the Spikformer baseline exactly.


The paper's main contribution is a spike-form Q-K attention that operates on binary vectors with linear complexity, combined with a hierarchical layout and a deformed shortcut in the patch embedding. These are specific adaptations for spiking transformers, and the public code release lets others inspect the implementation directly. The reported lift to 85.65% top-1 on ImageNet-1K with a 65M model is the clearest empirical statement, and it exceeds the cited Spikformer result by a wide margin if the comparison holds.

Referee Report

2 major / 1 minor

Summary. The paper introduces QKFormer, a hierarchical spiking transformer that incorporates a novel spike-form Q-K attention mechanism (binary vectors with linear complexity), a hierarchical structure for multi-scale spiking representations, and a deformed shortcut patch embedding. It claims state-of-the-art results on multiple datasets, with the headline result being 85.65% top-1 accuracy on ImageNet-1K using 64.96M parameters, outperforming Spikformer (66.34M parameters, 74.81% accuracy) by 10.84 percentage points and marking the first time directly trained SNNs exceed 85% on this dataset. The code and models are released publicly.
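
The headline margin is an absolute difference in top-1 accuracy, which is worth separating from the relative gain. The short check below uses only the figures quoted in the summary; the variable names are illustrative.

```python
qkformer_top1 = 85.65      # percent, ImageNet-1K
spikformer_top1 = 74.81    # percent, ImageNet-1K
qkformer_params = 64.96    # millions
spikformer_params = 66.34  # millions

# Absolute gap in percentage points (the 10.84 figure in the text).
gap_pp = round(qkformer_top1 - spikformer_top1, 2)

# Relative improvement over the baseline accuracy, as a fraction.
relative_gain = (qkformer_top1 - spikformer_top1) / spikformer_top1

# Parameter count of QKFormer relative to Spikformer.
param_ratio = qkformer_params / spikformer_params
```

Under these numbers the relative gain is roughly 14.5 percent of the baseline accuracy, obtained with about 2 percent fewer parameters.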

Significance. If the accuracy gains hold under matched training conditions, the work would represent a notable advance for spiking transformers by showing that architectural changes can push directly trained SNNs into a new performance regime on ImageNet-1K. The public code release is a clear strength that enables direct verification and future extensions.

major comments (2)

[Abstract and experimental results section] The central claim attributes the 10.84 pp gain (85.65% vs. 74.81%) to the Q-K attention, hierarchy, and deformed shortcut, yet no table or subsection confirms that timestep count, surrogate-gradient function, data augmentation, optimizer schedule, and other hyperparameters are identical to the Spikformer baseline; without this isolation the attribution to the proposed mechanisms remains unverified.
[Section describing the Q-K attention mechanism] The statement that the mechanism 'efficiently models the importance of token or channel dimensions through binary vectors with linear complexity' requires an explicit derivation or complexity analysis showing how the binary spike-form vectors are produced and propagated while preserving the claimed linear scaling; this is load-bearing for both the performance and energy-efficiency assertions.

minor comments (1)

[Abstract] The abstract contains a minor grammatical issue: 'directly training SNNs have exceeded' should read 'directly trained SNNs have exceeded'.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and will revise the manuscript accordingly to strengthen the presentation of our results and methods.


Referee: [Abstract and experimental results section] The central claim attributes the 10.84 pp gain (85.65% vs. 74.81%) to the Q-K attention, hierarchy, and deformed shortcut, yet no table or subsection confirms that timestep count, surrogate-gradient function, data augmentation, optimizer schedule, and other hyperparameters are identical to the Spikformer baseline; without this isolation the attribution to the proposed mechanisms remains unverified.

Authors: We agree that explicit confirmation of matched training conditions is necessary for a fair comparison. In the revised manuscript we will add a dedicated table (and accompanying text in Section 4) listing all key hyperparameters for QKFormer alongside those reported for Spikformer (timesteps T=4, arctan surrogate gradient, identical data augmentation pipeline, AdamW optimizer with the same learning-rate schedule and weight decay, etc.). These settings were taken directly from the Spikformer paper and codebase to ensure the performance difference can be attributed to the architectural innovations.
Revision: yes

Referee: [Section describing the Q-K attention mechanism] The statement that the mechanism 'efficiently models the importance of token or channel dimensions through binary vectors with linear complexity' requires an explicit derivation or complexity analysis showing how the binary spike-form vectors are produced and propagated while preserving the claimed linear scaling; this is load-bearing for both the performance and energy-efficiency assertions.

Authors: We acknowledge that the current description would benefit from a formal derivation. In the revised manuscript we will expand the Q-K attention subsection with a step-by-step derivation: (1) generation of binary spike vectors Q_s and K_s via the spiking neuron, (2) the attention computation reducing to an element-wise product and summation that counts matching spikes, and (3) the resulting per-layer complexity of O(N) for sequence length N (versus O(N^2) for standard softmax attention). We will also include a small complexity table comparing FLOPs and spike operations.
Revision: yes
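
The scaling argument in the second response can be illustrated with rough operation counts. The sketch below assumes an (N, D) token-by-channel layout, counts one operation per element touched, and ignores projections and neuron updates. The function names and the N = 196, D = 512 example are hypothetical and are not figures from the paper.

```python
def qk_token_attention_ops(n_tokens, n_channels):
    """Rough count: one row-sum pass over Q plus one masking pass over K."""
    return 2 * n_tokens * n_channels


def pairwise_attention_ops(n_tokens, n_channels):
    """Rough count for attention that forms an N x N map: Q K^T, then map times V."""
    return 2 * n_tokens * n_tokens * n_channels


d = 512  # assumed channel width

small = qk_token_attention_ops(196, d)
large = qk_token_attention_ops(4 * 196, d)
linear_growth = large / small  # 4x the tokens costs 4x the operations

pair_small = pairwise_attention_ops(196, d)
pair_large = pairwise_attention_ops(4 * 196, d)
quadratic_growth = pair_large / pair_small  # 4x the tokens costs 16x the operations
```

The counts are deliberately crude; they only show how each cost responds when the token count grows while the channel width stays fixed.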

Circularity Check

0 steps flagged

No circularity: central claim is empirical accuracy on public benchmark


The paper presents architectural innovations (Q-K attention, hierarchical design, deformed shortcut) and reports an empirical top-1 accuracy of 85.65% on ImageNet-1k. No derivation chain exists that reduces predictions or uniqueness claims to fitted parameters, self-citations, or ansatzes by construction. The accuracy number is obtained from direct training experiments on a standard public dataset and does not equate to any input by the paper's own equations. Self-citations, if present, are not load-bearing for the performance claim.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 2 invented entities

The performance claim rests on the effectiveness of three newly introduced architectural components whose benefits are shown only through the reported experiments; the model contains numerous tunable design choices typical of deep networks.

free parameters (1)

Architecture and training hyperparameters
Choices such as number of layers, embedding dimensions, learning rate schedules, and augmentation parameters are selected to achieve the stated accuracy.

axioms (1)

domain assumption: Gradient-based optimization converges effectively when applied directly to the spiking network loss.
Direct training of SNNs assumes backpropagation through the non-differentiable spike function yields useful gradients.
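
The axiom above can be made concrete with a surrogate gradient. The sketch below is an illustration only: it uses a Heaviside step in the forward pass and the derivative of a scaled arctan curve in the backward pass, which is one common choice. The function names, the threshold of 1.0 and `alpha = 2.0` are assumptions, and the source does not confirm that this is the exact function used in the paper's experiments.

```python
import numpy as np


def spike_forward(v, threshold=1.0):
    """Forward pass: non-differentiable Heaviside step on the membrane potential."""
    return (v >= threshold).astype(np.float64)


def arctan_surrogate_grad(v, threshold=1.0, alpha=2.0):
    """Backward pass: derivative of g(x) = arctan(pi/2 * alpha * x) / pi + 1/2."""
    x = v - threshold
    return (alpha / 2.0) / (1.0 + (np.pi / 2.0 * alpha * x) ** 2)


v = np.array([0.2, 1.0, 1.8])     # example membrane potentials (assumed values)
spikes = spike_forward(v)         # hard 0/1 outputs
grads = arctan_surrogate_grad(v)  # smooth, largest at the threshold
```

The forward output stays binary, while the backward pass substitutes a smooth bump centred on the threshold, so potentials near the threshold receive the largest gradient and distant ones receive a small but nonzero signal.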

invented entities (2)

Spike-form Q-K attention (no independent evidence)
purpose: Models token or channel importance via binary vectors with linear complexity inside an SNN.
New mechanism introduced by the authors; effectiveness shown only in the paper's results.

Deformed shortcut patch embedding (no independent evidence)
purpose: Improves input representation specifically for spiking transformers.
Novel design element proposed for this architecture.


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Temporal-Aware Spiking Transformer Hashing Based on 3D-DWT
cs.CV · 2025-01 · unverdicted · novelty 7.0

Spikinghash combines 3D-DWT Spiking WaveMixer, Spiking Self-Attention, and a dynamic soft similarity loss to produce energy-efficient hash codes for DVS data retrieval.

Image Classification via Random Dilated Convolution with Multi-Branch Feature Extraction and Context Excitation
cs.CV · 2026-04 · unverdicted · novelty 3.0

RDCNet reports state-of-the-art accuracy on CIFAR-10, CIFAR-100, SVHN, Imagenette, and Imagewoof by combining random dilated convolutions with multi-branch and attention modules.

Reference graph

Works this paper leans on

55 extracted references · 55 canonical work pages · cited by 2 Pith papers · 2 internal anchors

[1]

Networks of spiking neurons: the third generation of neural network models

Wolfgang Maass. Networks of spiking neurons: the third generation of neural network models. Neural networks, 10(9):1659–1671, 1997

work page 1997
[2]

Towards spike-based machine intelligence with neuromorphic computing

Kaushik Roy, Akhilesh Jaiswal, and Priyadarshini Panda. Towards spike-based machine intelligence with neuromorphic computing. Nature, 575(7784):607–617, 2019

work page 2019
[3]

Attention is all you need

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Proceedings of the Interna- tional Conference on Neural Information Processing Systems (NeurIPS), volume 30, 2017. 10

work page 2017
[4]

An image is worth 16x16 words: Transformers for image recognition at scale

Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et al. An image is worth 16x16 words: Transformers for image recognition at scale. In International Conference on Learning Representa- tions (ICLR), 2020

work page 2020
[5]

Tokens-to-token vit: Training vision transformers from scratch on imagenet

Li Yuan, Yunpeng Chen, Tao Wang, Weihao Yu, Yujun Shi, Zi-Hang Jiang, Francis EH Tay, Jiashi Feng, and Shuicheng Yan. Tokens-to-token vit: Training vision transformers from scratch on imagenet. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 558–567, 2021

work page 2021
[6]

End-to-end object detection with transformers

Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, and Sergey Zagoruyko. End-to-end object detection with transformers. In Proceedings of the European Conference on Computer Vision (ECCV), pages 213–229. Springer, 2020

work page 2020
[7]

Deformable DETR: Deformable Transformers for End-to-End Object Detection

Xizhou Zhu, Weijie Su, Lewei Lu, Bin Li, Xiaogang Wang, and Jifeng Dai. Deformable detr: Deformable transformers for end-to-end object detection. arXiv preprint arXiv:2010.04159, 2020

work page internal anchor Pith review Pith/arXiv arXiv 2010
[8]

Swin transformer: Hierarchical vision transformer using shifted windows

Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 10012–10022, 2021

work page 2021
[9]

Pyramid vision transformer: A versatile backbone for dense prediction without convolutions

Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, and Ling Shao. Pyramid vision transformer: A versatile backbone for dense prediction without convolutions. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 568–578, 2021

work page 2021
[10]

V olo: Vision outlooker for visual recognition

Li Yuan, Qibin Hou, Zihang Jiang, Jiashi Feng, and Shuicheng Yan. V olo: Vision outlooker for visual recognition. arXiv preprint arXiv:2106.13112, 2021

work page arXiv 2021
[11]

Spikformer: When spiking neural network meets transformer

Zhaokun Zhou, Yuesheng Zhu, Chao He, Yaowei Wang, Shuicheng YAN, Yonghong Tian, and Li Yuan. Spikformer: When spiking neural network meets transformer. In The Eleventh International Conference on Learning Representations, 2023

work page 2023
[12]

Spikingformer: Spike-driven residual learning for transformer-based spiking neural network, 2023

Chenlin Zhou, Liutao Yu, Zhaokun Zhou, Han Zhang, Zhengyu Ma, Huihui Zhou, and Yonghong Tian. Spikingformer: Spike-driven residual learning for transformer-based spiking neural network, 2023

work page 2023
[13]

Spike- driven transformer, 2023

Man Yao, Jiakui Hu, Zhaokun Zhou, Li Yuan, Yonghong Tian, Bo Xu, and Guoqi Li. Spike- driven transformer, 2023

work page 2023
[14]

Enhancing the performance of transformer-based spiking neural networks by improved downsampling with precise gradient backpropagation, 2023

Chenlin Zhou, Han Zhang, Zhaokun Zhou, Liutao Yu, Zhengyu Ma, Huihui Zhou, Xiaopeng Fan, and Yonghong Tian. Enhancing the performance of transformer-based spiking neural networks by improved downsampling with precise gradient backpropagation, 2023

work page 2023
[15]

Spatial- temporal self-attention for asynchronous spiking neural networks

Yuchen Wang, Kexin Shi, Chengzhuo Lu, Yuguo Liu, Malu Zhang, and Hong Qu. Spatial- temporal self-attention for asynchronous spiking neural networks. In Edith Elkind, editor, Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pages 3085–3093. International Joint Conferences on Artificial Intelligence Organ...

work page 2023
[16]

Segregation, integration, and balance of large-scale resting brain networks configure different cognitive abilities

Rong Wang, Mianxin Liu, Xinhong Cheng, Ying Wu, Andrea Hildebrandt, and Changsong Zhou. Segregation, integration, and balance of large-scale resting brain networks configure different cognitive abilities. Proceedings of the National Academy of Sciences, 118(23):e2022288118, 2021

work page 2021
[17]

Spiking deep convolutional neural networks for energy-efficient object recognition

Yongqiang Cao, Yang Chen, and Deepak Khosla. Spiking deep convolutional neural networks for energy-efficient object recognition. International Journal of Computer Vision, 113(1):54–66, 2015

work page 2015
[18]

Spiking Deep Networks with LIF Neurons

Eric Hunsberger and Chris Eliasmith. Spiking deep networks with lif neurons. arXiv preprint arXiv:1510.08829, 2015

work page internal anchor Pith review Pith/arXiv arXiv 2015
[19]

Optimal ann- snn conversion for high-accuracy and ultra-low-latency spiking neural networks

Tong Bu, Wei Fang, Jianhao Ding, PengLin Dai, Zhaofei Yu, and Tiejun Huang. Optimal ann- snn conversion for high-accuracy and ultra-low-latency spiking neural networks. InInternational Conference on Learning Representations (ICLR), 2021

work page 2021
[20]

A free lunch from ann: Towards efficient, accurate spiking neural networks calibration

Yuhang Li, Shi-Wee Deng, Xin Dong, Ruihao Gong, and Shi Gu. A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. ArXiv, abs/2106.06984, 2021. 11

work page arXiv 2021
[21]

Rmp-snn: Residual membrane potential neuron for enabling deeper high-accuracy and low-latency spiking neural network

Bing Han, Gopalakrishnan Srinivasan, and Kaushik Roy. Rmp-snn: Residual membrane potential neuron for enabling deeper high-accuracy and low-latency spiking neural network. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 13558–13567, 2020

work page 2020
[22]

Optimal ann-snn conversion for high-accuracy and ultra-low-latency spiking neural networks

Tong Bu, Wei Fang, Jianhao Ding, PengLin Dai, Zhaofei Yu, and Tiejun Huang. Optimal ann-snn conversion for high-accuracy and ultra-low-latency spiking neural networks. arXiv preprint arXiv:2303.04347, 2023

work page arXiv 2023
[23]

Masked spiking transformer

Ziqing Wang, Yuetong Fang, Jiahang Cao, Qiang Zhang, Zhongrui Wang, and Renjing Xu. Masked spiking transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 1761–1771, 2023

work page 2023
[24]

Spatio-temporal backpropagation for training high-performance spiking neural networks

Yujie Wu, Lei Deng, Guoqi Li, Jun Zhu, and Luping Shi. Spatio-temporal backpropagation for training high-performance spiking neural networks. Frontiers in neuroscience, 12:331, 2018

work page 2018
[25]

Surrogate gradient learning in spiking neural networks: Bringing the power of gradient-based optimization to spiking neural networks

Emre O Neftci, Hesham Mostafa, and Friedemann Zenke. Surrogate gradient learning in spiking neural networks: Bringing the power of gradient-based optimization to spiking neural networks. IEEE Signal Processing Magazine, 36(6):51–63, 2019

work page 2019
[26]

Training feedback spiking neural networks by implicit differentiation on the equilibrium state

Mingqing Xiao, Qingyan Meng, Zongpeng Zhang, Yisen Wang, and Zhouchen Lin. Training feedback spiking neural networks by implicit differentiation on the equilibrium state. In Pro- ceedings of the International Conference on Neural Information Processing Systems (NeurIPS), volume 34, pages 14516–14528, 2021

work page 2021
[27]

Slayer: Spike layer error reassignment in time

Sumit B Shrestha and Garrick Orchard. Slayer: Spike layer error reassignment in time. In Pro- ceedings of the International Conference on Neural Information Processing Systems (NeurIPS), volume 31, 2018

work page 2018
[28]

Deep Residual Learning in Spiking Neural Networks

Wei Fang, Zhaofei Yu, Yanqi Chen, Tiejun Huang, Timothée Masquelier, and Yonghong Tian. Deep Residual Learning in Spiking Neural Networks. In Proceedings of the International Conference on Neural Information Processing Systems (NeurIPS), volume 34, pages 21056– 21069, 2021

work page 2021
[29]

Event-driven spiking convolutional neural network, June 16 2022

Ole Juri Richter, QIAO Ning, Qian Liu, and Sadique Ul Ameen Sheik. Event-driven spiking convolutional neural network, June 16 2022. US Patent App. 17/601,939

work page 2022
[30]

Towards artificial general intelligence with hybrid tianjic chip architecture

Jing Pei, Lei Deng, Sen Song, Mingguo Zhao, Youhui Zhang, Shuang Wu, Guanrui Wang, Zhe Zou, Zhenzhi Wu, Wei He, et al. Towards artificial general intelligence with hybrid tianjic chip architecture. Nature, 572(7767):106–111, 2019

work page 2019
[31]

Advancing residual learning towards powerful deep spiking neural networks

Yifan Hu, Yujie Wu, Lei Deng, and Guoqi Li. Advancing residual learning towards powerful deep spiking neural networks. arXiv preprint arXiv:2112.08954, 2021

work page arXiv 2021
[32]

Training data-efficient image transformers & distillation through attention

Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, and Hervé Jégou. Training data-efficient image transformers & distillation through attention. In International conference on machine learning, pages 10347–10357. PMLR, 2021

work page 2021
[33]

Masked autoencoders are scalable vision learners

Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, and Ross Girshick. Masked autoencoders are scalable vision learners. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 16000–16009, 2022

work page 2022
[34]

Randaugment: Practical automated data augmentation with a reduced search space

Ekin D Cubuk, Barret Zoph, Jonathon Shlens, and Quoc V Le. Randaugment: Practical automated data augmentation with a reduced search space. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops, pages 702–703, 2020

work page 2020
[35]

Random erasing data augmentation

Zhun Zhong, Liang Zheng, Guoliang Kang, Shaozi Li, and Yi Yang. Random erasing data augmentation. In Proceedings of the AAAI conference on artificial intelligence, volume 34, pages 13001–13008, 2020

work page 2020
[36]

Deep networks with stochastic depth

Gao Huang, Yu Sun, Zhuang Liu, Daniel Sedra, and Kilian Q Weinberger. Deep networks with stochastic depth. In Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part IV 14, pages 646–661. Springer, 2016

work page 2016
[37]

Direct training high-performance deep spiking neural networks: A review of theories and methods

Chenlin Zhou, Han Zhang, Liutao Yu, Yumin Ye, Zhaokun Zhou, Liwei Huang, Zhengyu Ma, Xiaopeng Fan, Huihui Zhou, and Yonghong Tian. Direct training high-performance deep spiking neural networks: A review of theories and methods. arXiv preprint arXiv:2405.04289, 2024. 12

work page arXiv 2024
[38]

Incorporating learnable membrane time constant to enhance learning of spiking neural networks

Wei Fang, Zhaofei Yu, Yanqi Chen, Timothée Masquelier, Tiejun Huang, and Yonghong Tian. Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 2661–2671, 2021

work page 2021
[39]

Imagenet: A large-scale hierarchical image database

Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale hierarchical image database. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 248–255, 2009

work page 2009
[40]

Learning multiple layers of features from tiny images

Alex Krizhevsky. Learning multiple layers of features from tiny images. 2009

work page 2009
[41]

Cifar10-dvs: an event- stream dataset for object classification

Hongmin Li, Hanchao Liu, Xiangyang Ji, Guoqi Li, and Luping Shi. Cifar10-dvs: an event- stream dataset for object classification. Frontiers in neuroscience, 11:309, 2017

work page 2017
[42]

A low power, fully event-based gesture recognition system

Arnon Amir, Brian Taba, David Berg, Timothy Melano, Jeffrey McKinstry, Carmelo Di Nolfo, Tapan Nayak, Alexander Andreopoulos, Guillaume Garreau, Marcela Mendoza, Jeff Kusnitz, Michael Debole, Steve Esser, Tobi Delbruck, Myron Flickner, and Dharmendra Modha. A low power, fully event-based gesture recognition system. In Proceedings of the IEEE/CVF Conferenc...

work page 2017
[43]

Going Deeper With Directly- Trained Larger Spiking Neural Networks

Hanle Zheng, Yujie Wu, Lei Deng, Yifan Hu, and Guoqi Li. Going Deeper With Directly- Trained Larger Spiking Neural Networks. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pages 11062–11070, 2021

work page 2021
[44]

Temporal Efficient Training of Spiking Neural Network via Gradient Re-weighting

Shikuang Deng, Yuhang Li, Shanghang Zhang, and Shi Gu. Temporal Efficient Training of Spiking Neural Network via Gradient Re-weighting. In International Conference on Learning Representations (ICLR), 2021

work page 2021
[45]

Acnet: Strengthening the kernel skeletons for powerful cnn via asymmetric convolution blocks

Xiaohan Ding, Yuchen Guo, Guiguang Ding, and Jungong Han. Acnet: Strengthening the kernel skeletons for powerful cnn via asymmetric convolution blocks. In Proceedings of the IEEE/CVF international conference on computer vision, pages 1911–1920, 2019

work page 1911
[46]

Repvgg: Making vgg-style convnets great again

Xiaohan Ding, Xiangyu Zhang, Ningning Ma, Jungong Han, Guiguang Ding, and Jian Sun. Repvgg: Making vgg-style convnets great again. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 13733–13742, 2021

work page 2021
[47]

Spiking deep residual networks

Yangfan Hu, Huajin Tang, and Gang Pan. Spiking deep residual networks. IEEE Transactions on Neural Networks and Learning Systems, 2021

work page 2021
[48]

Training full spike neural networks via auxiliary accumulation pathway

Guangyao Chen, Peixi Peng, Guoqi Li, and Yonghong Tian. Training full spike neural networks via auxiliary accumulation pathway. arXiv preprint arXiv:2301.11929, 2023

work page arXiv 2023
[49]

Hire-snn: Harnessing the inherent robustness of energy-efficient deep spiking neural networks by training with crafted input noise

Souvik Kundu, Massoud Pedram, and Peter A Beerel. Hire-snn: Harnessing the inherent robustness of energy-efficient deep spiking neural networks by training with crafted input noise. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 5209–5218, 2021

work page 2021
[50]

1.1 computing’s energy problem (and what we can do about it)

Mark Horowitz. 1.1 computing’s energy problem (and what we can do about it). In 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), pages 10–14. IEEE, 2014

work page 2014
[51]

Toward scalable, efficient, and accurate deep spiking neural networks with backward residual connections, stochastic softmax, and hybridization

Priyadarshini Panda, Sai Aparna Aketi, and Kaushik Roy. Toward scalable, efficient, and accurate deep spiking neural networks with backward residual connections, stochastic softmax, and hybridization. Frontiers in Neuroscience, 14:653, 2020

[52] Man Yao, Guangshe Zhao, Hengyu Zhang, Yifan Hu, Lei Deng, Yonghong Tian, Bo Xu, and Guoqi Li. Attention spiking neural networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023.

[53] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. PyTorch: An imperative style, high-performance deep learning library. In Proceedings of the International Conference on Neural Information Processing Systems (NeurIPS), volume 32, 2019.

[54] Wei Fang, Yanqi Chen, Jianhao Ding, Zhaofei Yu, Timothée Masquelier, Ding Chen, Liwei Huang, Huihui Zhou, Guoqi Li, and Yonghong Tian. SpikingJelly: An open-source machine learning infrastructure platform for spike-based intelligence. Science Advances, 9(40):eadi1480, 2023.

[55] Ross Wightman. PyTorch image models. https://github.com/rwightman/pytorch-image-models, 2019.

6 Appendix

6.1 Spiking Neuron Model

The spiking neuron is the fundamental unit of SNNs. We choose the Leaky Integrate-and-Fire (LIF) model as the spiking neuron in our work. The dynamics of a LIF neuron can be formulated as follows: H[t] = V[t − 1] + (1/τ)(X[t] −...
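The LIF formula in Section 6.1 is cut off in this copy, so the sketch below does not reproduce the paper's exact equations. It assumes the commonly used discrete LIF update with a hard reset, H[t] = V[t−1] + (1/τ)(X[t] − (V[t−1] − V_reset)), a spike when H[t] reaches the threshold, and a reset of the membrane potential after each spike. The function names `lif_step` and `run_lif` and the default values `tau=2.0`, `v_th=1.0`, `v_reset=0.0` are illustrative assumptions, not values taken from the paper.

```python
def lif_step(v_prev, x, tau=2.0, v_th=1.0, v_reset=0.0):
    """One discrete LIF update for a single neuron.

    Assumed form (the source equation is truncated):
      H[t] = V[t-1] + (1/tau) * (X[t] - (V[t-1] - v_reset))
      S[t] = 1 if H[t] >= v_th else 0
      V[t] = v_reset if S[t] == 1 else H[t]   (hard reset)
    """
    # Charge: leaky integration of the input current.
    h = v_prev + (x - (v_prev - v_reset)) / tau
    # Fire: emit a binary spike when the threshold is reached.
    spike = 1 if h >= v_th else 0
    # Reset: return to v_reset after a spike, otherwise keep the charge.
    v = v_reset if spike else h
    return spike, v


def run_lif(inputs, tau=2.0, v_th=1.0, v_reset=0.0):
    """Run the neuron over a sequence of input currents; return the spike train."""
    v = v_reset
    spikes = []
    for x in inputs:
        s, v = lif_step(v, x, tau, v_th, v_reset)
        spikes.append(s)
    return spikes


if __name__ == "__main__":
    print(run_lif([1.5, 1.5, 0.2, 1.5, 1.5]))
```

With these assumed defaults, a constant input of 1.5 needs two time steps to push the membrane potential past the threshold, which shows how the leak term delays firing and why the output is a sparse binary sequence.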
