NeuDW-CIM: a 65-nm 0.8-pJ/Sop Reconfigurable Neuromorphic Compute-in-Memory Macro with Nonlinear Dendrites and K-Winners

An Guo; Arindam Basu; Biyan Zhou; Junyi Yang; Peng Zhou; Shuai Dong; Xin Si; Yahan Yang; Ye Ke; Zhengnan Fu

arxiv: 2606.08947 · v1 · pith:QZI67LRXnew · submitted 2026-06-08 · 💻 cs.AR

Junyi Yang, Yahan Yang, Shuai Dong, Biyan Zhou, Ye Ke, Zhengnan Fu, Xin Si, An Guo, Peng Zhou, Arindam Basu

Pith reviewed 2026-06-27 14:58 UTC · model grok-4.3

classification 💻 cs.AR

keywords neuromorphic computing · compute-in-memory · spiking neural networks · nonlinear dendrites · K-winner selection · energy efficiency · 65nm CMOS · event-based vision

The pith

A 65nm compute-in-memory macro uses nonlinear dendrite emulation and K-winner early stopping to reach 0.8 pJ per synaptic operation in spiking networks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents NeuDW-CIM, a neuromorphic CIM macro in 65 nm CMOS that adds two reconfigurable modes to spiking neural network hardware. Nonlinear dendrite (NLD) mode uses a custom in-memory ADC (IMA) to mimic biological dendritic computation and delivers measured accuracies of 97.2 percent on N-MNIST and 95.5 percent on DVS Gesture. Top-K winner (KWN) mode applies early stopping to cut IMA conversion latency by 30 percent and digital LIF neuron latency by a factor of 10. With the sparse updates that KWN mode enables, the macro reaches a measured energy efficiency of 0.8 pJ per SOP, a 1.6 times gain over prior work. The design rests on a twin 9T bit-cell for ternary inputs and weights. A reader would care because the results show how targeted hardware modes can raise both task accuracy and efficiency for event-driven sensing without changing the underlying SNN model.

Core claim

NeuDW-CIM is a 65-nm CMOS neuromorphic compute-in-memory macro that introduces a twin 9T bit-cell for ternary inputs and weights together with a reconfigurable non-linear in-memory ADC; the macro operates in nonlinear dendrite mode to emulate biological dendritic functions or in top-K winner mode with early stopping, producing measured accuracies of 97.2 percent on N-MNIST and 95.5 percent on DVS Gesture while reaching 0.8 pJ per SOP energy efficiency.
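As a rough illustration of what one ternary synaptic operation computes, the sketch below splits each ternary weight into two binary planes (one marking +1, one marking -1), loosely mirroring the idea of a twin cell. The encoding, the function names `split_ternary` and `ternary_mac`, and the convention that a SOP is counted whenever an input is nonzero are all assumptions made for illustration; the paper's actual bit-cell encoding is not described on this page.

```python
def split_ternary(weights):
    """Split ternary weights {-1, 0, +1} into two binary planes (assumed encoding)."""
    for w in weights:
        if w not in (-1, 0, 1):
            raise ValueError(f"weight {w} is not ternary")
    pos = [1 if w == 1 else 0 for w in weights]
    neg = [1 if w == -1 else 0 for w in weights]
    return pos, neg


def ternary_mac(inputs, weights):
    """Accumulate ternary inputs against ternary weights.

    Returns (accumulated sum, number of synaptic operations), where a
    synaptic operation is counted for every nonzero input (assumption).
    """
    pos, neg = split_ternary(weights)
    acc = 0
    sops = 0
    for x, p, n in zip(inputs, pos, neg):
        if x == 0:
            continue  # no input event on this row, so nothing to accumulate
        acc += x * (p - n)  # the two planes recombine into the ternary weight
        sops += 1
    return acc, sops


result = ternary_mac([1, 0, -1, 1], [1, -1, -1, 0])
```

In this toy case three of the four inputs are nonzero, so three operations are counted, and the two matching-sign products contribute +1 each.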

What carries the argument

Reconfigurable non-linear in-memory ADC that switches between dendritic nonlinearity emulation and early-stopping K-winner selection while the twin 9T bit-cell supplies ternary weights.
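One generic way an ADC can apply a nonlinearity while it converts is to space its reference levels non-uniformly, so that the output code tracks f(x) instead of x. The sketch below shows that idea only; the function f(x) = x**2, the names `make_references` and `adc_code`, and the 4-code resolution are arbitrary stand-ins, not the dendritic function or circuit used in the paper.

```python
import bisect
import math


def make_references(inverse_nonlinearity, n_codes):
    """Reference levels t_k with f(t_k) = k / n_codes, for k = 1 .. n_codes - 1."""
    return [inverse_nonlinearity(k / n_codes) for k in range(1, n_codes)]


def adc_code(x, references):
    """Output code = number of reference levels at or below the input."""
    return bisect.bisect_right(references, x)


# Uniform references give an ordinary linear quantizer.
linear_refs = make_references(lambda y: y, 4)
# References placed at sqrt(k / 4) make the code follow f(x) = x**2.
square_refs = make_references(math.sqrt, 4)

linear_code = adc_code(0.6, linear_refs)
square_code = adc_code(0.6, square_refs)
```

With the same input of 0.6, the linear quantizer reports code 2 while the squared one reports code 1, because 0.6**2 = 0.36 falls in the second quarter of the output range. Reconfiguring the references is what changes the transfer curve in this sketch.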

If this is right

Nonlinear dendrite mode produces 97.2 percent accuracy on N-MNIST and 95.5 percent accuracy on DVS Gesture.
K-winner mode reduces IMA conversion latency by 30 percent and digital LIF neuron latency by a factor of 10.
Sparse updates in K-winner mode contribute directly to the measured 0.8 pJ per SOP efficiency.
The same macro supports both modes through simple reconfiguration of the in-memory ADC.
The design maintains ternary weight precision via the twin 9T bit-cell throughout both operating modes.
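To make the early-stopping idea concrete, here is a minimal sketch that assumes a descending-ramp conversion: the ramp starts at the highest level and steps down, columns are flagged as they are reached, and the conversion halts once K columns have been found. The ramp direction, the integer levels standing in for analog column voltages, and the name `kwinner_early_stop` are assumptions; the paper's KWN selecting circuit is not detailed on this page, and the sketch does not reproduce the reported 30 percent figure.

```python
def kwinner_early_stop(column_levels, k, n_levels):
    """Find the k largest columns with a descending ramp, stopping early.

    column_levels: integer level per column in [0, n_levels - 1]
    Returns (winner column indices in order of discovery, ramp steps used).
    """
    winners = []
    for step in range(n_levels):
        ramp = n_levels - 1 - step  # ramp descends one level per step
        for col, level in enumerate(column_levels):
            if level == ramp:  # the ramp has just reached this column
                winners.append(col)
        if len(winners) >= k:
            # Ties on the final step can yield extra columns; keep the first k.
            return winners[:k], step + 1
    return winners, n_levels


levels = [3, 12, 7, 15, 9, 1]
top2, steps_used = kwinner_early_stop(levels, k=2, n_levels=16)
```

Here the two winners are found after 4 of the 16 possible ramp steps. How much time is saved in general depends on where the K-th largest value sits, which is why the saving is data dependent in this model.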

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The reconfigurability could allow a single chip to switch between high-accuracy and low-power regimes for different edge-sensing tasks without retraining.
Early stopping in the winner mode may scale favorably to deeper networks where only a small fraction of neurons fire per timestep.
The dendritic nonlinearity emulation might be combined with existing SNN training algorithms to close the remaining accuracy gap to non-spiking networks on the same datasets.
Because the efficiency gain comes from both analog and digital latency reductions, the approach could be ported to other process nodes that also support ternary bit-cells.
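The scaling argument in the second point rests on how much digital work is skipped when only winners are updated. The sketch below is a generic integer LIF update restricted to a winner list; the threshold, the shift-based leak, the reset-to-zero rule, and the choice to leave non-winners untouched are illustrative assumptions, not the paper's LIF module.

```python
def lif_step(v, current, threshold=16, leak_shift=2):
    """One leaky integrate-and-fire update; returns (new membrane, spike)."""
    v = v - (v >> leak_shift) + current  # leak a fraction, then integrate
    if v >= threshold:
        return 0, 1  # fire and reset to zero
    return v, 0


def sparse_lif_update(membranes, currents, winners):
    """Update only the winner neurons; returns (spikes, number of updates)."""
    spikes = [0] * len(membranes)
    for i in winners:
        membranes[i], spikes[i] = lif_step(membranes[i], currents[i])
    return spikes, len(winners)


membranes = [8, 12, 0, 4]
currents = [5, 8, 9, 1]
spikes, updates = sparse_lif_update(membranes, currents, winners=[1, 2])
```

Only two of the four neurons are touched here, so the update count scales with K instead of with the layer width. Whether that carries over to deeper networks is exactly the open question the bullet raises.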

Load-bearing premise

The reported accuracy and energy numbers were measured on silicon that behaves like typical fabricated chips without unaccounted test-setup artifacts or process variation.

What would settle it

Re-measuring energy per SOP and classification accuracy on a second set of fabricated dies, or at a different supply voltage and temperature, would show whether the 0.8 pJ/SOP efficiency and the 97.2 percent and 95.5 percent accuracy figures hold beyond the single reported test condition.

Figures

Figures reproduced from arXiv: 2606.08947 by An Guo, Arindam Basu, Biyan Zhou, Junyi Yang, Peng Zhou, Shuai Dong, Xin Si, Yahan Yang, Ye Ke, Zhengnan Fu.

**Figure 2.** Overall architecture of the proposed macro and two modes.

**Figure 4.** (a) Circuit controller and (b) timing of the KWN selecting module.

**Figure 7.** Measurement results of NL-IMA for (a) NLQ and (b) NL activation.

**Figure 5.** (a) Flow chart and timing of the LIF module with noise. (b) Accuracy

**Figure 6.** (a) NL-IMA timing with calibration. (b) NL-IMA for NLQ. (c)

**Figure 9.** (a) Energy consumption breakdown for NLD and KWN modes. (b)

Original abstract

This work presents NeuDW-CIM, a highly efficient neuromorphic Compute-in-Memory (CIM) macro for Spiking Neural Networks (SNNs) implemented in 65 nm CMOS. The design introduces a custom twin 9T bit-cell for ternary inputs/weights and a reconfigurable non-linear In-Memory ADC (IMA). The macro supports two specialized modes: 1) Nonlinear Dendrite (NLD) mode, which utilizes reconfigurable IMA to emulate biological dendritic functions, achieving measured accuracies of 97.2% on N-MNIST and 95.5% on DVS Gesture; and 2) Top-K Winner (KWN) mode, featuring an early-stopping mechanism that reduces IMA conversion latency by 30% and digital LIF latency by 10x. Benefiting from the sparse update in KWN mode, NeuDW-CIM achieves a measured energy efficiency (EE) of 0.8 pJ/SOP (1.6x improvement).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

NeuDW-CIM is a real 65 nm silicon prototype with a reconfigurable IMA for nonlinear dendrites and K-winner mode, but the measurement protocol details are missing from the abstract.

The main thing to know is that this paper reports measured results from a fabricated 65 nm CIM macro for SNNs. It uses a twin 9T ternary bit-cell and a reconfigurable non-linear IMA that runs in two modes: nonlinear dendrite emulation and top-K winner early stopping. The KWN mode cuts IMA conversion latency by 30% and digital LIF latency by 10x, and the chip hits 0.8 pJ/SOP with 97.2% on N-MNIST and 95.5% on DVS Gesture.

What is actually new is the combination of those two modes inside one macro plus the specific bit-cell and IMA design that supports both. The energy improvement over prior work is stated as 1.6x. For a hardware paper, shipping silicon with concrete numbers on event datasets is the right level of evidence.

The soft spot is the test conditions. The abstract gives no information on whether accuracy runs were fully on-chip end-to-end, how many dies were measured for variation, or exactly which blocks were included in the power number. Any of those gaps would make the quoted figures less representative. The concern stands until the full paper documents the protocol.

This is for circuit-level neuromorphic designers working on edge SNN hardware. A reader who needs examples of reconfigurable CIM or K-winner mechanisms will get usable design details. It is coherent enough to deserve peer review; the prototype exists and the claims can be checked against the silicon data.

Referee Report

1 major / 1 minor

Summary. The manuscript presents NeuDW-CIM, a 65-nm CMOS neuromorphic compute-in-memory macro for spiking neural networks. It introduces a twin 9T bit-cell supporting ternary inputs/weights and a reconfigurable nonlinear in-memory ADC (IMA). The design operates in two modes: Nonlinear Dendrite (NLD) mode that emulates biological dendritic functions, and Top-K Winner (KWN) mode with early-stopping to reduce IMA conversion latency by 30% and digital LIF latency by 10x. Measured results on fabricated silicon report 97.2% accuracy on N-MNIST and 95.5% accuracy on DVS Gesture in NLD mode, together with 0.8 pJ/SOP energy efficiency in KWN mode (claimed 1.6x improvement).

Significance. If the silicon measurements are shown to be representative of typical fabricated performance, the work demonstrates a reconfigurable CIM macro that integrates nonlinear dendritic emulation and sparsity-driven early stopping within a single 65-nm design. The dual-mode operation and concrete energy figure constitute a tangible contribution to energy-efficient neuromorphic hardware for edge SNN inference.

major comments (1)

[Abstract and experimental results section] The headline claims rest entirely on measured accuracies (97.2% N-MNIST, 95.5% DVS Gesture) and 0.8 pJ/SOP efficiency. The manuscript must supply a complete measurement protocol that (a) confirms end-to-end on-chip inference without external post-processing, (b) reports the number of dies and die-to-die variation statistics, and (c) itemizes all power components (array, IMA, digital LIF, leakage) under the exact conditions used for the accuracy runs. Absence of these details leaves the quoted figures unverifiable as representative silicon performance.
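Item (b) amounts to reporting a mean and a sample standard deviation across dies for each metric. A minimal sketch of that summary follows, using Python's standard `statistics` module. The per-die values are made-up placeholders chosen only to exercise the function; they are not measurements from the paper, and the name `summarize_dies` is hypothetical.

```python
import statistics


def summarize_dies(per_die_values):
    """Return (number of dies, mean, sample standard deviation)."""
    if len(per_die_values) < 2:
        raise ValueError("need at least two dies to estimate variation")
    return (
        len(per_die_values),
        statistics.mean(per_die_values),
        statistics.stdev(per_die_values),  # sample (n - 1) standard deviation
    )


# Placeholder energy-per-SOP readings in pJ, NOT data from the paper.
placeholder_pj_per_sop = [0.8, 0.9, 0.7]
n_dies, mean_pj, std_pj = summarize_dies(placeholder_pj_per_sop)
```

Reporting the die count alongside the spread matters because a standard deviation from two or three dies says little about process variation.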

minor comments (1)

[Abstract] The statement of '1.6x improvement' does not identify the reference design or prior result against which the factor is computed.
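For orientation, the arithmetic below shows what baseline the factor would imply if "1.6x improvement" means the ratio of a prior design's energy per SOP to this design's. That definition is an assumption, since the abstract does not state one, and the function name `implied_baseline_pj` is hypothetical.

```python
def implied_baseline_pj(energy_pj_per_sop, improvement_factor):
    """Baseline energy per SOP implied by an energy-ratio improvement factor."""
    return energy_pj_per_sop * improvement_factor


# 0.8 pJ/SOP and 1.6x are the figures quoted in the abstract.
baseline = implied_baseline_pj(0.8, 1.6)
```

Under that reading the unnamed comparison point would sit near 1.28 pJ/SOP; a different definition of the factor would give a different baseline, which is why the reference needs to be named.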

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback. We address the major comment on the measurement protocol below and will revise the manuscript to strengthen verifiability of the reported results.

Referee: [Abstract and experimental results section] The headline claims rest entirely on measured accuracies (97.2% N-MNIST, 95.5% DVS Gesture) and 0.8 pJ/SOP efficiency. The manuscript must supply a complete measurement protocol that (a) confirms end-to-end on-chip inference without external post-processing, (b) reports the number of dies and die-to-die variation statistics, and (c) itemizes all power components (array, IMA, digital LIF, leakage) under the exact conditions used for the accuracy runs. Absence of these details leaves the quoted figures unverifiable as representative silicon performance.

Authors: We agree that the current version lacks sufficient detail on the measurement protocol, which is required to establish that the headline figures are representative of fabricated silicon performance. In the revised manuscript we will add a dedicated subsection 'Silicon Measurement Protocol' immediately preceding the accuracy results. This subsection will: (a) describe the end-to-end on-chip inference flow, confirming that spike inputs are applied directly to the macro and classification outputs are read from the on-chip digital LIF without any external post-processing; (b) state the number of dies measured and report die-to-die variation (mean and standard deviation) for both accuracy and energy-efficiency metrics; and (c) provide a table that itemizes power consumption of the array, IMA, digital LIF, and leakage components measured under the exact voltage, frequency, and activity conditions used for the N-MNIST and DVS Gesture accuracy runs. These additions will be placed in the experimental results section and referenced from the abstract.

Revision: yes
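The itemized table promised in (c) connects to the headline metric through a simple relation: energy per SOP equals total power divided by the rate of synaptic operations. The sketch below shows that bookkeeping. The component names follow the referee's list, but the power values and SOP rate are synthetic placeholders, not figures from the paper, and `energy_per_sop_pj` is a hypothetical helper.

```python
def energy_per_sop_pj(power_uw_by_block, sops_per_second):
    """Return (energy per SOP in pJ, fractional power share per block)."""
    total_uw = sum(power_uw_by_block.values())
    joules_per_sop = (total_uw * 1e-6) / sops_per_second  # W divided by SOP/s
    shares = {name: p / total_uw for name, p in power_uw_by_block.items()}
    return joules_per_sop * 1e12, shares


# Synthetic placeholder breakdown in microwatts, NOT measured data.
placeholder_power_uw = {"array": 40.0, "ima": 30.0, "digital_lif": 20.0, "leakage": 10.0}
pj_per_sop, shares = energy_per_sop_pj(placeholder_power_uw, sops_per_second=5e7)
```

Leaving any block out of the sum lowers the computed figure, which is why the referee asks for all four components under the same conditions as the accuracy runs.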

Circularity Check

0 steps flagged

No circularity: claims rest on direct silicon measurements, not derived quantities or self-referential equations

This is a hardware implementation and measurement paper. The central results (97.2% N-MNIST accuracy, 95.5% DVS Gesture accuracy, 0.8 pJ/SOP efficiency) are reported as measured values from fabricated 65 nm silicon under the two operating modes. No mathematical derivation chain, fitted parameters renamed as predictions, or self-citation load-bearing steps appear in the abstract or described content. The design choices (twin 9T cell, reconfigurable IMA, NLD and KWN modes) are presented as engineering decisions whose performance is validated by direct measurement rather than by reduction to prior equations within the paper. This matches the default expectation of a non-circular engineering report.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central claim rests on the assumption that the fabricated 65 nm silicon performs as measured; no free parameters, new axioms, or invented entities are introduced beyond standard CMOS circuit assumptions.
