ReRAM-aware Model Finetuning addressing I-V Non-linearity and Retention Errors

Arnab Raha; Ching-Yi Lin; Sahil Shah; Shamik Kundu

arXiv: 2606.17471 · v1 · pith:P6O7I4QKnew · submitted 2026-06-16 · 💻 cs.LG · cs.SY · eess.SY

Ching-Yi Lin, Shamik Kundu, Arnab Raha, Sahil Shah

Pith reviewed 2026-06-27 01:56 UTC · model grok-4.3

classification 💻 cs.LG · cs.SY · eess.SY

keywords ReRAM · In-Memory Computing · Hardware-aware training · Finetuning · I-V non-linearity · Retention errors · Deep Neural Networks

The pith

A finetuning method using range-shrunk sinh transformation and retention-error regularization lets large DNNs run on ReRAM hardware with near-base accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tries to show that existing large models can be adapted to ReRAM in-memory computing hardware through lightweight finetuning rather than full retraining from scratch. It applies a range-shrunk sinh transformation to handle I-V non-linearity and adds a regularization term that accounts for retention errors directly in the loss. The approach is tested on image classification models including ResNet18, DeiT-Tiny and MobileNetV3 plus a question-answering task on SQuAD v2. A sympathetic reader would care because it reduces the prohibitive compute cost of making modern models compatible with energy-efficient but imperfect ReRAM arrays. Results indicate accuracy stays close to the original model in most cases.

Core claim

By applying a range-shrunk sinh transformation to mitigate I-V non-linearity and incorporating retention errors into a regularization loss during finetuning, the method enables robust DNN deployment on ReRAM crossbar arrays with accuracy comparable to the base model on tasks like image classification and question answering.
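This summary does not give the exact form of the transformation. As a rough illustration of why shrinking the voltage range can help, the sketch below assumes a common device model, I = G·sinh(βV)/β, and treats "range shrinking" as scaling inputs into a smaller window before they reach the cell. The names `cell_current`, `max_relative_error`, `BETA`, and `shrink` are hypothetical, and the value of β is illustrative only.

```python
import math

BETA = 3.0  # assumed non-linearity factor in 1/V; not a value from the paper


def cell_current(g, v, beta=BETA):
    """Assumed cell model: I = G * sinh(beta * V) / beta, which is ohmic as V -> 0."""
    return g * math.sinh(beta * v) / beta


def max_relative_error(shrink, v_max=1.0, beta=BETA, steps=100):
    """Worst-case relative deviation from the ideal current G * V when inputs
    in (0, v_max] are scaled by `shrink`. The model is odd in V, so positive
    voltages are enough."""
    worst = 0.0
    for k in range(1, steps + 1):
        v = shrink * v_max * k / steps
        ideal = v  # G = 1, so the ideal ohmic current equals V
        worst = max(worst, abs(cell_current(1.0, v, beta) - ideal) / ideal)
    return worst


errors = {s: max_relative_error(s) for s in (1.0, 0.5, 0.25)}
for s, e in errors.items():
    print(f"shrink={s:.2f}  max relative error={e:.3f}")
```

Under this assumed model, a smaller window keeps the cell closer to its linear regime, at the cost of a smaller output signal. Whether the paper balances that trade-off in the same way is not stated here.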

What carries the argument

Range-shrunk sinh transformation for I-V non-linearity combined with retention-error regularization in the finetuning loss.
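How a retention term might enter the loss can be sketched as follows. This is an illustration under stated assumptions, not the paper's formulation: it assumes each stored weight drifts multiplicatively, w → w(1 + ε) with ε ~ N(0, σ²), in which case the expected squared output error of a linear layer has a closed form that resembles a data-weighted L2 penalty. The names `retention_penalty`, `total_loss`, `SIGMA`, and `LAMBDA` are hypothetical.

```python
import random

SIGMA = 0.05  # assumed relative drift std-dev; illustrative, not from the paper
LAMBDA = 1.0  # assumed regularization weight


def retention_penalty(w, x, sigma=SIGMA):
    """Closed form of E||(W * (1 + eps)) x - W x||^2 for i.i.d. eps ~ N(0, sigma^2):
    sigma^2 * sum_ij w_ij^2 * x_j^2 (w is out x in, x is one input vector)."""
    return sigma**2 * sum(wij**2 * xj**2 for row in w for wij, xj in zip(row, x))


def total_loss(task_loss, w, x, lam=LAMBDA):
    """Task loss plus the assumed retention regularizer."""
    return task_loss + lam * retention_penalty(w, x)


def monte_carlo_penalty(w, x, sigma=SIGMA, trials=20000, seed=0):
    """Empirical check: sample drifted weights and average the squared output error."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(trials):
        for row in w:
            dy = sum(wij * rng.gauss(0.0, sigma) * xj for wij, xj in zip(row, x))
            acc += dy * dy
    return acc / trials


rng = random.Random(1)
w = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(4)]
x = [rng.uniform(-1, 1) for _ in range(8)]
analytic = retention_penalty(w, x)
empirical = monte_carlo_penalty(w, x)
print(f"analytic={analytic:.6f}  empirical={empirical:.6f}")
```

The Monte Carlo estimate is there only to confirm the closed form under the assumed noise model. A different drift model (for example, additive or time-dependent drift) would give a different penalty.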

If this is right

ResNet18 and DeiT-Tiny maintain accuracy levels similar to their base models after finetuning.
MobileNetV3 variants on ImageNet experience less than 2 percent accuracy degradation.
Question answering on SQuAD v2 sees only a 1-point drop in F1 score.
The same finetuning procedure applies across both image classification and QA tasks with low overhead.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The regularization technique might transfer to other non-volatile memory types that share similar error profiles.
If the error models prove reliable, the method could shorten the time needed to port new models to ReRAM-based edge devices.
The approach suggests a general pattern for adding hardware-specific penalties to finetuning objectives rather than full retraining.

Load-bearing premise

The chosen range-shrunk sinh model of I-V non-linearity and the retention-error model used in the regularization loss accurately capture the actual hardware behavior that will be encountered at deployment time.
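One way to probe this premise, given measured I-V data, would be to fit the assumed model and inspect the residual. The sketch below does this on synthetic placeholder data generated from the same assumed form, I = G·sinh(βV)/β, plus Gaussian noise. It is not device data, and `model`, `fit_sinh`, and all constants are hypothetical.

```python
import math
import random


def model(g, beta, v):
    """Assumed cell model: I = G * sinh(beta * V) / beta."""
    return g * math.sinh(beta * v) / beta


def fit_sinh(vs, currents, betas):
    """Grid-search beta; for each beta the best G is the closed-form least-squares scale.
    Returns (g, beta, rmse) for the candidate with the lowest RMSE."""
    best = None
    for beta in betas:
        basis = [math.sinh(beta * v) / beta for v in vs]
        g = sum(b * i for b, i in zip(basis, currents)) / sum(b * b for b in basis)
        rmse = math.sqrt(sum((g * b - i) ** 2 for b, i in zip(basis, currents)) / len(vs))
        if best is None or rmse < best[2]:
            best = (g, beta, rmse)
    return best


# Synthetic placeholder "measurements": 20 uS cell, beta = 3 /V, 10 nA noise.
rng = random.Random(0)
TRUE_G, TRUE_BETA, NOISE = 20e-6, 3.0, 1e-8
vs = [-0.5 + 0.025 * k for k in range(41)]          # volts
currents = [model(TRUE_G, TRUE_BETA, v) + rng.gauss(0.0, NOISE) for v in vs]
betas = [0.1 * k for k in range(1, 61)]             # candidate beta values

g_hat, beta_hat, rmse = fit_sinh(vs, currents, betas)
print(f"G={g_hat:.3e} S  beta={beta_hat:.1f} /V  RMSE={rmse:.2e} A")
```

With real curves, an RMSE well above the measurement noise floor would suggest that the sinh form itself, rather than its parameters, is the mismatch.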

What would settle it

Running the finetuned model on physical ReRAM hardware and measuring accuracy that is substantially lower than the simulated results would falsify the effectiveness claim.

Original abstract

Traditional CPU, GPU, and NPU architectures are increasingly limited by the von Neumann bottleneck. While In-Memory Computing (IMC) using ReRAM crossbar arrays offers a high-density, energy-efficient alternative, its practical deployment is constrained through their non-idealities. Existing hardware-aware training frameworks often require training from scratch, which is computationally prohibitive for modern large-scale models. In this work, we propose a finetuning-based hardware-aware training algorithm that enables robust DNN deployment on ReRAM with minimal training overhead. Our approach mitigates I-V non-linearity by applying a range-shrunk sinh transformation and incorporates retention errors directly into a regularization loss during the finetuning process. We evaluate our framework across models and tasks such as image classification and question-answering (QA). Experimental results demonstrate that our method achieves similar accuracy on large-scale models like ResNet18 and DeiT-Tiny as the base model. In-case of ImageNet for MobileNetV3 families the technique has only less than 2% accuracy degradation. Further, applying the technique on the SQuAD v2 dataset results in only 1 point degradation of F-1 score.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Finetuning with range-shrunk sinh and retention regularization is a reasonable practical step but rests on unvalidated hardware models.

The main thing here is a finetuning recipe that injects a range-shrunk sinh transform for I-V nonlinearity and adds a retention-error term to the loss. It avoids full retraining from scratch, which matters for larger models, and the abstract claims near-baseline accuracy on ResNet18, DeiT-Tiny, MobileNetV3 on ImageNet, and SQuAD v2 with only small drops.

What stands out is the shift to finetuning plus the specific pairing of those two hardware effects in one procedure. Prior hardware-aware work often trains from scratch or handles one non-ideality at a time, so this combination is at least a modest extension.

The soft spots are straightforward. The abstract supplies no baselines, ablations, error bars, or hardware measurements, and the stress-test note is on point: the sinh and retention models are used directly in training but never shown to match measured ReRAM curves or drift data. If those functional forms are off, the regularization optimizes for the wrong distribution and the reported robustness does not transfer. Without those checks the central claim stays provisional.

This is for people already working on ReRAM or other analog IMC accelerators who need a lightweight way to adapt existing checkpoints. A reader already familiar with the hardware-aware training literature will see the incremental nature immediately.

I would bring it to a reading group for the discussion on model fidelity, but I would not cite it yet. It deserves a serious referee only if the full manuscript adds device measurements and ablations; otherwise the evidence is too thin to justify the time.

Referee Report

1 major / 1 minor

Summary. The paper claims to introduce a finetuning-based approach for hardware-aware training of deep neural networks on ReRAM crossbar arrays. It applies a range-shrunk sinh transformation to model I-V non-linearity and incorporates retention errors into the regularization loss, aiming to mitigate both non-idealities at lower computational cost than training from scratch. Evaluations on ResNet18, DeiT-Tiny, MobileNetV3 (ImageNet), and SQuAD v2 show that the finetuned models achieve accuracy close to the base models, with less than 2% degradation for the MobileNetV3 family and a 1-point F1 drop on SQuAD v2.

Significance. Should the simulated hardware models prove representative of physical ReRAM behavior, this work could significantly lower the barrier to deploying large-scale models on energy-efficient in-memory computing hardware. The emphasis on finetuning rather than full retraining is a practical strength that aligns with the needs of modern large models. The approach also provides a concrete way to integrate device non-idealities into the training process.

major comments (1)

[Abstract] The central accuracy claims rely on the fidelity of the range-shrunk sinh I-V model and the retention-error model used in the regularization loss. However, no validation or comparison against physical ReRAM device measurements (e.g., conductance-voltage curves or time-dependent retention data) is provided in the manuscript. This is load-bearing for the claim of addressing actual hardware non-idealities, as mismatch would mean the regularization optimizes for an incorrect error distribution.

minor comments (1)

Minor grammatical and phrasing issues in the abstract: 'In-case of' should be 'In the case of'; 'the technique has only less than 2% accuracy degradation' could be rephrased for clarity as 'the technique results in less than 2% accuracy degradation'.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback highlighting the importance of model fidelity. We address the major comment below.

Referee: [Abstract] The central accuracy claims rely on the fidelity of the range-shrunk sinh I-V model and the retention-error model used in the regularization loss. However, no validation or comparison against physical ReRAM device measurements (e.g., conductance-voltage curves or time-dependent retention data) is provided in the manuscript. This is load-bearing for the claim of addressing actual hardware non-idealities, as mismatch would mean the regularization optimizes for an incorrect error distribution.

Authors: We agree that the fidelity of the device models is central to the claims. The range-shrunk sinh I-V model and retention-error regularization are based on models previously reported and characterized in the ReRAM device literature rather than new physical measurements collected for this study. Our primary contribution is the finetuning algorithm that incorporates these models with low overhead. We acknowledge that the manuscript does not include direct comparisons to new physical device data, which limits the strength of the hardware-representativeness claim. In revision we will add a dedicated subsection (likely in Section 3 or 4) that (i) cites the specific prior device studies from which the sinh and retention models are taken, (ii) summarizes any published conductance-voltage and retention curves those studies provide, and (iii) explicitly states the simulation assumptions and the consequent limitations. This will allow readers to evaluate applicability to real hardware without requiring new experiments in the current work.

revision: partial

Circularity Check

0 steps flagged

Empirical finetuning method with no circular derivations

The paper describes a practical finetuning procedure that injects a chosen range-shrunk sinh model and retention-error term into the regularization loss, then reports accuracy on standard benchmarks (ResNet18, DeiT-Tiny, MobileNetV3, SQuAD). No equations or results are shown to reduce to fitted parameters by construction, no self-citation chains support load-bearing uniqueness claims, and the central contribution is an empirical technique rather than a derived prediction. The method is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only abstract available; no explicit free parameters, axioms, or invented entities are described.


[1] [1]

1.1 Computing’s energy problem (and what we can do about it),

M. Horowitz, “1.1 Computing’s energy problem (and what we can do about it),” in 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), Feb. 2014, pp. 10–14. LIN et al.: RERAM-A W ARE MODEL FINETUNING ADDRESSING I-V NON-LINEARITY AND RETENTION ERRORS 11

2014

[2] [2]

LLM-NPU: Towards Eﬀicient Foundation Model Inference on Low-Power Neural Processing Units,

A. Raha, S. Kundu, S. N. Sridhar, S. Kundu, S. K. Ghosh, A. Palla, A. Das, D. Crews, and D. A. Mathaikutty, “LLM-NPU: Towards Eﬀicient Foundation Model Inference on Low-Power Neural Processing Units,” in 2025 IEEE International Conference on Omni-layer Intelligent Systems (COINS), Aug. 2025, pp. 1–8, iSSN: 2996-5330. [Online]. A vailable:https://ieeexplor...

arXiv 2025

[3] [3]

Edge intelligence through in-sensor and near-sensor computing for the artificial intelligence of things,

Y. Baek, B. Bae, H. Shin, C. Sonnadara, H. Cho, C.-Y. Lin, Y. Mu, C. Shen, S. Shah, G. Wang, and K. Lee, “Edge intelligence through in-sensor and near-sensor computing for the artificial intelligence of things,” npj Unconventional Computing, vol. 2, no. 1, p. 25, Oct. 2025. [Online]. A vailable: https://www.nature.com/articles/s44335-025-00040-6

2025

[4] [4]

In-Memory Computing: Advances and Prospects,

N. Verma, H. Jia, H. Valavi, Y. Tang, M. Ozatay, L.-Y. Chen, B. Zhang, and P. Deaville, “In-Memory Computing: Advances and Prospects,” IEEE Solid-State Circuits Magazine, vol. 11, no. 3, pp. 43–55, 2019. [Online]. A vailable: https: //ieeexplore.ieee.org/abstract/document/8811809

arXiv 2019

[5] [5]

Memory devices and applications for in- memory computing,

A. Sebastian, M. Le Gallo, R. Khaddam-Aljameh, and E. Eleftheriou, “Memory devices and applications for in- memory computing,” Nature Nanotechnology, vol. 15, no. 7, pp. 529–544, Jul. 2020. [Online]. A vailable: https://www. nature.com/articles/s41565-020-0655-z

2020

[6] [6]

In-memory computing with resistive switching devices,

D. Ielmini and H.-S. P. Wong, “In-memory computing with resistive switching devices,” Nature Electronics, vol. 1, no. 6, pp. 333–343, Jun. 2018. [Online]. A vailable: https: //www.nature.com/articles/s41928-018-0092-2

2018

[7] [7]

Eﬀicient nonlinear function approximation in analog resistive crossbars for recurrent neural networks,

J. Yang, R. Mao, M. Jiang, Y. Cheng, P.-S. V. Sun, S. Dong, G. Pedretti, X. Sheng, J. Ignowski, H. Li, C. Li, and A. Basu, “Eﬀicient nonlinear function approximation in analog resistive crossbars for recurrent neural networks,” Nature Communications, vol. 16, no. 1, p. 1136, Jan

[8] [8]

A vailable: https://www.nature.com/articles/ s41467-025-56254-6

[Online]. A vailable: https://www.nature.com/articles/ s41467-025-56254-6

[9] [9]

Characterization and Modeling of Multilevel Analog ReRAM Synapses in the Sky130 Process,

I. Didin, C. Brando, C.-Y. Lin, and S. Shah, “Characterization and Modeling of Multilevel Analog ReRAM Synapses in the Sky130 Process,” IEEE Journal on Exploratory Solid-State Computational Devices and Circuits, vol. 12, pp. 27–35,

[10] [10]

A vailable: https://ieeexplore.ieee.org/abstract/ document/11421367

[Online]. A vailable: https://ieeexplore.ieee.org/abstract/ document/11421367

arXiv

[11] [11]

Eﬀicient and optimized methods for alleviating the impacts of ir-drop and fault in rram based neural computing systems,

C. Huang, N. Xu, K. Qiu, Y. Zhu, D. Ma, and L. Fang, “Eﬀicient and optimized methods for alleviating the impacts of ir-drop and fault in rram based neural computing systems,” IEEE Journal of the Electron Devices Society, vol. 9, pp. 645–652, 2021

2021

[12] [12]

Offset-canceling current-sampling sense amplifier for resistive nonvolatile memory in 65 nm cmos,

T. Na, B. Song, J. P. Kim, S. H. Kang, and S.-O. Jung, “Offset-canceling current-sampling sense amplifier for resistive nonvolatile memory in 65 nm cmos,” IEEE Journal of Solid- State Circuits, vol. 52, no. 2, pp. 496–504, 2016

2016

[13] [13]

Mitigating the impact of reram iv nonlinearity and ir drop via fast offline network training,

S. Lee, M. E. Fouda, C. Quan, J. Lee, A. E. Eltawil, and F. Kurdahi, “Mitigating the impact of reram iv nonlinearity and ir drop via fast offline network training,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 44, no. 3, pp. 951–960, 2024

2024

[14] [14]

Geniex: A generalized approach to emulating non-ideality in memristive xbars using neural networks,

I. Chakraborty, M. F. Ali, D. E. Kim, A. Ankit, and K. Roy, “Geniex: A generalized approach to emulating non-ideality in memristive xbars using neural networks,” in 2020 57th ACM/IEEE Design Automation Conference (DAC). IEEE, 2020, pp. 1–6

2020

[15] [15]

Hardware-aware training for large-scale and diverse deep learning inference workloads using in-memory computing-based accelerators,

M. J. Rasch, C. Mackin, M. Le Gallo, A. Chen, A. Fasoli, F. Odermatt, N. Li, S. Nandakumar, P. Narayanan, H. Tsai et al., “Hardware-aware training for large-scale and diverse deep learning inference workloads using in-memory computing-based accelerators,” Nature communications, vol. 14, no. 1, p. 5282, 2023

2023

[16] [16]

Learning to train cnns on faulty reram-based manycore accelerators,

B. K. Joardar, J. R. Doppa, H. Li, K. Chakrabarty, and P. P. Pande, “Learning to train cnns on faulty reram-based manycore accelerators,” ACM Transactions on Embedded Computing Systems (TECS), vol. 20, no. 5s, pp. 1–23, 2021

2021

[17] [17]

A compute- in-memory chip based on resistive random-access memory,

W. Wan, R. Kubendran, C. Schaefer, S. B. Eryilmaz, W. Zhang, D. Wu, S. Deiss, P. Raina, H. Qian, B. Gao et al., “A compute- in-memory chip based on resistive random-access memory,” Nature, vol. 608, no. 7923, pp. 504–512, 2022

2022

[18] [18]

Current compliance-dependent nonlinearity in tio2 reram,

F. Lentz, B. Roesgen, V. Rana, D. J. Wouters, and R. Waser, “Current compliance-dependent nonlinearity in tio2 reram,” IEEE electron device letters, vol. 34, no. 8, pp. 996–998, 2013

2013

[19] [19]

A data-driven verilog-a reram model,

I. Messaris, A. Serb, S. Stathopoulos, A. Khiat, S. Nikolaidis, and T. Prodromakis, “A data-driven verilog-a reram model,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 37, no. 12, pp. 3151–3162, 2018

2018

[20] [20]

High precision tuning of state for memristive devices by adaptable variation-tolerant algorithm,

F. Alibart, L. Gao, B. D. Hoskins, and D. B. Strukov, “High precision tuning of state for memristive devices by adaptable variation-tolerant algorithm,” Nanotechnology, vol. 23, no. 7, p. 075201, 2012

2012

[21] [21]

Swipe: Enhancing robustness of reram crossbars for in-memory computing,

S. K. Gonugondla, A. D. Patil, and N. R. Shanbhag, “Swipe: Enhancing robustness of reram crossbars for in-memory computing,” in Proceedings of the 39th International Conference on Computer-Aided Design, 2020, pp. 1–9

2020

[22] [22]

Online fault detection in reram-based computing systems for inferencing,

M. Liu and K. Chakrabarty, “Online fault detection in reram-based computing systems for inferencing,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 30, no. 4, pp. 392–405, 2022

2022

[23] [23]

Craft: Criticality-aware fault-tolerance enhancement techniques for emerging memories-based deep neural networks,

T.-H. Nguyen, M. Imran, J. Choi, and J.-S. Yang, “Craft: Criticality-aware fault-tolerance enhancement techniques for emerging memories-based deep neural networks,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 42, no. 10, pp. 3289–3300, 2023

2023

[24] [24]

Fault-free: A framework for analysis and mitigation of stuck-at-fault on realistic reram-based dnn accelerators,

H. Shin, M. Kang, and L.-S. Kim, “Fault-free: A framework for analysis and mitigation of stuck-at-fault on realistic reram-based dnn accelerators,” IEEE Transactions on Computers, vol. 72, no. 7, pp. 2011–2024, 2022

2022

[25] [25]

Accurate evaluation method for hrs retention of vcm reram,

N. Kopperberg, D. Wouters, R. Waser, S. Menzel, and S. Wiefels, “Accurate evaluation method for hrs retention of vcm reram,” APL Materials, vol. 12, no. 3, 2024

2024

[26] [26]

Performance impacts of analog reram non-ideality on neuromorphic computing,

Y.-H. Lin, C.-H. Wang, M.-H. Lee, D.-Y. Lee, Y.-Y. Lin, F.-M. Lee, H.-L. Lung, K.-C. Wang, T.-Y. Tseng, and C.-Y. Lu, “Performance impacts of analog reram non-ideality on neuromorphic computing,” IEEE Transactions on Electron Devices, vol. 66, no. 3, pp. 1289–1295, 2019

2019

[27] [27]

A comprehensive statistical study of the post-programming conductance drift in hfo2-based memristive devices,

D. Maldonado, C. Acal, H. Ortiz, A. Aguilera, J. E. Ruiz-Castro, A. Cantudo, A. Baroni, K. D. S. Reddy, S. Pechmann, M. Uhlmann et al., “A comprehensive statistical study of the post-programming conductance drift in hfo2-based memristive devices,” Materials Science in Semiconductor Processing, vol. 196, p. 109668, 2025

2025

[28] [28]

Simulation of inference accuracy using realistic rram devices,

A. Mehonic, D. Joksas, W. H. Ng, M. Buckwell, and A. J. Kenyon, “Simulation of inference accuracy using realistic rram devices,” Frontiers in neuroscience, vol. 13, p. 593, 2019

2019

[29] [29]

Nonideality-aware training for accurate and robust low-power memristive neural networks,

D. Joksas, E. Wang, N. Barmpatsalos, W. H. Ng, A. J. Kenyon, G. A. Constantinides, and A. Mehonic, “Nonideality-aware training for accurate and robust low-power memristive neural networks,” Advanced Science, vol. 9, no. 17, p. 2105784, 2022

2022

[30] [30]

A quantized training framework for robust and accurate reram-based neural network accelerators,

C. Zhang and P. Zhou, “A quantized training framework for robust and accurate reram-based neural network accelerators,” in Proceedings of the 26th Asia and South Pacific Design Automation Conference, 2021, pp. 43–48

2021

[31] [31]

Error-aware training for in-rram computing design considering non-ideal effects in rram crossbar array and peripheral circuits,

S.-H. Chang, R.-H. Yen, and C.-N. Liu, “Error-aware training for in-rram computing design considering non-ideal effects in rram crossbar array and peripheral circuits,” ACM Journal on Emerging Technologies in Computing Systems, vol. 21, no. 2, pp. 1–22, 2025

2025

[32] [32]

Prime: A novel processing-in-memory architecture for neural network computation in reram-based main memory,

P. Chi, S. Li, C. Xu, T. Zhang, J. Zhao, Y. Liu, Y. Wang, and Y. Xie, “Prime: A novel processing-in-memory architecture for neural network computation in reram-based main memory,” ACM SIGARCH Computer Architecture News, vol. 44, no. 3, pp. 27–39, 2016

2016

[33] [33]

A voltage-mode sensing scheme with differential-row weight mapping for energy-efficient rram-based in-memory computing,

W. Wan, R. Kubendran, B. Gao, S. Joshi, P. Raina, H. Wu, G. Cauwenberghs, and H. P. Wong, “A voltage-mode sensing scheme with differential-row weight mapping for energy-efficient rram-based in-memory computing,” in 2020 IEEE Symposium on VLSI Technology. IEEE, 2020, pp. 1–2

2020

[34] [34]

A fast iterative shrinkage-thresholding algorithm for linear inverse problems,

A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,” SIAM journal on imaging sciences, vol. 2, no. 1, pp. 183–202, 2009

2009

[35] [35]

Imagenet: A large-scale hierarchical image database,

J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “Imagenet: A large-scale hierarchical image database,” in 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 248–255

2009

[36] [36]

Know what you don’t know: Unanswerable questions for squad,

P. Rajpurkar, R. Jia, and P. Liang, “Know what you don’t know: Unanswerable questions for squad,” in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2018, pp. 784–789

2018

[37] [37]

Mlperf inference benchmark,

V. J. Reddi, C. Cheng, D. Kanter, P. Mattson, G. Schmuelling, C.-J. Wu, B. Anderson, M. Breughe, M. Charlebois, W. Chou et al., “Mlperf inference benchmark,” in 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA). IEEE, 2020, pp. 446–459

2020

[38] [38]

Searching for mobilenetv3,

A. Howard, M. Sandler, G. Chu, L.-C. Chen, B. Chen, M. Tan, W. Wang, Y. Zhu, R. Pang, V. Vasudevan et al., “Searching for mobilenetv3,” in Proceedings of the IEEE/CVF international conference on computer vision, 2019, pp. 1314–1324

2019

[39] [39]

Deep residual learning for image recognition,

K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770–778

2016

[40] [40]

Training data-efficient image transformers & distillation through attention,

H. Touvron, M. Cord, M. Douze, F. Massa, A. Sablayrolles, and H. Jégou, “Training data-efficient image transformers & distillation through attention,” in International conference on machine learning. PMLR, 2021, pp. 10347–10357

2021

[41] [41]

Mobilebert: a compact task-agnostic bert for resource-limited devices,

Z. Sun, H. Yu, X. Song, R. Liu, Y. Yang, and D. Zhou, “Mobilebert: a compact task-agnostic bert for resource-limited devices,” in Proceedings of the 58th annual meeting of the association for computational linguistics, 2020, pp. 2158–2170

2020

[42] [42]

Tinybert: Distilling bert for natural language understanding,

X. Jiao, Y. Yin, L. Shang, X. Jiang, X. Chen, L. Li, F. Wang, and Q. Liu, “Tinybert: Distilling bert for natural language understanding,” in Findings of the association for computational linguistics: EMNLP 2020, 2020, pp. 4163–4174

2020

[43] [43]

Universal language model fine-tuning for text classification,

J. Howard and S. Ruder, “Universal language model fine-tuning for text classification,” in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2018, pp. 328–339

2018

[44] [44]

A simple weight decay can improve generalization,

A. Krogh and J. Hertz, “A simple weight decay can improve generalization,” Advances in neural information processing systems, vol. 4, 1991

1991

[45] [45]

Ridge regression: Biased estimation for nonorthogonal problems,

A. E. Hoerl and R. W. Kennard, “Ridge regression: Biased estimation for nonorthogonal problems,” Technometrics, vol. 12, no. 1, pp. 55–67, 1970

[Fig. 13: Probability density functi... Panels: empirical pdf of w (f_W) versus weight; empirical pdf of g (f_G) versus conductance G. Legend values: 0.001, 0.003, 0.006, 0.012, 0.024.]

1970