Defensive Quantization: When Efficiency Meets Robustness

2019 · cs.LG · arXiv 1904.08444

2 Pith papers cite this work. Polarity classification is still indexing.


abstract

Neural network quantization is becoming an industry standard for efficiently deploying deep learning models on hardware platforms such as CPUs, GPUs, TPUs, and FPGAs. However, we observe that conventional quantization approaches are vulnerable to adversarial attacks. This paper aims to raise awareness of the security of quantized models, and we design a novel quantization methodology that jointly optimizes the efficiency and robustness of deep learning models. We first conduct an empirical study showing that vanilla quantization suffers more from adversarial attacks. We observe that the inferior robustness comes from the error amplification effect, in which the quantization operation further enlarges the distance caused by amplified noise. We then propose a novel Defensive Quantization (DQ) method that controls the Lipschitz constant of the network during quantization, so that the magnitude of the adversarial noise remains non-expansive during inference. Extensive experiments on the CIFAR-10 and SVHN datasets demonstrate that our new quantization method can defend neural networks against adversarial examples, and even achieves robustness superior to that of their full-precision counterparts while maintaining the same hardware efficiency as vanilla quantization approaches. As a by-product, DQ can also improve the accuracy of quantized models in the absence of adversarial attacks.
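
The abstract names two mechanisms without showing them: quantization enlarging a small input distance, and a Lipschitz constraint that keeps noise non-expansive. The sketch below illustrates both in plain Python. It is not taken from the paper's code. The function names (`quantize_uniform`, `orthogonality_penalty`) are hypothetical, and the orthogonality penalty is an assumed stand-in for "controlling the Lipschitz constant"; the abstract does not specify the regularizer actually used.

```python
def quantize_uniform(x, bits=4, lo=0.0, hi=1.0):
    """Clamp x to [lo, hi], then round to the nearest of 2**bits evenly spaced levels."""
    intervals = 2 ** bits - 1          # number of gaps between adjacent levels
    x = min(max(x, lo), hi)            # clamp into the representable range
    step = (hi - lo) / intervals
    return lo + round((x - lo) / step) * step


def gram(w):
    """Return W^T W for a matrix W given as a list of rows."""
    rows, cols = len(w), len(w[0])
    return [[sum(w[k][i] * w[k][j] for k in range(rows)) for j in range(cols)]
            for i in range(cols)]


def orthogonality_penalty(w):
    """Squared Frobenius norm of (W^T W - I).

    Assumed illustrative regularizer: it is zero exactly when the columns of W
    are orthonormal, in which case the linear map x -> W x preserves distances
    and therefore cannot amplify a perturbation.
    """
    g = gram(w)
    n = len(g)
    return sum((g[i][j] - (1.0 if i == j else 0.0)) ** 2
               for i in range(n) for j in range(n))


# Error amplification: two activations 0.02 apart land on different 1-bit levels,
# so their distance after quantization is 1.0.
a = quantize_uniform(0.49, bits=1)
b = quantize_uniform(0.51, bits=1)
print(abs(b - a))

# A layer that doubles its input is penalized; an orthogonal layer is not.
print(orthogonality_penalty([[1.0, 0.0], [0.0, 1.0]]))
print(orthogonality_penalty([[2.0, 0.0], [0.0, 2.0]]))
```

In a training loop, a penalty of this kind would typically be summed over layers and added to the task loss with a small weight; that weight is a hyperparameter not given in the abstract.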

representative citing papers

Boundary-Aware Quantization: Finite-Scale Decision Geometry of Neural Classifiers

math.OC · 2026-07-01 · unverdicted · novelty 4.0

Quantization of neural classifiers produces measurable boundary shifts captured by Jaccard distances and flip rates that correlate between calibration and held-out sets across bit widths.

Smaller Models, Unexpected Costs: Trade-offs in LLM Quantization for Automated Program Repair

cs.SE · 2026-06-25 · unverdicted · novelty 3.0

Empirical evaluation of 13 quantization configurations on 6 LLMs for APR shows reduced memory (up to 85%) but increased inference time/energy, different repaired problem sets with little overlap, and 48% of configs strictly dominated.

citing papers explorer

Showing 2 of 2 citing papers.

Boundary-Aware Quantization: Finite-Scale Decision Geometry of Neural Classifiers

math.OC · 2026-07-01 · unverdicted · none · ref 9 · internal anchor

Smaller Models, Unexpected Costs: Trade-offs in LLM Quantization for Automated Program Repair

cs.SE · 2026-06-25 · unverdicted · none · ref 23 · internal anchor
