Integer quantization for deep learning inference: Principles and empirical evaluation

Integer quantization for deep learning inference: Principles and empirical evaluation · 2020 · arXiv 2004.09602

10 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Transformers Provably Learn to Internalize Chain-of-Thought

cs.LG · 2026-05-27 · unverdicted · novelty 8.0

L-layer transformers under Log-ICoT curriculum provably learn k-parity with poly(n) samples and log k stages, matching explicit CoT efficiency without inference overhead.

Learning through Internalization

cs.LG · 2026-06-18 · unverdicted · novelty 7.0

A simplified one-layer transformer provably learns parities, first under explicit CoT supervision and then by internalizing the reasoning into direct computation as CoT tokens are removed.

Quantamination: Dynamic Quantization Leaks Your Data Across the Batch

cs.CR · 2026-04-29 · conditional · novelty 7.0

Dynamic quantization creates side channels allowing partial or full recovery of other users' batched data in at least four popular ML frameworks.

DharmaOCR: Specialized Small Language Models for Structured OCR that outperform Open-Source and Commercial Baselines

cs.CV · 2026-04-15 · unverdicted · novelty 7.0

DharmaOCR models reach 0.925 and 0.911 extraction scores with 0.40% and 0.20% degeneration rates on a new benchmark covering printed, handwritten, and legal documents, outperforming open-source and commercial baselines via SFT plus DPO.

LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale

cs.LG · 2022-08-15 · conditional · novelty 7.0

LLM.int8() performs 8-bit inference for transformers up to 175B parameters with no accuracy loss by combining vector-wise quantization for most features with 16-bit mixed-precision handling of systematic outlier dimensions.

QuantSR+: Pushing the Limit of Quantized Image Super-Resolution Networks

cs.CV · 2026-05-21 · unverdicted · novelty 6.0

QuantSR+ introduces the RBD, QSA, and SFD techniques to achieve state-of-the-art accuracy-efficiency trade-offs in 2-4 bit quantized image super-resolution networks, with reported PSNR gains such as 0.29 dB on Urban100 for SwinIR-S.

QuIDE: Mastering the Quantized Intelligence Trade-off via Active Optimization

cs.LG · 2026-05-05 · unverdicted · novelty 6.0

QuIDE defines the Intelligence Index I = (C × P) / log₂(T+1) as a unified score for the compression-accuracy-latency trade-off in quantized neural networks, with experiments showing task-dependent optimal bit widths.

FP8 Formats for Deep Learning

cs.LG · 2022-09-12 · unverdicted · novelty 6.0

FP8 formats E4M3 and E5M2 match 16-bit training accuracy on CNNs, RNNs, and Transformers up to 175B parameters without hyperparameter changes.

Edge AI for Automotive Vulnerable Road User Safety: Deployable Detection via Knowledge Distillation

cs.CV · 2026-04-29 · unverdicted · novelty 5.0

Knowledge distillation trains a 3.9x smaller YOLO student to retain 14.5% higher precision than direct training under INT8 quantization on BDD100K, exceeding the large teacher's FP32 precision while cutting false alarms.

Enhanced Privacy and Communication Efficiency in Non-IID Federated Learning with Adaptive Quantization and Differential Privacy

cs.CV · 2026-04-25 · unverdicted · novelty 5.0

Adaptive bit-length schedulers plus Laplacian DP in non-IID FL reduce communicated data by up to 52.64% on MNIST and 45% on CIFAR-10 while keeping competitive accuracy and privacy.

citing papers explorer

Transformers Provably Learn to Internalize Chain-of-Thought

cs.LG · 2026-05-27 · unverdicted · none · ref 51

Learning through Internalization

cs.LG · 2026-06-18 · unverdicted · none · ref 28

LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale

cs.LG · 2022-08-15 · conditional · none · ref 170

QuIDE: Mastering the Quantized Intelligence Trade-off via Active Optimization

cs.LG · 2026-05-05 · unverdicted · none · ref 6

FP8 Formats for Deep Learning

cs.LG · 2022-09-12 · unverdicted · none · ref 23
