
arxiv: 2508.19600 · v3 · submitted 2025-08-27 · 💻 cs.CV

Quantization Robustness to Input Degradations for Object Detection

Pith reviewed 2026-05-18 21:17 UTC · model grok-4.3

classification 💻 cs.CV
keywords post-training quantization · object detection · YOLO · model robustness · input degradations · INT8 quantization · TensorRT calibration

The pith

Static INT8 quantization speeds up YOLO detectors but degradation-aware calibration rarely enhances their robustness to input degradations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines whether calibrating post-training quantized object detectors with a mix of clean and degraded images improves performance when real inputs include noise, blur, low contrast, or compression artifacts. It tests YOLO models ranging from nano to extra-large scales in FP32, FP16, and INT8 formats on the COCO dataset across seven degradation conditions plus a mixed case. Results show clear speed gains from static INT8 but find that the degradation-aware calibration strategy seldom outperforms standard clean-data calibration, except for larger models facing certain noise types. This matters because efficient detectors on edge devices must handle variable input quality in uncontrolled settings.
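
To see why the choice of calibration images could matter at all, here is a minimal sketch of symmetric per-tensor INT8 quantization with a max-absolute-value range. It is an illustration only: the function names and sample values are hypothetical, and TensorRT's calibrators may choose ranges differently. The point is that a wider activation range seen during calibration produces a coarser quantization step.

```python
def int8_scale(calibration_values):
    """Symmetric per-tensor scale from the largest magnitude seen in calibration."""
    amax = max(abs(v) for v in calibration_values)
    return amax / 127.0 if amax > 0 else 1.0


def quantize(value, scale):
    """Map a float to a signed 8-bit level, clamping values outside the range."""
    return max(-127, min(127, round(value / scale)))


def dequantize(level, scale):
    """Map an 8-bit level back to an approximate float."""
    return level * scale


# Hypothetical activation samples, not values from the paper.
clean_activations = [-1.0, 0.5, 2.54]
mixed_activations = clean_activations + [12.7]  # one outlier from a degraded input

clean_scale = int8_scale(clean_activations)
mixed_scale = int8_scale(mixed_activations)
```

With these made-up numbers the mixed range is five times wider, so typical activations are represented with a step five times coarser, while the outlier is no longer clipped. Whether that trade helps detection accuracy is the question the paper answers empirically.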

Core claim

Static INT8 TensorRT engines deliver substantial speedups of roughly 1.5 to 3.3 times with a moderate accuracy drop of 3 to 7 percent mAP50-95 on clean data, yet the proposed degradation-aware calibration strategy does not yield consistent, broad improvements in robustness over standard clean-data calibration across most models and degradations, with a notable exception for larger model scales under specific noise conditions.

What carries the argument

The degradation-aware calibration strategy for Static INT8 PTQ, which mixes clean and synthetically degraded images during the TensorRT calibration process.
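
The mixing step can be sketched as a sampling plan over calibration images. This is an assumption-laden illustration, not the authors' code: `build_calibration_plan`, the 50% default fraction, and the degradation names are all hypothetical, and the paper's actual mixing ratio is not given in this summary (the referee report asks for it).

```python
import random


def build_calibration_plan(image_paths, degradations, degraded_fraction=0.5, seed=0):
    """Return (path, degradation_or_None) pairs; None means the image stays clean.

    degraded_fraction is a hypothetical knob, not a value taken from the paper.
    """
    if not 0.0 <= degraded_fraction <= 1.0:
        raise ValueError("degraded_fraction must be in [0, 1]")
    paths = list(image_paths)
    n_degraded = round(len(paths) * degraded_fraction)
    if n_degraded > 0 and not degradations:
        raise ValueError("need at least one degradation to build a mixed set")

    rng = random.Random(seed)  # fixed seed so the plan is reproducible
    rng.shuffle(paths)
    plan = [(p, rng.choice(degradations)) for p in paths[:n_degraded]]
    plan += [(p, None) for p in paths[n_degraded:]]
    rng.shuffle(plan)  # interleave clean and degraded samples
    return plan


# Hypothetical file names and degradation labels.
plan = build_calibration_plan(
    [f"img_{i}.jpg" for i in range(10)],
    ["gaussian_noise", "blur", "low_contrast", "jpeg"],
    degraded_fraction=0.4,
)
```

Setting `degraded_fraction=0.0` reproduces a clean-only plan, which makes the clean versus degradation-aware comparison a one-parameter change in this sketch.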

If this is right

  • Quantized detectors achieve notable inference speedups at the cost of moderate accuracy loss on clean inputs.
  • Robustness to input degradations stays largely unchanged by mixing degraded images into calibration for most model sizes and degradation types.
  • Larger-scale models can show targeted robustness improvements under noise when using the mixed calibration approach.
  • Under the tested conditions, standard clean-data calibration is often sufficient when deploying quantized detectors in environments with variable input quality.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Real captured degradations from sensors may produce different robustness patterns than the synthetic versions used here.
  • The influence of model capacity on calibration effectiveness suggests that scaling laws could be tested by comparing even larger architectures.
  • Pairing calibration adjustments with training-time augmentation for degradations might produce stronger combined robustness than either alone.

Load-bearing premise

Synthetic degradations applied to COCO validation images sufficiently represent the real-world input degradations that quantized detectors will encounter at deployment time.
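
For readers unfamiliar with what "synthetic degradations" means in practice, the following pure-Python sketch applies three simple ones to a grayscale image stored as a list of rows. These are stand-ins chosen for illustration; the function names and parameters are hypothetical, and the paper's actual pipeline and severity levels are not reproduced here.

```python
import random


def _clamp(value):
    """Round to the nearest integer and keep it in the 8-bit range."""
    return min(255, max(0, round(value)))


def add_gaussian_noise(image, sigma, seed=0):
    """Add independent zero-mean Gaussian noise to every pixel."""
    rng = random.Random(seed)
    return [[_clamp(p + rng.gauss(0.0, sigma)) for p in row] for row in image]


def reduce_contrast(image, factor):
    """Pull pixels toward mid-gray (128); factor=1 keeps the image, 0 flattens it."""
    return [[_clamp(128 + factor * (p - 128)) for p in row] for row in image]


def box_blur(image, radius=1):
    """Replace each pixel by the mean of its (2*radius+1)^2 neighborhood."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                image[yy][xx]
                for yy in range(max(0, y - radius), min(h, y + radius + 1))
                for xx in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            row.append(_clamp(sum(window) / len(window)))
        out.append(row)
    return out


# Tiny hypothetical image: a 2x2 checkerboard.
checker = [[0, 255], [255, 0]]
noisy = add_gaussian_noise(checker, sigma=50.0)
flat = reduce_contrast(checker, factor=0.0)
```

Note that each function perturbs pixels independently or with a fixed kernel. Real sensor noise and motion blur are often spatially correlated, which is the gap the premise above points at.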

What would settle it

Evaluating the same quantized models on a dataset of authentic degraded images captured by real cameras in uncontrolled conditions, such as motion-blurred video frames or noisy low-light photos, and checking whether mAP differences favor the degradation-aware calibration.

Original abstract

Post-training quantization (PTQ) is crucial for deploying efficient object detection models, like YOLO, on resource-constrained devices. However, the impact of reduced precision on model robustness to real-world input degradations such as noise, blur, and compression artifacts is a significant concern. This paper presents a comprehensive empirical study evaluating the robustness of YOLO models (nano to extra-large scales) across multiple precision formats: FP32, FP16 (TensorRT), Dynamic UINT8 (ONNX), and Static INT8 (TensorRT). We introduce and evaluate a degradation-aware calibration strategy for Static INT8 PTQ, where the TensorRT calibration process is exposed to a mix of clean and synthetically degraded images. Models were benchmarked on the COCO dataset under seven distinct degradation conditions (including various types and levels of noise, blur, low contrast, and JPEG compression) and a mixed-degradation scenario. Results indicate that while Static INT8 TensorRT engines offer substantial speedups (~1.5-3.3x) with a moderate accuracy drop (~3-7% mAP50-95) on clean data, the proposed degradation-aware calibration did not yield consistent, broad improvements in robustness over standard clean-data calibration across most models and degradations. A notable exception was observed for larger model scales under specific noise conditions, suggesting model capacity may influence the efficacy of this calibration approach. These findings highlight the challenges in enhancing PTQ robustness and provide insights for deploying quantized detectors in uncontrolled environments. All code and evaluation tables are available at https://github.com/AllanK24/QRID.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript reports an empirical study of post-training quantization (PTQ) robustness for YOLO object detectors (nano to extra-large) under synthetic input degradations on COCO. It benchmarks FP32, FP16 (TensorRT), Dynamic UINT8 (ONNX), and Static INT8 (TensorRT) formats, introduces a degradation-aware calibration strategy for Static INT8 that mixes clean and degraded images during TensorRT calibration, and finds that this strategy yields no consistent robustness gains over standard clean-data calibration across most models and degradations, with a possible exception for larger models under specific noise conditions. Substantial speedups (1.5–3.3×) are reported alongside moderate clean mAP drops (3–7%). All code and tables are released publicly.

Significance. If the empirical findings hold, the work provides practical guidance on the limited benefits of degradation-aware PTQ calibration for quantized detectors and highlights model-scale dependence, which is useful for deployment decisions in uncontrolled environments. The public code release and use of standard COCO benchmarks with multiple backends are positive for reproducibility.

major comments (2)
  1. [§4 (Experimental Setup) and §5 (Results)] The central claim that degradation-aware calibration produces no consistent robustness gains rests on the synthetic degradation pipeline (noise, blur, JPEG, low contrast, mixed) applied independently to COCO validation images. No comparison to real-world degraded datasets or spatially correlated artifacts is presented, which directly affects whether the negative result generalizes beyond the chosen proxy distribution.
  2. [Results tables and §5.2] Table 3 (or equivalent results table) reports mAP drops without error bars or statistical significance tests across multiple random seeds or calibration runs. This weakens the interpretation of the “notable exception” for larger models under noise, as the observed differences could fall within run-to-run variability.
minor comments (2)
  1. [§3.2] Clarify the exact composition and sampling ratios of the mixed clean/degraded calibration set used for TensorRT Static INT8; the description in the methods is high-level.
  2. [Figures 2–4] Figure captions should explicitly state whether error bars represent standard deviation over seeds or over degradation levels.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback. We address each major comment below and describe the revisions we will incorporate to strengthen the manuscript.

Point-by-point responses
  1. Referee: [§4 (Experimental Setup) and §5 (Results)] The central claim that degradation-aware calibration produces no consistent robustness gains rests on the synthetic degradation pipeline (noise, blur, JPEG, low contrast, mixed) applied independently to COCO validation images. No comparison to real-world degraded datasets or spatially correlated artifacts is presented, which directly affects whether the negative result generalizes beyond the chosen proxy distribution.

    Authors: We appreciate the referee highlighting this scope limitation. Our design uses synthetic degradations as controlled, reproducible proxies to isolate individual artifact effects across scales and levels, which is standard for such systematic PTQ studies. We agree this does not directly address real-world degradations with spatial correlations or natural combinations. In the revision we will add an explicit limitations paragraph in §5 and the conclusions that acknowledges the proxy nature of the pipeline, clarifies that our negative finding on degradation-aware calibration applies to these synthetic conditions, and identifies validation on real-world degraded datasets as an important direction for future work.

    revision: yes

  2. Referee: [Results tables and §5.2] Table 3 (or equivalent results table) reports mAP drops without error bars or statistical significance tests across multiple random seeds or calibration runs. This weakens the interpretation of the “notable exception” for larger models under noise, as the observed differences could fall within run-to-run variability.

    Authors: We agree that error bars and variability analysis would make the interpretation of the larger-model noise exception more rigorous. TensorRT calibration is largely deterministic for a fixed dataset, yet we will introduce controlled variability by repeating calibration with different random shuffles of the calibration images. In the revised tables we will report mean mAP with standard deviations for the noise conditions, and we will update §5.2 to discuss whether the observed differences exceed this variability.

    revision: yes
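
The variability check proposed in the second response can be sketched as follows. This is a rough heuristic under stated assumptions, not a formal significance test and not the authors' analysis: `gap_exceeds_variability`, the threshold `k`, and every mAP value below are hypothetical placeholders rather than results from the paper.

```python
from statistics import mean, stdev


def gap_exceeds_variability(clean_runs, aware_runs, k=2.0):
    """Compare the mean mAP gap between two calibration strategies to run-to-run spread.

    Returns (gap, pooled_std, flag), where flag is True when |gap| > k * pooled_std.
    Each list holds one mAP per repeated calibration run (at least two runs each).
    """
    gap = mean(aware_runs) - mean(clean_runs)
    pooled_std = ((stdev(clean_runs) ** 2 + stdev(aware_runs) ** 2) / 2) ** 0.5
    return gap, pooled_std, abs(gap) > k * pooled_std


# Placeholder mAP50-95 values for three reshuffled calibration runs per strategy.
clean_calibrated = [0.30, 0.31, 0.29]
degradation_aware = [0.35, 0.36, 0.34]

gap, spread, notable = gap_exceeds_variability(clean_calibrated, degradation_aware)
```

If calibration turns out to be nearly deterministic, the spread will be close to zero and almost any gap will pass this check, so the number of repeats and the source of randomness should be reported alongside the result.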

Circularity Check

0 steps flagged

Purely empirical benchmarking with no derivations or self-referential predictions

Full rationale

The paper is a direct empirical study measuring mAP drops and speedups for YOLO models under synthetic degradations on COCO validation images. It reports experimental outcomes from running FP32, FP16, Dynamic UINT8, and Static INT8 engines with clean versus degradation-aware calibration; no equations, first-principles derivations, fitted parameters renamed as predictions, or load-bearing self-citations appear. All claims reduce to tabulated measurements rather than any chain that collapses to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central empirical claim rests on the assumption that the chosen synthetic degradations and COCO-based test protocol capture deployment-relevant conditions; no free parameters or invented entities are introduced.

axioms (1)
  • domain assumption: Synthetic degradations (noise, blur, low contrast, JPEG) applied to COCO images adequately proxy real-world input degradations.
    The paper evaluates robustness exclusively on these constructed test conditions.

pith-pipeline@v0.9.0 · 5822 in / 1153 out tokens · 43818 ms · 2026-05-18T21:17:15.743409+00:00 · methodology


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Edge AI for Automotive Vulnerable Road User Safety: Deployable Detection via Knowledge Distillation

    cs.CV · 2026-04 · unverdicted · novelty 5.0

    Knowledge distillation trains a 3.9x smaller YOLO student to retain 14.5% higher precision than direct training under INT8 quantization on BDD100K, exceeding the large teacher's FP32 precision while cutting false alarms.
