Evolutionary fine tuning of quantized convolution-based deep learning models

Marcin Pietroń

arxiv: 2605.05228 · v1 · submitted 2026-04-19 · 💻 cs.LG · cs.AI · cs.NE

Pith reviewed 2026-05-10 06:21 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · cs.NE

keywords quantized deep learning · evolutionary optimization · model compression · convolutional networks · accuracy improvement · fine tuning · post-training optimization

The pith

Evolutionary adjustment of a small fraction of weights can improve the accuracy of already quantized deep learning models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Standard nearest-neighbor quantization maps the weights of a pretrained network to a small set of discrete values to cut memory and compute costs, yet this rounding step often leaves some accuracy on the table. The paper tests an evolution strategy that repeatedly changes the quantization state of only a small percentage of weights, using a selected set of operators to explore nearby assignments. Experiments apply the method to pretrained quantized VGG and ResNet models on image classification and detection tasks as well as to autoencoders. The results indicate that suitable operator choices and parameters allow the evolutionary process to raise accuracy in relatively few iterations.
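The rounding step described above can be sketched in a few lines. This is a minimal illustration, assuming uniform levels on a fixed range; the helper names (`make_levels`, `nearest_level_index`, `quantize`), the 2-bit width, and the sample weights are hypothetical and are not taken from the paper.

```python
def make_levels(w_min, w_max, bits):
    """Return 2**bits evenly spaced quantization levels on [w_min, w_max]."""
    n = 2 ** bits
    step = (w_max - w_min) / (n - 1)
    return [w_min + i * step for i in range(n)]


def nearest_level_index(w, levels):
    """Index of the level closest to w (ties go to the lower index)."""
    return min(range(len(levels)), key=lambda i: abs(levels[i] - w))


def quantize(weights, levels):
    """Nearest-neighbor rounding: return level indices and dequantized values."""
    idx = [nearest_level_index(w, levels) for w in weights]
    return idx, [levels[i] for i in idx]


# Illustrative weights only (assumption, not data from the paper).
weights = [-0.9, -0.26, 0.1, 0.45, 0.8]
levels = make_levels(-1.0, 1.0, bits=2)  # roughly [-1, -1/3, 1/3, 1]
indices, quantized = quantize(weights, levels)
```

Each weight is rounded independently of the others, which is why the resulting assignment need not be the best one for the network's output as a whole.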

Core claim

The final quantization states obtained by nearest-neighbor rounding do not guarantee optimal accuracy; an evolution strategy that, in each iteration, shifts a small percentage of weights to different quantization states can quickly improve the accuracy of quantized models.

What carries the argument

Evolution strategy that iteratively perturbs a small percentage of quantized weights to new quantization levels using a chosen set of operators and parameters.
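One way to picture this loop is the sketch below. It is not the paper's algorithm: the abstract does not state the operators, population size, or fitness details, so the single "shift to an adjacent level" mutation, the (1+lambda)-style selection, the parameter values, and the toy fitness (negative output error of one linear layer instead of model accuracy) are all assumptions made for illustration.

```python
import random


def nearest_indices(weights, levels):
    """Nearest-neighbor rounding: index of the closest level for each weight."""
    return [min(range(len(levels)), key=lambda i: abs(levels[i] - w)) for w in weights]


def mutate(indices, n_levels, fraction, rng):
    """Shift a small fraction of weights to an adjacent quantization level."""
    child = list(indices)
    k = max(1, round(fraction * len(child)))
    for pos in rng.sample(range(len(child)), k):
        shifted = child[pos] + rng.choice((-1, 1))
        child[pos] = min(n_levels - 1, max(0, shifted))  # clamp to valid levels
    return child


def evolve(start, n_levels, fitness, fraction=0.1, offspring=8, iterations=40, seed=0):
    """(1+lambda)-style search: keep the parent unless a child is strictly better."""
    rng = random.Random(seed)
    parent, parent_fit = list(start), fitness(start)
    for _ in range(iterations):
        children = [mutate(parent, n_levels, fraction, rng) for _ in range(offspring)]
        scored = [(fitness(c), c) for c in children]
        best_fit, best_child = max(scored, key=lambda fc: fc[0])
        if best_fit > parent_fit:
            parent, parent_fit = best_child, best_fit
    return parent, parent_fit


# Toy stand-in for "accuracy" (assumption): negative squared output error.
levels = [-1.0, -1 / 3, 1 / 3, 1.0]
weights = [0.15, -0.2, 0.5, 0.05, -0.6, 0.7, -0.1, 0.3]
data_rng = random.Random(1)
inputs = [[data_rng.uniform(0.0, 1.0) for _ in weights] for _ in range(32)]


def fitness(indices):
    error = 0.0
    for x in inputs:
        exact = sum(w * xi for w, xi in zip(weights, x))
        quant = sum(levels[i] * xi for i, xi in zip(indices, x))
        error += (exact - quant) ** 2
    return -error


start = nearest_indices(weights, levels)
start_fit = fitness(start)
best, best_fit = evolve(start, len(levels), fitness)
```

Because the parent is replaced only by a strictly better child, the fitness on the data used for the search can never get worse; whether that carries over to unseen data is a separate question.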

Load-bearing premise

Evolutionary shifts of a small percentage of weights will reliably increase accuracy on unseen data without overfitting to the validation set used during the search.

What would settle it

Applying the evolved model to a completely held-out test set that played no role in the evolutionary selection and observing no accuracy gain over the original nearest-neighbor quantized model.
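The check described here amounts to keeping three disjoint data subsets and measuring the final gain only on the one the search never touched. The sketch below assumes an index-based split; `split_indices`, `accuracy`, `held_out_gain`, the split fractions, and the toy predictors are hypothetical names and values, not part of the paper.

```python
import random


def split_indices(n_samples, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle sample indices once and cut them into disjoint train/val/test sets."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    n_val = int(n_samples * val_fraction)
    n_test = int(n_samples * test_fraction)
    val = order[:n_val]                   # drives the evolutionary fitness
    test = order[n_val:n_val + n_test]    # used only for the final report
    train = order[n_val + n_test:]
    return train, val, test


def accuracy(predict, labels, subset):
    """Fraction of samples in `subset` that `predict` labels correctly."""
    return sum(predict(i) == labels[i] for i in subset) / len(subset)


def held_out_gain(baseline, evolved, labels, test_idx):
    """Accuracy difference on data that played no role in the search."""
    return accuracy(evolved, labels, test_idx) - accuracy(baseline, labels, test_idx)


# Toy predictors and labels (assumption, for illustration only).
labels = [i % 2 for i in range(100)]
train_idx, val_idx, test_idx = split_indices(len(labels))
baseline = lambda i: 0       # stands in for the nearest-neighbor quantized model
evolved = lambda i: i % 2    # stands in for the evolved model
gain = held_out_gain(baseline, evolved, labels, test_idx)
```

A gain near zero on `test_idx`, despite an improvement on `val_idx`, would be the outcome that counts against the claim.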

Original abstract

Deep learning models are the most efficient models in many machine learning tasks. The main disadvantage when using them in IoT, mobile devices, independent autonomous or real-time systems is their complexity and memory size. Therefore, much research has concentrated on compression techniques of deep learning architectures. One of the most popular technique is quantization. In most of the works, the quantization is done based on the nearest neighbour quantization technique. This work focuses on improving the quantization efficiency in pretrained and quantized models. This approach has the potential to improve the final accuracy of quantized models. The main postulate of the work is that final quantization states of the network based on nearest neighbour rounding does not guarantee optimal accuracy. In the presented work, the evolution strategy is used as an optimization approach. The evolution in each iteration changes the values of the small percentage of weights. It shifts theirs values to different quantization states. The work shows that proposed evolution with an appropriate set of operators and parameters can fast improve the accuracy of the quantized models. The results are presented for popular architectures such as VGG and Resnet for image classification and detection. Additionally, simulations were carried out for the autoencoder architecture.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper applies evolutionary search to nudge a few weights across quantization bins after standard rounding, but the abstract gives no numbers or evaluation details to back the accuracy claims.

The main point is that nearest-neighbor quantization leaves some accuracy on the table, and a simple evolutionary tweak that moves a small percentage of weights to alternate bins can recover part of it on VGG, ResNet, and autoencoders. The author treats the final quantization states as an optimization target rather than a fixed rounding outcome, which is a reasonable practical step for deployment on constrained hardware. The method uses standard evolution operators, and the abstract reports that the process runs quickly with the right parameters. That framing is clear and directly addresses a known pain point in post-training compression. The experiments cover image classification, detection, and reconstruction, which gives the idea some breadth. The approach stays lightweight and does not require retraining the whole model from scratch.

The soft spot is the complete lack of quantitative results, baselines, or protocol details in the abstract. No accuracy deltas appear, there is no mention of how many generations or what population size was used, and there is no indication that the evolutionary fitness evaluations ran on a validation split kept separate from the final test set. The stress-test concern about overfitting therefore stands: if accuracy on the same data drives selection, the reported gains could shrink or vanish on truly unseen inputs. The citation list also looks thin on recent quantization work, though that is secondary.

This is aimed at engineers who already quantize models for edge devices and want a quick post-processing knob. A reader building compression pipelines might borrow the operator set or parameter choices, but the missing experimental rigor keeps it from being immediately actionable. The work shows clear thinking on the problem and honest engagement with the practical limits of rounding, so it deserves peer review to sort out the data splits and add the missing numbers.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes an evolutionary strategy to fine-tune quantization levels in pretrained and quantized convolutional models (VGG, ResNet, autoencoders) by iteratively shifting a small percentage of weights to alternate quantization states, aiming to improve accuracy beyond standard nearest-neighbor rounding for image classification and detection tasks.

Significance. If the empirical gains hold under rigorous controls, the method could provide a lightweight post-quantization optimizer suitable for IoT and edge deployment. However, the work is presented purely as an empirical optimizer with no parameter-free derivations, machine-checked proofs, or reproducible code artifacts, and the absence of any quantitative results, baselines, or statistical tests in the abstract and described experiments prevents assessment of practical significance.

major comments (2)

[Abstract] Abstract: the central claim that 'proposed evolution ... can fast improve the accuracy of the quantized models' is asserted without any numerical results, baselines, datasets, accuracy metrics (top-1/top-5), or statistical tests, rendering the empirical contribution unverifiable from the provided text.
[Abstract] Abstract and described experiments: the evolutionary fitness signal is accuracy, yet no indication is given that final reported numbers use a completely held-out test set never seen during selection; if validation accuracy drives the search, the reported gains risk overfitting to dataset-specific noise rather than generalizing to unseen data.

minor comments (1)

[Abstract] The abstract references 'popular architectures such as VGG and Resnet' but provides no details on specific variants, quantization bit-widths, or datasets used in the simulations.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the major comments point-by-point below and have made revisions to strengthen the presentation of our empirical results and experimental protocol.

Referee: [Abstract] Abstract: the central claim that 'proposed evolution ... can fast improve the accuracy of the quantized models' is asserted without any numerical results, baselines, datasets, accuracy metrics (top-1/top-5), or statistical tests, rendering the empirical contribution unverifiable from the provided text.

Authors: We agree that the abstract should make the empirical contribution more verifiable at a glance. In the revised manuscript we have updated the abstract to include concrete quantitative results drawn from the experiments (accuracy deltas versus standard nearest-neighbor quantization for VGG, ResNet and autoencoder models on the classification and detection tasks described in the paper), the datasets employed, and the primary metric (top-1 accuracy). We also note that all reported figures are averages over multiple independent runs.

revision: yes

Referee: [Abstract] Abstract and described experiments: the evolutionary fitness signal is accuracy, yet no indication is given that final reported numbers use a completely held-out test set never seen during selection; if validation accuracy drives the search, the reported gains risk overfitting to dataset-specific noise rather than generalizing to unseen data.

Authors: We have added an explicit clarification both in the abstract and in the experimental section: the evolutionary search uses a validation split for the fitness function, while all final accuracy numbers that appear in the paper (and now in the abstract) are measured on a completely held-out test set that is never seen during training, quantization, or evolutionary fine-tuning. This protocol is now stated unambiguously to confirm that the reported gains reflect generalization rather than validation-set overfitting.

revision: yes
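Averaging over independent runs, as the rebuttal describes, pairs naturally with a simple significance check on the per-run accuracy deltas. The sketch below is one possible way to do that with an exact paired sign-flip permutation test; the test choice, the function names, and the delta values are assumptions for illustration and are not results or methods from the paper.

```python
from itertools import product
from statistics import mean, stdev


def summarize(deltas):
    """Mean and sample standard deviation of per-run accuracy deltas."""
    return mean(deltas), stdev(deltas)


def sign_flip_p_value(deltas):
    """Exact two-sided paired sign-flip test on the mean delta.

    Under the null hypothesis (no systematic gain) each delta is equally
    likely to be positive or negative, so every sign pattern is enumerated.
    """
    observed = abs(mean(deltas))
    total = 0
    at_least_as_extreme = 0
    for signs in product((1, -1), repeat=len(deltas)):
        total += 1
        flipped = abs(mean([s * d for s, d in zip(signs, deltas)]))
        if flipped >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / total


# Placeholder deltas in percentage points (made up, NOT from the paper):
# evolved test accuracy minus nearest-neighbor test accuracy, one per run.
deltas = [0.4, 0.6, 0.2, 0.5, 0.3]
avg, spread = summarize(deltas)
p_value = sign_flip_p_value(deltas)
```

With only five runs the smallest achievable two-sided p-value is 2/32, which is a reminder that a handful of runs can show a consistent direction but cannot by itself support a strong statistical claim.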

Circularity Check

0 steps flagged

No circularity: empirical evolutionary optimizer with no derived predictions or self-referential definitions

The paper describes an evolutionary strategy that perturbs a small percentage of quantized weights to alternate levels and reports resulting accuracy gains on VGG, ResNet and autoencoder models. No equations, first-principles derivations, or 'predictions' appear in the abstract or described method. The central claim is an empirical observation that the evolutionary search improves accuracy; it does not reduce any reported quantity to a fitted parameter, self-citation chain, or input by construction. The approach is therefore self-contained as a standard optimizer and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The paper is an empirical optimization study; no mathematical axioms, free parameters fitted inside a derivation, or newly postulated entities are described in the abstract.

pith-pipeline@v0.9.0 · 5496 in / 1030 out tokens · 32907 ms · 2026-05-10T06:21:33.507172+00:00 · methodology

Reference graph

Works this paper leans on

22 extracted references · 22 canonical work pages

[1] M. Al-Hami, M. Pietron, R. Casas, and M. Wielgosz. Methodologies of Compressing a Stable Performance Convolutional Neural Networks in Image Classification. January 2020.

[2] Song Han, Jeff Pool, John Tran, and William Dally. Learning both weights and connections for efficient neural network. 2015.

[3] P. Gysel, M. Motamedi, and S. Ghiasi. Hardware-oriented approximation of convolutional neural networks. arXiv:1604.03168, 2016.

[4] S. Anwar, K. Hwang, and W. Sung. Fixed point optimization of deep convolutional neural networks for object recognition. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 1131–1135, 2015.

[5] M. Pietron, M. Karwatowski, M. Wielgosz, and J. Duda. Fast Compression and Optimization of Deep Learning Models for Natural Language Processing. 2019.

[6] A. Mishra and E. Nurvitadhi. WRPN: Wide reduced-precision networks. ICLR, 2018.

[7] Jongsoo Park, Sheng Li, Wei Wen, Ping Tak Peter Tang, Hai Li, Yiran Chen, and Pradeep Dubey. Faster CNNs with Direct Sparse Convolutions and Guided Pruning. 2016.

[8] Y. Zhang, Z. Y. Dong, W. Kong, and K. Meng. A composite anomaly detection system for data-driven power plant condition monitoring. IEEE Transactions on Industrial Informatics, 2019.

[9] E. Park, J. Ahn, and S. Yoo. Weighted-entropy-based quantization for deep neural networks. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.

[10] Marcin Pietron, Dominik Zurek, and Bartlomiej Sniezynski. Speedup deep learning models on GPU by taking advantage of efficient unstructured pruning and bit-width reduction, volume 67. 2023.

[11] Philipp Gysel. Ristretto: Hardware-oriented approximation of convolutional neural networks. 2016.

[12] Markus Nagel, Rana A. Amjad, Mart van Baalen, Christos Louizos, and Tijmen Blankevoort. Up or down? Adaptive rounding for post-training quantization. Proceedings of ICML, 2020.

[13] D. Zhang, J. Yang, D. Ye, and G. Hua. Learned quantization for highly accurate and compact deep neural networks. arXiv:1807.10029, 2018.

[14] S. Jung, C. Son, S. Lee, J. Son, Y. Kwak, J.J. Han, and C. Choi. Joint training of low-precision neural network with quantization interval parameters. arXiv:1808.05779, 2018.

[15] M. D. McDonnell. Training wide residual networks for deployment using a single bit for each weight. ICLR, 2018.

[16] K. Xu, J. An, D. Zhang, L. Liu, L. Liu, and D. Wang. GenExp: Multi-objective pruning for deep neural network based on genetic algorithm. 2021.

[17] R. Miikkulainen, J. Liang, E. Meyerson, A. Rawal, D. Fink, O. Francon, B. Raju, H. Shahrzad, A. Navruzyan, N. Duffy, and B. Hodjat. Evolving deep neural networks. CoRR abs/1703.00548, Mar 2017.

[18] Daniel Dimanov, Emili Balaguer-Ballester, Colin Singleton, and Shahin Rostami. Moncae: Multi-objective neuroevolution of convolutional autoencoders. ICLR, Neural Architecture Search Workshop, 2021.

[19] Hidehiko Okada. Neuroevolution of autoencoders by genetic algorithm. International Journal of Science and Engineering Investigations, 6:127–131, 2017.

[20] Kamil Faber, Marcin Pietron, and Dominik Zurek. Ensemble neuroevolution-based approach for multivariate time series anomaly detection. Entropy, 23(11), November 2021.

[21] Yuxiang Zhou, Lejian Liao, Yang Gao, Rui Wang, and Heyan Huang. Topicbert: A topic-enhanced neural language model fine-tuned for sentiment classification. IEEE Transactions on Neural Networks and Learning Systems (Early Access), 2021.

[22] Youngmin Ro, Jongwon Choi, Byeongho Heo, and Jin Young Choi. Rollback ensemble with multiple local minima in fine-tuning deep learning networks. IEEE Transactions on Neural Networks and Learning Systems (Early Access), 2021.
