pith. machine review for the scientific record. sign in

arxiv: 2605.11989 · v1 · submitted 2026-05-12 · 💻 cs.CV · cs.AI

Recognition: no theorem link

A Transfer Learning Evaluation of Deep Neural Networks for Image Classification

Nermeen Abou Baker, Nico Zengeler, Uwe Handmann

Authors on Pith no claims yet

Pith reviewed 2026-05-13 07:25 UTC · model grok-4.3

classification 💻 cs.CV cs.AI
keywords transfer learningimage classificationpre-trained modelsfine-tuningImageNetmodel selectionaccuracy metricsdeep neural networks
0
0 comments X

The pith

Refining the output layers of eleven ImageNet-pretrained models on five target datasets reveals which model best matches accuracy, training time, and size needs for image classification.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to show how practitioners can pick the right pre-trained deep network for a new image classification problem instead of training one from scratch. It takes eleven models already trained on ImageNet and adapts only their output layers plus some general parameters to five separate target datasets. The authors then record accuracy, accuracy density, training time, and model size for both single-episode and ten-episode training runs. A reader would care because transfer learning is widely used yet the choice of base model still requires trial and error that consumes time and compute. The comparisons give concrete numbers that let users weigh performance against practical limits such as memory and speed.

Core claim

By refining the output layers and general network parameters of eleven ImageNet-pretrained models and applying them to five different target datasets, the study shows that no single model leads on all four measured metrics and that the best choice depends on the specific accuracy, efficiency, and size requirements of the target domain.

What carries the argument

Fine-tuning the final layers of pre-trained convolutional networks to transfer ImageNet knowledge to new target image datasets, scored by accuracy, accuracy density, training time, and model size.

If this is right

  • Model rankings shift across target domains, so selection must be repeated for each new dataset rather than relying on a fixed hierarchy.
  • Accuracy density lets users trade off raw accuracy against model size when memory is constrained.
  • Large differences in training time let practitioners choose faster models when compute budgets are tight.
  • Model size directly affects deployment feasibility on edge hardware.
  • Ten-episode runs give a stability check that single runs miss.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • A simple lookup table or small decision model could be trained on these metrics to suggest base networks for new datasets without full re-testing.
  • The same evaluation pipeline could be applied to other pre-training sources such as self-supervised or multimodal models.
  • Results hint that future architectures might be designed with explicit accuracy-density and speed targets rather than accuracy alone.
  • Partial fine-tuning of earlier layers, not just the output, might further improve adaptation on domains far from ImageNet.

Load-bearing premise

The performance rankings observed on the five chosen target datasets and the chosen fine-tuning procedure will hold for other unseen image classification tasks.

What would settle it

Running the same eleven models through the identical fine-tuning procedure on a sixth dataset whose images differ markedly in content or statistics, such as medical X-rays, and finding that the top-ranked model by the four metrics changes.

Figures

Figures reproduced from arXiv: 2605.11989 by Nermeen Abou Baker, Nico Zengeler, Uwe Handmann.

Figure 1
Figure 1. Figure 1: Infographic of the tested pre-trained models. Each model is introduced with its architecture symbol, the number of layers between brackets, and design specification (see the color map). 3.3. Datasets A combination of standard datasets was tested, which were: CIFAR10 with 60 K images [40], Modified National Institute of Standards and Technology (MNIST) with 70 K images [41], Hymenoptera [42], and non-standa… view at source ↗
Figure 2
Figure 2. Figure 2: Example of a subset of the smartphone dataset. 3.3.3. Augmented Smartphone Dataset Data augmentation is usually used to increase the volume of the dataset effortlessly. We applied a rotation operation in combination with increasing the noise. For the rotation operation, we rotated by r ∈ {45◦ , 135◦ , 225◦ , 315◦}; for the noise operation, we added noise in percentages p ∈ {10%, 25%, 50%} by adding pixels … view at source ↗
Figure 3
Figure 3. Figure 3: Average accuracy densities with full tuning and tuning the classifier layer only for one episode. 5.2. Ten-Episode Learning We tested 10 independent trials and calculated the average results to avoid any bias, as follows. 5.2.1. Fine-Tuning the Full Layers [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: shows that ten-episodes learning did not affect the order of the models in terms of the accuracy densities compared with the one-episode experiments, and the values were higher to a small degree [PITH_FULL_IMAGE:figures/full_fig_p007_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Model sizes and accuracy vs. training time for all tasks and models after fine-tuning full layers. 70 80 90 100 110 120 Training Time [s] 66 68 70 72 74 76 78 80 82 84 86 Accuracy % Model sizes (Number of parameters x 10 4 ) and accuracy vs training time for all tasks and models after tuning the classification layers only Alex Dense GoogLe Mnas Mobile Res18 ResNext Squeeze Shuffle Vgg16 WideRes50 5457.18 2… view at source ↗
Figure 6
Figure 6. Figure 6: Model sizes and accuracy vs. training time for all tasks and models after fine-tuning the classifier layer only. 6. Conclusions DNNs’ performance has been enhanced over time in many aspects. Nonetheless, there are critical parameters that define which pre-trained model perfectly matches the application requirements. In this paper, we presented a comprehensive evaluation of eleven popular pre-trained models… view at source ↗
read the original abstract

Transfer learning is a machine learning technique that uses previously acquired knowledge from a source domain to enhance learning in a target domain by reusing learned weights. This technique is ubiquitous because of its great advantages in achieving high performance while saving training time, memory, and effort in network design. In this paper, we investigate how to select the best pre-trained model that meets the target domain requirements for image classification tasks. In our study, we refined the output layers and general network parameters to apply the knowledge of eleven image processing models, pre-trained on ImageNet, to five different target domain datasets. We measured the accuracy, accuracy density, training time, and model size to evaluate the pre-trained models both in training sessions in one episode and with ten episodes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 3 minor

Summary. The paper conducts an empirical evaluation of transfer learning for image classification by fine-tuning eleven ImageNet-pretrained models on five target datasets. It refines output layers and general parameters, then compares the models using accuracy, accuracy density, training time, and model size across single-episode and ten-episode runs to inform model selection for target domains.

Significance. If the measurements hold, the multi-metric comparison (including efficiency proxies like accuracy density and training time) could offer practical guidance for selecting pretrained models in computer vision applications. The repeated-episode design adds a modest robustness check over single-run results. However, the narrow scope of five datasets and one fine-tuning protocol limits broader applicability.

major comments (3)
  1. [§3 (Methodology)] §3 (Methodology): The fine-tuning protocol is described only at a high level ('refined the output layers and general network parameters') with no details on optimizer, learning-rate schedule, batch size, data augmentation, or which layers were frozen versus updated. These omissions are load-bearing because training time and final accuracy are directly sensitive to them.
  2. [Results tables] Results tables (e.g., Tables 2–4): Accuracy differences between models are reported without standard deviations across runs or any statistical significance tests. This undermines the central claim that the measurements reliably indicate the 'best' model for a target domain.
  3. [§4.2 (Metrics)] §4.2 (Metrics): The definition and exact formula for 'accuracy density' are not provided, so it is impossible to verify whether it normalizes accuracy by model size, parameter count, or another quantity.
minor comments (3)
  1. [Abstract] Abstract and §1: The phrasing 'refined the output layers' should be replaced with standard terminology ('fine-tuned') for consistency with the transfer-learning literature.
  2. [Related Work] Related-work section: Add citations to established transfer-learning benchmarks (e.g., Yosinski et al. 2014, Razavian et al. 2014) to situate the five-dataset evaluation.
  3. [Figures] Figure captions: Add explicit axis labels and units (e.g., 'training time in seconds') to improve readability.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments and suggestions. We address each of the major comments below and have made revisions to the manuscript where necessary to improve clarity and rigor.

read point-by-point responses
  1. Referee: [§3 (Methodology)] The fine-tuning protocol is described only at a high level ('refined the output layers and general network parameters') with no details on optimizer, learning-rate schedule, batch size, data augmentation, or which layers were frozen versus updated. These omissions are load-bearing because training time and final accuracy are directly sensitive to them.

    Authors: We agree that the methodology requires more detail for reproducibility. In the revised manuscript we have expanded §3 to specify the exact protocol used: Adam optimizer, initial learning rate 0.001 with step decay, batch size 32, random horizontal flips and rotations for augmentation, and fine-tuning of all layers (no freezing). These settings were applied uniformly across the reported experiments. revision: yes

  2. Referee: Results tables (e.g., Tables 2–4): Accuracy differences between models are reported without standard deviations across runs or any statistical significance tests. This undermines the central claim that the measurements reliably indicate the 'best' model for a target domain.

    Authors: We accept that the absence of variability measures weakens the reliability of the comparisons. The revised tables now report standard deviations computed over the ten episodes. We have also added paired t-test p-values between the top models to indicate whether accuracy differences are statistically significant. revision: yes

  3. Referee: [§4.2 (Metrics)] The definition and exact formula for 'accuracy density' are not provided, so it is impossible to verify whether it normalizes accuracy by model size, parameter count, or another quantity.

    Authors: We apologize for the missing definition. Section 4.2 has been revised to state that accuracy density is accuracy divided by model size in megabytes, with the explicit formula accuracy_density = accuracy / model_size_MB. This normalizes performance by storage footprint as originally intended. revision: yes

Circularity Check

0 steps flagged

No significant circularity in empirical evaluation

full rationale

The paper performs a standard empirical comparison by fine-tuning output layers and parameters of 11 ImageNet-pretrained models on five target datasets, then directly reporting observed values for accuracy, accuracy density, training time, and model size across single and ten-episode runs. No equations, derivations, fitted parameters renamed as predictions, or self-citations are used to justify any load-bearing claim. The central investigation is scoped as an evaluation on the chosen datasets and procedure, with results presented as direct observations rather than general rules derived from prior quantities. The work is self-contained against external benchmarks with no reduction of outputs to inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is an empirical evaluation paper. No mathematical derivations, free parameters, axioms, or invented entities are present in the central claim.

pith-pipeline@v0.9.0 · 5422 in / 944 out tokens · 52039 ms · 2026-05-13T07:25:19.681332+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

44 extracted references · 44 canonical work pages · 4 internal anchors

  1. [1]

    An overview of deep learning in medical imaging focusing on MRI.Z

    Lundervold, A.S.; Lundervold, A. An overview of deep learning in medical imaging focusing on MRI.Z. Fur Med. Phys.2019, 29, 102–127, doi:10.1016/j.zemedi.2018.11.002

  2. [2]

    Convolutional Neural Network for Remote-Sensing Scene Classification: Transfer Learning Analysis.Remote Sens.2020,12, 86, doi:10.3390/rs12010086

    Pires de Lima, R.; Marfurt, K. Convolutional Neural Network for Remote-Sensing Scene Classification: Transfer Learning Analysis.Remote Sens.2020,12, 86, doi:10.3390/rs12010086

  3. [3]

    Transfer Learning for Classification of Optical Satellite Image.Sens

    Zou, M.; Zhong, Y. Transfer Learning for Classification of Optical Satellite Image.Sens. Imaging2018,19, doi:10.1007/s11220-018- 0191-1

  4. [4]

    Feature-fusion transfer learning method as a basis to support automated smartphone recycling in a circular smart city

    Abou Baker, N.; Szabo-Müller, P .; Handmann, U. Feature-fusion transfer learning method as a basis to support automated smartphone recycling in a circular smart city. In Proceedings of the EAI S-CUBE 2020—11th EAI International Conference on Sensor Systems and Software, Aalborg, Denmark, 10–11 December 2020

  5. [5]

    Parameter- Efficient Transfer Learning for NLP .arXiv2019, arXiv:1902.00751

    Houlsby, N.; Giurgiu, A.; Jastrzebski, S.; Morrone, B.; de Laroussilhe, Q.; Gesmundo, A.; Attariyan, M.; Gelly, S. Parameter- Efficient Transfer Learning for NLP .arXiv2019, arXiv:1902.00751

  6. [6]

    The Real-Time Mobile Application for Classifying of Endangered Parrot Species Using the CNN Models Based on Transfer Learning.Mob

    Choe, D.; Choi, E.; Kim, D.K. The Real-Time Mobile Application for Classifying of Endangered Parrot Species Using the CNN Models Based on Transfer Learning.Mob. Inf. Syst.2020,2020, 1–13, doi:10.1155/2020/1475164

  7. [7]

    Ismail Fawaz, G

    Ismail Fawaz, H.; Forestier, G.; Weber, J.; Idoumghar, L.; Muller, P .A. Transfer learning for time series classification. In Proceedings of the 2018 IEEE International Conference on Big Data (Big Data), Seattle, WA, USA, 10–13 December 2018; doi:10.1109/bigdata.2018.8621990

  8. [8]

    An Analysis of Deep Neural Network Models for Practical Applications.arXiv2017, arXiv:1605.07678

    Canziani, A.; Paszke, A.; Culurciello, E. An Analysis of Deep Neural Network Models for Practical Applications.arXiv2017, arXiv:1605.07678

  9. [9]

    Benchmark Analysis of Representative Deep Neural Network Architectures

    Bianco, S.; Cadene, R.; Celona, L.; Napoletano, P . Benchmark Analysis of Representative Deep Neural Network Architectures. IEEE Access2018,6, 64270–64277, doi:10.1109/access.2018.2877890

  10. [10]

    Zero-Shot Learning Through Cross-Modal Transfer

    Socher, R.; Ganjoo, M.; Sridhar, H.; Bastani, O.; Manning, C.D.; Ng, A.Y. Zero-Shot Learning Through Cross-Modal Transfer. arXiv2013, arXiv:1301.3666

  11. [11]

    Zero-Shot Learning—The Good, the Bad and the Ugly.arXiv2020, arXiv:1703.04394

    Xian, Y.; Schiele, B.; Akata, Z. Zero-Shot Learning—The Good, the Bad and the Ugly.arXiv2020, arXiv:1703.04394

  12. [12]

    Attribute-Based Classification for Zero-Shot Visual Object Categorization.IEEE T rans

    Lampert, C.H.; Nickisch, H.; Harmeling, S. Attribute-Based Classification for Zero-Shot Visual Object Categorization.IEEE T rans. Pattern Anal. Mach. Intell.2014,36, 453–465, doi:10.1109/TPAMI.2013.140

  13. [13]

    Zero-Shot Learning via Semantic Similarity Embedding.arXiv2015, arXiv:1509.04767

    Zhang, Z.; Saligrama, V . Zero-Shot Learning via Semantic Similarity Embedding.arXiv2015, arXiv:1509.04767

  14. [14]

    Label-Embedding for Image Classification.IEEE T rans

    Akata, Z.; Perronnin, F.; Harchaoui, Z.; Schmid, C. Label-Embedding for Image Classification.IEEE T rans. Pattern Anal. Mach. Intell.2016,38, 1425–1438, doi:10.1109/tpami.2015.2487986

  15. [15]

    Cross-generalization: Learning novel classes from a single example by feature replacement

    Bart, E.; Ullman, S. Cross-generalization: Learning novel classes from a single example by feature replacement. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 20–26 June 2005; Volume 1, pp. 672–679, doi:10.1109/CVPR.2005.117

  16. [16]

    Object Classification from a Single Example Utilizing Class Relevance Metrics

    Fink, M. Object Classification from a Single Example Utilizing Class Relevance Metrics. InAdvances in Neural Information Processing Systems; Saul, L., Weiss, Y., Bottou, L., Eds.; MIT Press: Cambridge, MA, USA, 2005; Volume 17

  17. [17]

    The More You Know, the Less You Learn: From Knowledge Transfer to One-shot Learning of Object Categories

    Tommasi, T.; Caputo, B. The More You Know, the Less You Learn: From Knowledge Transfer to One-shot Learning of Object Categories. In Proceedings of the BMVC, 2009. Available online: http://www.bmva.org/bmvc/2009/Papers/Paper353/Paper3 53.html (accessed on 30 November 2021)

  18. [18]

    Generalizing from a Few Examples: A Survey on Few-Shot Learning.ACM Comput

    Wang, Y.; Yao, Q.; Kwok, J.T.; Ni, L.M. Generalizing from a Few Examples: A Survey on Few-Shot Learning.ACM Comput. Surv. 2020,53, doi:10.1145/3386252

  19. [19]

    Multi-Content GAN for Few-Shot Font Style Transfer.arXiv 2017, arXiv:1712.00516

    Azadi, S.; Fisher, M.; Kim, V .; Wang, Z.; Shechtman, E.; Darrell, T. Multi-Content GAN for Few-Shot Font Style Transfer.arXiv 2017, arXiv:1712.00516

  20. [20]

    Feature Space Transfer for Data Augmentation.arXiv2019, arXiv:1801.04356

    Liu, B.; Wang, X.; Dixit, M.; Kwitt, R.; Vasconcelos, N. Feature Space Transfer for Data Augmentation.arXiv2019, arXiv:1801.04356

  21. [21]

    Label Efficient Learning of Transferable Representations across Domains and Tasks.arXiv 2017, arXiv:1712.00123

    Luo, Z.; Zou, Y.; Hoffman, J.; Fei-Fei, L. Label Efficient Learning of Transferable Representations across Domains and Tasks.arXiv 2017, arXiv:1712.00123

  22. [22]

    Transfer Learning with PipNet: For Automated Visual Analysis of Piping Design

    Tan, W.C.; Chen, I.M.; Pantazis, D.; Pan, S.J. Transfer Learning with PipNet: For Automated Visual Analysis of Piping Design. In Proceedings of the 2018 IEEE 14th International Conference on Automation Science and Engineering (CASE), Munich, Germany, 20–24 August 2018; pp. 1296–1301, doi:10.1109/COASE.2018.8560550

  23. [23]

    On the Number of Linear Regions of Deep Neural Networks.arXiv2014, arXiv:1402.1869

    Montúfar, G.; Pascanu, R.; Cho, K.; Bengio, Y. On the Number of Linear Regions of Deep Neural Networks.arXiv2014, arXiv:1402.1869

  24. [24]

    Effect of Depth and Width on Local Minima in Deep Learning.Neural Comput.2019, 31, 1462–1498, doi:10.1162/neco_a_01195

    Kawaguchi, K.; Huang, J.; Kaelbling, L.P . Effect of Depth and Width on Local Minima in Deep Learning.Neural Comput.2019, 31, 1462–1498, doi:10.1162/neco_a_01195

  25. [25]

    A survey of the recent architectures of deep convolutional neural networks.Artif

    Khan, A.; Sohail, A.; Zahoora, U.; Qureshi, A.S. A survey of the recent architectures of deep convolutional neural networks.Artif. Intell. Rev.2020,53, 5455–5516, doi:10.1007/s10462-020-09825-6

  26. [26]

    The Vanishing Gradient Problem during Learning Recurrent Neural Nets and Problem Solutions.Int

    Hochreiter, S. The Vanishing Gradient Problem during Learning Recurrent Neural Nets and Problem Solutions.Int. J. Uncertain. Fuzziness Knowl.-Based Syst.1998,6, 107–116, doi:10.1142/S0218488598000094

  27. [27]

    Highway Networks

    Srivastava, R.K.; Greff, K.; Schmidhuber, J. Highway Networks.arXiv2015, arXiv:1505.00387

  28. [28]

    Squeeze-and-Excitation Networks

    Hu, J.; Shen, L.; Sun, G. Squeeze-and-Excitation Networks. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 7132–7141, doi:10.1109/CVPR.2018.00745. 21 of 21

  29. [29]

    ImageNet Classification with Deep Convolutional Neural Networks.Adv

    Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks.Adv. Neural Inf. Process. Syst.2012,25, 1097–1105

  30. [30]

    Very Deep Convolutional Networks for Large-Scale Image Recognition

    Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition.arXiv2014, arXiv:1409.1556

  31. [31]

    Going deeper with convolutions

    Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P .; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V .; Rabinovich, A. Going deeper with convolutions. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1–9, doi:10.1109/CVPR.2015.7298594

  32. [32]

    Deep Residual Learning for Image Recognition

    He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition.arXiv2015, arXiv:1512.03385

  33. [33]

    SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size.arXiv2016, arXiv:1602.07360

    Iandola, F.N.; Han, S.; Moskewicz, M.W.; Ashraf, K.; Dally, W.J.; Keutzer, K. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size.arXiv2016, arXiv:1602.07360

  34. [34]

    In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) , pp

    Xie, S.; Girshick, R.; Dollar, P .; Tu, Z.; He, K. Aggregated Residual Transformations for Deep Neural Networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 5987–5995, doi:10.1109/CVPR.2017.634

  35. [35]

    Densely Connected Convolutional Networks

    Huang, G.; Liu, Z.; van der Maaten, L.; Weinberger, K.Q. Densely Connected Convolutional Networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2261–2269, doi:10.1109/CVPR.2017.243

  36. [36]

    MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

    Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications.arXiv2017, arXiv:1704.04861

  37. [37]

    Wide Residual Networks

    Zagoruyko, S.; Komodakis, N. Wide Residual Networks.arXiv2017, arXiv:1605.07146

  38. [38]

    ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices.arXiv 2017, arXiv:1707.01083

    Zhang, X.; Zhou, X.; Lin, M.; Sun, J. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices.arXiv 2017, arXiv:1707.01083

  39. [39]

    MnasNet: Platform-Aware Neural Architecture Search for Mobile.arXiv2019, arXiv:1807.11626

    Tan, M.; Chen, B.; Pang, R.; Vasudevan, V .; Sandler, M.; Howard, A.; Le, Q.V . MnasNet: Platform-Aware Neural Architecture Search for Mobile.arXiv2019, arXiv:1807.11626

  40. [40]

    A Study of the Optimization Algorithms in Deep Learning

    Zaheer, R.; Shaziya, H. A Study of the Optimization Algorithms in Deep Learning. In Proceedings of the 2019 Third International Conference on Inventive Systems and Control (ICISC), Coimbatore, India, 10–11 January 2019; pp. 536–539, doi:10.1109/ICISC44355.2019.9036442

  41. [41]

    A Comparison of Quantized Convolutional and LSTM Recurrent Neural Network Models Using MNIST

    Kaziha, O.; Bonny, T. A Comparison of Quantized Convolutional and LSTM Recurrent Neural Network Models Using MNIST. In Proceedings of the 2019 International Conference on Electrical and Computing Technologies and Applications (ICECTA), Ras Al Khaimah, United Arab Emirates, 19–21 November 2019; pp. 1–5, doi:10.1109/ICECTA48151.2019.8959793

  42. [42]

    PyTorch: An Imperative Style, High-Performance Deep Learning Library.Adv

    Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library.Adv. Neural Inf. Process. Syst.2019,32, 8024–8035

  43. [43]

    Transfer learning-based method for automated e-waste recycling in smart cities

    Baker, N.A.; Szabo-Mýller, P .; Handmann, U. Transfer learning-based method for automated e-waste recycling in smart cities. EAI Endorsed T rans. Smart Cities2021,5, doi:10.4108/eai.16-4-2021.169337

  44. [44]

    Review of Image Classification Algorithms Based on Convolutional Neural Networks.Remote Sens.2021,13, 4712, doi:10.3390/rs13224712

    Chen, L.; Li, S.; Bai, Q.; Yang, J.; Jiang, S.; Miao, Y. Review of Image Classification Algorithms Based on Convolutional Neural Networks.Remote Sens.2021,13, 4712, doi:10.3390/rs13224712