On the Impact of Class Imbalance on the Learning Dynamics of Deep Neural Networks:An Intuitive Insight
Pith reviewed 2026-06-30 12:15 UTC · model grok-4.3
The pith
Class imbalance drives deep neural networks to underfit minority classes early in training while producing non-generalizable representations later.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Experimental monitoring of DNN learning patterns shows that class imbalance has a severe deteriorating impact, driving the model to underfit the minority class samples in the early training epochs while simultaneously learning only the majority class. Although DNN ultimately learns the minority samples, learning in this manner only results in learnt minority representations that are non-generalizable at test phase because they are merely overfitted to keep the overall training loss as low as possible.
What carries the argument
Systematic monitoring of learning patterns on majority versus minority classes in datasets with controlled varying imbalance ratios.
If this is right
- DNNs on imbalanced data will exhibit delayed and ineffective learning of minority classes compared to balanced cases.
- Minority class representations learned under imbalance will show poor test-phase generalization due to loss-driven overfitting rather than pattern capture.
- Standard training without imbalance correction will preferentially acquire majority class knowledge first.
- Imbalance-handling methods must target both the initial underfitting and the subsequent non-generalizable fitting of minority samples.
Where Pith is reading between the lines
- Imbalance-correction techniques may need to act in the earliest epochs to prevent the initial underfitting phase.
- The pattern could be tested on additional model architectures or domains to check if the dynamics are architecture-specific.
- Controlling for dataset size and feature properties in follow-up experiments would strengthen attribution to imbalance alone.
Load-bearing premise
Differences in learning patterns across datasets can be attributed primarily to class imbalance rather than to other factors such as dataset size, feature distributions, or hyperparameter choices.
What would settle it
A controlled experiment showing that minority-class test accuracy remains high and matches training accuracy on imbalanced data without any rebalancing technique would contradict the claim of non-generalizable overfitting.
read the original abstract
Class imbalance in deep neural networks (DNNs) has witnessed a rapid increase in research attention in recent years. However, the varying accounts of the reasons behind the poor performance of DNN on imbalance data in pertinent literature shows that little is known about how this agelong phenomenon impacts the performance of DNNs. A better understanding of this problem is crucial to developing effective DNN-based imbalance methods. Thus, this study systematically investigates the impact of class imbalance on the learning dynamics of DNN by monitoring the learning pattern of DNN models on both the majority and minority classes of datasets of varying imbalance ratios. Experimental findings shows that as against learning from balanced datasets where DNN learns the classes similarly, class imbalance has severe deteriorating impact on the performance of DNN, driving the model to underfit the minority class samples in the early training epochs while simultaneously learning only the majority class. Although DNN ultimately learns the minority samples, learning in this manner only results in learnt minority representations that are non-generalizable at test phase because they are merely overfitted to keep the overall training loss as low as possible.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that class imbalance severely deteriorates DNN performance by driving the model to underfit minority-class samples in early training epochs while learning only the majority class. Although the network eventually learns the minority samples, the resulting representations are non-generalizable at test time because they are merely overfitted to minimize overall training loss. This contrasts with balanced datasets, where classes are learned similarly. The investigation relies on monitoring per-class learning patterns across datasets with varying imbalance ratios.
Significance. If the empirical observations are substantiated with proper isolation of the imbalance ratio and full experimental details, the work could supply useful intuition about why imbalance harms generalization and thereby guide the design of imbalance-handling techniques. The manuscript contains no equations, derivations, machine-checked proofs, or reproducible code, so its contribution rests solely on the quality and controls of the reported experiments.
major comments (2)
- [Abstract] The abstract states experimental findings but supplies no datasets, architectures, training protocols, quantitative metrics, or controls, so it is impossible to verify whether the data actually support the stated claim about underfitting and non-generalizable representations.
- The central claim requires that varying imbalance ratios (while monitoring per-class learning) isolates the effect of the ratio itself. The manuscript does not indicate controls such as subsampling the majority class to hold total sample size N fixed or matching class-conditional distributions across ratios. Without these, early underfitting of the minority class and late overfitting could arise from fewer total examples or shifted data statistics rather than the imbalance ratio.
minor comments (1)
- Grammatical issues: 'Experimental findings shows' should be 'show'. 'agelong phenomenon' is nonstandard; consider 'longstanding phenomenon'.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. The comments highlight important aspects of experimental clarity and controls that we will address to strengthen the manuscript. We respond to each major comment below.
read point-by-point responses
-
Referee: [Abstract] The abstract states experimental findings but supplies no datasets, architectures, training protocols, quantitative metrics, or controls, so it is impossible to verify whether the data actually support the stated claim about underfitting and non-generalizable representations.
Authors: We agree that the abstract, as a high-level summary, omits these specifics and could be improved for standalone readability. In the revision we will add a concise statement noting the primary datasets (CIFAR-10 variants with controlled imbalance ratios), architectures (ResNet family), and metrics (per-class loss and accuracy trajectories). Full protocols remain detailed in Section 3; the change is limited to the abstract. revision: yes
-
Referee: The central claim requires that varying imbalance ratios (while monitoring per-class learning) isolates the effect of the ratio itself. The manuscript does not indicate controls such as subsampling the majority class to hold total sample size N fixed or matching class-conditional distributions across ratios. Without these, early underfitting of the minority class and late overfitting could arise from fewer total examples or shifted data statistics rather than the imbalance ratio.
Authors: The concern about isolating the imbalance ratio is well-founded. Our reported experiments construct imbalance by subsampling minority classes while holding the majority class size fixed, which necessarily alters total N. To directly address this, the revised manuscript will include additional controlled experiments that keep total sample size N constant across imbalance ratios (by also subsampling the majority class) and will explicitly confirm that all variants are drawn from the same underlying class-conditional distributions. These new results will be reported alongside the original findings. revision: yes
Circularity Check
No derivation chain; purely observational with no equations or predictions.
full rationale
The paper reports experimental observations on DNN training dynamics under varying imbalance ratios but contains no equations, derivations, fitted parameters presented as predictions, or self-citation chains supporting a mathematical claim. All load-bearing statements are empirical findings from monitoring per-class learning patterns. No step reduces a claimed result to its inputs by construction. The absence of any formal derivation makes circularity analysis inapplicable; the reader's score of 1.0 is consistent with this.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Learning from imbalanced data: open challenges and future directions,
B. Krawczyk, "Learning from imbalanced data: open challenges and future directions," Progress in Artificial Intelligence, vol. 5, no. 4, pp. 221-232, 2016
2016
-
[2]
Machine learning from imbalanced data sets 101,
F. Provost, "Machine learning from imbalanced data sets 101," in Proceedings of the AAAI’2000 workshop on imbalanced data sets, 2000, vol. 68, no. 2000: AAAI Press, pp. 1-3
2000
-
[3]
Cost -sensitive Prediction of Airline Delays Using Machine Learning,
S. Choi, Y. J. Kim, S. Briceno, D. Mavris, and Ieee, "Cost -sensitive Prediction of Airline Delays Using Machine Learning," in 2017 Ieee/Aiaa 36th Digital Avionics Systems Conference, (IEEE -AIAA Digital Avionics Systems Conference, 2017
2017
-
[4]
A Preliminary Study on Learning Challenges in Machine Learning -based Flight Delay Prediction,
I. B. Mustapha, S. M. Shamsuddin, and S. Hasan, "A Preliminary Study on Learning Challenges in Machine Learning -based Flight Delay Prediction," International Journal of Innovative Computing, vol. 9, no. 1, 2019
2019
-
[5]
Applying Cost- Sensitive Classification for Financial Fraud Detection under High Class-Imbalance,
S. O. Moepya, S. S. Akhoury, and F. V. Nelwamondo, "Applying Cost- Sensitive Classification for Financial Fraud Detection under High Class-Imbalance," in 2014 IEEE International Conference on Data Mining Workshop, 14 -14 Dec. 2014 2014, pp. 183 -192, doi: 10.1109/ICDMW.2014.141
-
[6]
Real -time Credit Card Fraud Detection Using Machine Learning,
A. Thennakoon, C. Bhagyani, S. Premadasa, S. Mihiranga, and N. Kuruwitaarachchi, "Real -time Credit Card Fraud Detection Using Machine Learning," in 2019 9th International Conference on Cloud Computing, Data Science & Engineering (Confluence), 10 -11 Jan. 2019 2019, pp. 488-493, doi: 10.1109/CONFLUENCE.2019.8776942
-
[7]
U. Fiore, A. De Santis, F. Perla, P. Zanetti, and F. Palmieri, "Using generative adversarial networks for improving classification effectiveness in credit card fraud detection," Information Sciences, vol. 479, pp. 448 -455, 2019/04/01/ 2019, doi: https://doi.org/10.1016/j.ins.2017.12.030
-
[8]
C. Arizmendi, D. A. Sierra, A. Vellido, and E. Romero, "Automated classification of brain tumours from short echo time in vivo MRS data using Gaussian Decomposition and Bayesian Neural Networks," Expert Systems with Applications, vol. 41, no. 11, pp. 5296 -5307, 2014/09/01/ 2014, doi: https://doi.org/10.1016/j.eswa.2014.02.031
-
[9]
S. Afzal et al., "A Data Augmentation -Based Framework to Handle Class Imbalance Problem for Alzheimer’s Stage Detection," IEEE Access, vol. 7, pp. 115528 -115539, 2019, doi: 10.1109/ACCESS.2019.2932786
-
[10]
Mining data with rare events: a case study,
C. Seiffert, T. M. Khoshgoftaar, J. Van Hulse, and A. Napolitano, "Mining data with rare events: a case study," in 19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2007), 2007, vol. 2: IEEE, pp. 132-139
2007
-
[11]
Survey on deep learning with class imbalance,
J. M. J. M. Khoshgoftaar, "Survey on deep learning with class imbalance," Journal of Big Data, vol. 6, no. 1, 2019, doi: 10.1186/s40537-019-0192-5
-
[12]
Solving the under-fitting problem for decision tree algorithms by incremental swarm optimization in rare-event healthcare classification,
J. Li, S. Fong, S. Mohammed, J. Fiaidhi, Q. Chen, and Z. Tan, "Solving the under-fitting problem for decision tree algorithms by incremental swarm optimization in rare-event healthcare classification," Journal of Medical Imaging and Health Informatics, vol. 6, no. 4, pp. 1102-1110,
-
[13]
Research Management Center Universiti Teknologi Malaysia
-
[14]
The class imbalance problem: Significance and strategies,
N. Japkowicz, "The class imbalance problem: Significance and strategies," in Proc. of the Int’l Conf. on Artificial Intelligence, 2000, vol. 56: Citeseer, pp. 111-117
2000
-
[15]
On the class overlap problem in imbalanced data classification,
P. Vuttipittayamongkol, E. Elyan, and A. Petrovski, "On the class overlap problem in imbalanced data classification," Knowledge-based systems, vol. 212, p. 106631, 2021
2021
-
[16]
Deep learning: methods and applications,
L. Deng and D. Yu, "Deep learning: methods and applications," Foundations and trends in signal processing, vol. 7, no. 3 –4, pp. 197- 387, 2014
2014
-
[17]
Representation learning: A review and new perspectives,
Y. Bengio, A. Courville, and P. Vincent, "Representation learning: A review and new perspectives," IEEE transactions on pattern analysis and machine intelligence, vol. 35, no. 8, pp. 1798-1828, 2013
2013
-
[18]
Deep neural networks and tabular data: A survey,
V. Borisov, T. Leemann, K. Seßler, J. Haug, M. Pawelczyk, and G. Kasneci, "Deep neural networks and tabular data: A survey," arXiv preprint arXiv:2110.01889, 2021
-
[19]
Goodfellow, Y
I. Goodfellow, Y. Bengio, and A. Courville, Deep learning. MIT press, 2016
2016
-
[20]
A systematic study of the class imbalance problem in convolutional neural networks,
M. Buda, A. Maki, and M. A. Mazurowski, "A systematic study of the class imbalance problem in convolutional neural networks," Neural Networks, vol. 106, pp. 249 -259, 2018/10/01/ 2018, doi: https://doi.org/10.1016/j.neunet.2018.07.011
-
[21]
T. Grósz and I. N. T., "Document Classification with Deep Rectifier Neural Networks and Probabilistic Sampling," in Text, Speech and Dialogue, 2014, doi: 10.1007/978 -3-319-10816-2_14. [Online]. Available: http://link.springer.com/chapter/10.1007/978-3-319-10816- 2_14
work page doi:10.1007/978 2014
-
[22]
Learning Imbalanced Datasets with Label -Distribution-Aware Margin Loss,
K. Cao, C. Wei, A. Gaidon, N. Arechiga, and T. Ma, "Learning Imbalanced Datasets with Label -Distribution-Aware Margin Loss," arXiv preprint arXiv:1906.07413, 2019
-
[23]
Procrustean training for imbalanced deep learning,
H.-J. Ye, D. -C. Zhan, and W. -L. Chao, "Procrustean training for imbalanced deep learning," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 92-102
2021
-
[24]
Investigating Group Distributionally Robust Optimization for Deep Imbalanced Learning: A Case Study of Binary Tabular Data Classification,
I. B. Mustapha, S. Hasan, H. S. Nabbus, M. M. A. Montaser, S. O. Olatunji, and S. M. Shamsuddin, "Investigating Group Distributionally Robust Optimization for Deep Imbalanced Learning: A Case Study of Binary Tabular Data Classification," International Journal of Advanced Computer Science and Applications, vol. 14, no. 2, 2023
2023
-
[25]
Balanced - mixup for highly imbalanced medical image classification,
A. Galdran, G. Carneiro, and M. A. González Ballester, "Balanced - mixup for highly imbalanced medical image classification," in Medical Image Computing and Computer Assisted Intervention –MICCAI 2021: 24th International Conference, Strasbourg, France, Septem ber 27–October 1, 2021, Proceedings, Part V 24, 2021: Springer, pp. 323- 333
2021
-
[26]
Dual Autoencoders Generative Adversarial Network for Imbalanced Classification Problem,
E. Wu, H. Cui, and R. E. Welsch, "Dual Autoencoders Generative Adversarial Network for Imbalanced Classification Problem," IEEE Access, vol. 8, pp. 91265-91275, 2020
2020
-
[27]
H.-J. Ye, H. -Y. Chen, D.-C. Zhan, and W. -L. Chao, "Identifying and compensating for feature deviation in imbalanced deep learning," arXiv preprint arXiv:2001.01385, 2020
-
[28]
Overfitting of neural nets under class imbalance: Analysis and improvements for segmentation,
Z. Li, K. Kamnitsas, and B. Glocker, "Overfitting of neural nets under class imbalance: Analysis and improvements for segmentation," in International Conference on Medical Image Computing and Computer- Assisted Intervention, 2019: Springer, pp. 402-410
2019
-
[29]
Adjusting decision boundary for class imbalanced learning,
B. Kim and J. Kim, "Adjusting decision boundary for class imbalanced learning," IEEE Access, vol. 8, pp. 81674-81685, 2020
2020
-
[30]
Feature transfer learning for face recognition with under -represented data,
X. Yin, X. Yu, K. Sohn, X. Liu, and M. Chandraker, "Feature transfer learning for face recognition with under -represented data," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 5704-5713
2019
-
[31]
Balanced meta -softmax for long -tailed visual recognition,
J. Ren et al., "Balanced meta -softmax for long -tailed visual recognition," arXiv preprint arXiv:2007.10740, 2020
-
[32]
Silhouettes: a graphical aid to the interpretation and validation of cluster analysis,
P. J. Rousseeuw, "Silhouettes: a graphical aid to the interpretation and validation of cluster analysis," Journal of computational and applied mathematics, vol. 20, pp. 53-65, 1987
1987
-
[33]
Deep Learning and Data Sampling with Imbalanced Big Data,
J. M. Johnson and T. M. Khoshgoftaar, "Deep Learning and Data Sampling with Imbalanced Big Data," in 2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI), 30 July -1 Aug. 2019 2019, pp. 175 -183, doi: 10.1109/IRI.2019.00038
-
[34]
Deep MLPs for Imbalanced Classification,
D. Díaz -Vico, A. R. Figueiras -Vidal, and J. R. Dorronsoro, "Deep MLPs for Imbalanced Classification," in 2018 International Joint Conference on Neural Networks (IJCNN), 8-13 July 2018 2018, pp. 1- 7, doi: 10.1109/IJCNN.2018.8489504
-
[35]
Understanding the difficulty of training deep feedforward neural networks,
X. Glorot and Y. Bengio, "Understanding the difficulty of training deep feedforward neural networks," in Proceedings of the thirteenth international conference on artificial intelligence and statistics, 2010: JMLR Workshop and Conference Proceedings, pp. 249-256
2010
-
[36]
Adam: A Method for Stochastic Optimization
D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014
work page internal anchor Pith review Pith/arXiv arXiv 2014
-
[37]
Four equity considerations for the use of artificial intelligence in public health,
M. J. Smith, R. Axler, S. Bean, F. Rudzicz, and J. Shaw, "Four equity considerations for the use of artificial intelligence in public health," Bulletin of the World Health Organization, vol. 98, no. 4, p. 290, 2020
2020
-
[38]
Loss landscapes and optimization in over-parameterized non -linear systems and neural networks,
C. Liu, L. Zhu, and M. Belkin, "Loss landscapes and optimization in over-parameterized non -linear systems and neural networks," arXiv preprint arXiv:2003.00307, 2020
-
[39]
Classification with class imbalance problem: a review,
A. Ali, S. M. Shamsuddin, and A. L. Ralescu, "Classification with class imbalance problem: a review," Int. J. Advance Soft Compu. Appl, vol. 7, no. 3, pp. 176-204, 2015
2015
-
[40]
Class -balanced loss based on effective number of samples,
Y. Cui, M. Jia, T. -Y. Lin, Y. Song, and S. Belongie, "Class -balanced loss based on effective number of samples," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 9268-9277. V. APPENDIX Table A.1 Binary Imbalanced Datasets Data #Instances #Attributes %Majority Class %Minority Class IR SC abalone19 4174 8 99.23 0...
2019
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.