Are Data Augmentation and Segmentation Always Necessary? Insights from COVID-19 X-Rays and a Methodology Thereof
Pith reviewed 2026-05-07 10:58 UTC · model grok-4.3
The pith
Lung segmentation is required for reliable COVID-19 detection in chest X-rays, while excessive data augmentation causes overfitting and accuracy loss.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Careful analysis of X-ray images and their corresponding heat maps under expert medical supervision reveals that lung segmentation is necessary for accurate COVID-19 prediction. Test accuracy significantly drops beyond a certain threshold with additional augmented images, indicating model overfitting. The proposed SDL-COVID methodology achieves a precision of 95.21% and a lower false negative rate, ensuring its reliability for COVID-19 detection using chest X-rays.
What carries the argument
Class activation mapping (CAM) to generate heatmaps that visualize CNN attention regions on lung areas, paired with side-by-side training on augmented and non-augmented datasets to locate the overfitting threshold in the SDL-COVID pipeline.
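The CAM computation behind those heatmaps is simple enough to sketch directly. A minimal NumPy version (toy feature maps and weights, not the paper's actual CNN) weights the final convolutional feature maps by the classifier weights for the target class:

```python
import numpy as np

def class_activation_map(feature_maps, fc_weights, class_idx):
    """CAM: M_c(x, y) = sum_k w[c, k] * f_k(x, y), rectified and normalised.

    feature_maps: (K, H, W) activations from the last convolutional layer
    fc_weights:   (num_classes, K) weights of the final linear layer
    class_idx:    index of the class being explained
    """
    cam = np.tensordot(fc_weights[class_idx], feature_maps, axes=1)  # (H, W)
    cam = np.maximum(cam, 0.0)          # keep only positive evidence
    if cam.max() > 0:
        cam = cam / cam.max()           # scale to [0, 1] for heatmap overlay
    return cam

# Toy example: two 4x4 feature maps, two classes.
fmaps = np.stack([np.eye(4), np.ones((4, 4))])
weights = np.array([[1.0, 0.0],   # class 0 attends to the diagonal map
                    [0.0, 1.0]])  # class 1 attends to the uniform map
heatmap = class_activation_map(fmaps, weights, class_idx=0)
```

The resulting map is what an expert then inspects: attention concentrated inside the lung fields supports the prediction; attention on markers, borders, or devices flags a shortcut.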
If this is right
- Skipping lung segmentation leaves CNNs free to base COVID-19 predictions on non-lung image features visible in heatmaps.
- Augmenting the dataset past an optimal point lowers test accuracy, showing overfitting in medical X-ray classification.
- SDL-COVID reaches 95.21% precision with fewer false negatives than unoptimized approaches.
- Expert validation of activation maps is required to confirm that models attend to the correct anatomical structures.
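Precision and false-negative rate fall straight out of the confusion-matrix counts. A minimal sketch, using hypothetical counts chosen only so precision lands near the reported figure (they are not taken from the paper):

```python
def precision_and_fnr(tp, fp, fn):
    """Precision = TP / (TP + FP); false-negative rate = FN / (FN + TP)."""
    precision = tp / (tp + fp)
    fnr = fn / (fn + tp)
    return precision, fnr

# Hypothetical confusion counts, picked so precision comes out near 95.21%;
# NOT the paper's actual confusion matrix.
precision, fnr = precision_and_fnr(tp=179, fp=9, fn=6)
```

A false negative (a missed COVID-19 case) is the costlier error in screening, which is why the pith pairs precision with the false-negative rate rather than reporting accuracy alone.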
Where Pith is reading between the lines
- CAM-based checks could be used for other chest X-ray tasks such as pneumonia detection to decide when segmentation is required.
- Augmentation levels should be tested empirically for each medical imaging dataset rather than assumed to have a universal safe limit.
- Embedding expert heatmap review into model pipelines may increase clinical acceptance of AI tools for respiratory diagnostics.
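Testing augmentation levels empirically amounts to a simple sweep: train at each level, record test accuracy, and keep the level at which accuracy peaks before it starts to fall. A minimal sketch, assuming the per-level accuracies have already been measured (the numbers below are hypothetical):

```python
def best_augmentation_level(levels, test_accuracies):
    """Return the augmentation level with peak test accuracy.

    Accuracy past this level is expected to fall if further augmentation
    pushes the model toward overfitting on synthetic variation.
    """
    peak = max(range(len(test_accuracies)), key=test_accuracies.__getitem__)
    return levels[peak]

# Hypothetical sweep: dataset-size multipliers vs measured test accuracy.
levels = [1, 2, 3, 4, 5]
accuracies = [0.88, 0.91, 0.93, 0.90, 0.86]   # rises, peaks, then drops
chosen = best_augmentation_level(levels, accuracies)
```

The point of the sweep is that the peak is dataset-specific, so the safe amount of augmentation has to be found empirically each time rather than assumed.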
Load-bearing premise
Expert-reviewed class activation maps definitively show that all accurate models must focus on lung regions, and that accuracy drops with more augmentation are caused only by overfitting rather than dataset size or model choices.
What would settle it
A CNN trained on unsegmented COVID-19 X-rays that reaches high test accuracy while its class activation maps show attention mainly outside the lungs, or a model where unlimited augmentation continues to raise or hold test accuracy without overfitting indicators.
Original abstract
Purpose: Rapid and reliable diagnostic tools are crucial for managing respiratory diseases like COVID-19, where chest X-ray analysis coupled with artificial intelligence techniques has proven invaluable. However, most existing works on X-ray images have not considered lung segmentation, raising concerns about their reliability. Additionally, some have employed disproportionate and impractical augmentation techniques, making models less generalized and prone to overfitting. This study presents a critical analysis of both issues and proposes a methodology (SDL-COVID) for more reliable classification of chest X-rays for COVID-19 detection. Methods: We use class activation mapping to obtain a visual understanding of the predictions made by Convolutional Neural Networks (CNNs), validating the necessity of lung segmentation. To analyze the effect of data augmentation, deep learning models are implemented on two levels: one for an augmented dataset and another for a non-augmented dataset. Results: Careful analysis of X-ray images and their corresponding heat maps under expert medical supervision reveals that lung segmentation is necessary for accurate COVID-19 prediction. Regarding data augmentation, test accuracy significantly drops beyond a certain threshold with additional augmented images, indicating model overfitting. Conclusion: Our proposed methodology, SDL-COVID, achieves a precision of 95.21% and a lower false negative rate, ensuring its reliability for COVID-19 detection using chest X-rays.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that expert-reviewed class activation maps demonstrate the necessity of lung segmentation for accurate COVID-19 X-ray classification, that test accuracy declines with excessive data augmentation due to overfitting, and that the proposed SDL-COVID methodology achieves 95.21% precision with a lower false-negative rate.
Significance. If the necessity and overfitting claims were supported by controlled ablations, the work would usefully caution against unexamined preprocessing in medical imaging pipelines and highlight the value of expert oversight on model attention maps. The reported precision figure, if reproducible, would indicate a competitive baseline for COVID-19 detection.
Major comments (3)
- Abstract and Results: the conclusion that 'lung segmentation is necessary' rests on expert inspection of CAM heatmaps, yet no ablation is described that trains identical CNNs on raw versus explicitly segmented images while holding data splits, hyperparameters, and augmentation fixed; without this counterfactual, the necessity claim remains interpretive rather than demonstrated.
- Results: the statement that 'test accuracy significantly drops beyond a certain threshold with additional augmented images' is attributed to overfitting, but no training/validation curves, learning-rate schedules, or statistical tests (e.g., paired t-test on accuracy differences) are referenced to rule out dataset-specific effects or augmentation realism issues.
- Methods: quantitative details on dataset sizes, sources, train/test splits, exact CNN architectures, augmentation parameters (e.g., rotation range, intensity thresholds), and the concrete components of SDL-COVID are absent, preventing verification of the reported 95.21% precision and lower false-negative rate.
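The statistical check the report asks for is straightforward to compute from per-fold accuracies. A self-contained sketch with hypothetical fold results (not the paper's data):

```python
import math

def paired_t_statistic(xs, ys):
    """Paired t statistic on matched samples (e.g. per-fold test accuracies)."""
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)   # sample variance
    return mean / math.sqrt(var / n)

# Hypothetical per-fold test accuracies under moderate vs heavy augmentation.
moderate = [0.93, 0.94, 0.92, 0.95, 0.93]
heavy = [0.89, 0.91, 0.88, 0.92, 0.90]
t_stat = paired_t_statistic(moderate, heavy)
# Compare |t_stat| against the t distribution with n - 1 degrees of freedom.
```

Pairing by fold removes between-fold variance, so even a small, consistent accuracy gap yields a large t statistic, which is exactly the kind of evidence that would separate genuine overfitting from run-to-run noise.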
Minor comments (1)
- Abstract: the phrase 'two levels' for the augmentation experiments is undefined; clarifying what these levels consist of (e.g., specific augmentation counts or policies) would improve readability.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address each major point below and commit to revisions where the manuscript was incomplete.
Point-by-point responses
-
Referee: Abstract and Results: the conclusion that 'lung segmentation is necessary' rests on expert inspection of CAM heatmaps, yet no ablation is described that trains identical CNNs on raw versus explicitly segmented images while holding data splits, hyperparameters, and augmentation fixed; without this counterfactual, the necessity claim remains interpretive rather than demonstrated.
Authors: We agree that the current evidence is interpretive, relying on expert-reviewed CAMs that show non-segmented models attending to extraneous regions. A direct ablation with identical CNNs, fixed splits, hyperparameters, and augmentation would strengthen the necessity claim. We will add this controlled comparison in the revised manuscript. revision: yes
-
Referee: Results: the statement that 'test accuracy significantly drops beyond a certain threshold with additional augmented images' is attributed to overfitting, but no training/validation curves, learning-rate schedules, or statistical tests (e.g., paired t-test on accuracy differences) are referenced to rule out dataset-specific effects or augmentation realism issues.
Authors: The observed accuracy decline with excessive augmentation was noted across our experiments. To better support the overfitting interpretation and exclude confounds, we will include training/validation curves, learning-rate details, and statistical tests such as paired t-tests on accuracy differences in the revised Results. revision: yes
-
Referee: Methods: quantitative details on dataset sizes, sources, train/test splits, exact CNN architectures, augmentation parameters (e.g., rotation range, intensity thresholds), and the concrete components of SDL-COVID are absent, preventing verification of the reported 95.21% precision and lower false-negative rate.
Authors: We acknowledge these details were omitted. The revised Methods section will provide full quantitative information on dataset sizes, sources, splits, CNN architectures, augmentation parameters, and the specific components of SDL-COVID to enable reproducibility and verification of the 95.21% precision result. revision: yes
Circularity Check
No circularity: empirical claims rest on experiments, not self-referential definitions or fitted predictions
Full rationale
The paper's core claims, that lung segmentation is necessary (based on CAM heatmaps under expert review) and that excessive augmentation causes overfitting, are presented as outcomes of direct experimental comparisons (augmented versus non-augmented datasets) and visual analysis, not as derivations, equations, or parameter fits that reduce to their own inputs. No mathematical modeling, uniqueness theorems, or self-citations appear as load-bearing steps in the abstract or methodology description, and the SDL-COVID performance numbers are reported as measured results. The evidential chain therefore rests on external benchmarks rather than closing on itself by construction.
Reference graph
Works this paper leans on
- [1] Swaraj, A., Verma, K., Kaur, A., Singh, G., Kumar, A., & de Sales, L. M. (2021). Implementation of stacking based ARIMA model for prediction of Covid-19 cases in India. Journal of Biomedical Informatics, 121, 103887.
- [2] Lee, E. Y., Ng, M. Y., & Khong, P. L. (2020). COVID-19 pneumonia: what has CT taught us? The Lancet Infectious Diseases, 20(4), 384-385.
- [3] Ker, J., Wang, L., Rao, J., & Lim, T. (2017). Deep learning applications in medical image analysis. IEEE Access, 6, 9375-9389.
- [4] Moujahid, H., Cherradi, B., Al-Sarem, M., Bahatti, L., Eljialy, B. A., Alsaeedi, A., & Saeed, F. (2021). Combining CNN and Grad-CAM for COVID-19 Disease Prediction and Visual Explanation. Intelligent Automation & Soft Computing, 32(2), 723-745.
- [5] Islam, M. Z., Islam, M. M., & Asraf, A. (2020). A combined deep CNN-LSTM network for the detection of novel coronavirus (COVID-19) using X-ray images. Informatics in Medicine Unlocked, 20, 100412.
- [6] Khalifa, N. E. M., Taha, M. H. N., Hassanien, A. E., & Elghamrawy, S. (2020). Detection of coronavirus (COVID-19) associated pneumonia based on generative adversarial networks and a fine-tuned deep transfer learning model using chest X-ray dataset. arXiv preprint arXiv:2004.01184.
- [7] Ahmed, S., Hossain, T., Hoque, O. B., Sarker, S., Rahman, S., & Shah, F. M. (2021). Automated COVID-19 detection from chest X-ray images: a high-resolution network (HRNet) approach. SN Computer Science, 2(4), 1-17.
- [8] Rahman, T., Khandakar, A., Qiblawey, Y., Tahir, A., Kiranyaz, S., Kashem, S. B. A., ... & Chowdhury, M. E. (2021). Exploring the effect of image enhancement techniques on COVID-19 detection using chest X-ray images. Computers in Biology and Medicine, 132, 104319.
- [9] Morís, D. I., de Moura Ramos, J. J., Buján, J. N., & Hortas, M. O. (2021). Data augmentation approaches using cycle-consistent adversarial networks for improving COVID-19 screening in portable chest X-ray images. Expert Systems with Applications, 185, 115681.
- [10] Wang, L., Lin, Z. Q., & Wong, A. (2020). COVID-Net: A tailored deep convolutional neural network design for detection of COVID-19 cases from chest X-ray images. Scientific Reports, 10(1), 1-12.
- [11] Narin, A., Kaya, C., & Pamuk, Z. (2021). Automatic detection of coronavirus disease (COVID-19) using X-ray images and deep convolutional neural networks. Pattern Analysis and Applications, 24(3), 1207-1220.
- [12] Abed, M., Mohammed, K. H., Abdulkareem, G. Z., Begonya, M., Salama, A., Maashi, M. S., ... & Mutlag, L. (2021). A comprehensive investigation of machine learning feature extraction and classification methods for automated diagnosis of COVID-19 based on X-ray images. Computers, Materials & Continua, 3289-3310.
- [13] Teixeira, L. O., Pereira, R. M., Bertolini, D., Oliveira, L. S., Nanni, L., Cavalcanti, G. D., & Costa, Y. M. (2021). Impact of lung segmentation on the diagnosis and explanation of COVID-19 in chest X-ray images. Sensors, 21(21), 7116.
- [14] Waheed, A., Goyal, M., Gupta, D., Khanna, A., Al-Turjman, F., & Pinheiro, P. R. (2020). CovidGAN: data augmentation using auxiliary classifier GAN for improved COVID-19 detection. IEEE Access, 8, 91916-91923.
- [15] Masadeh, M., Masadeh, A., Alshorman, O., Khasawneh, F. H., & Masadeh, M. A. (2022). An efficient machine learning-based COVID-19 identification utilizing chest X-ray images. IAES International Journal of Artificial Intelligence, 11(1), 356.
- [16] Maguolo, G., & Nanni, L. (2021). A critic evaluation of methods for COVID-19 automatic detection from X-ray images. Information Fusion, 76, 1-7.
- [17] Bhadouria, H. S., Kumar, K., Swaraj, A., Verma, K., Kaur, A., Sharma, S., ... & de Sales, L. M. (2021). Classification of COVID-19 on chest X-ray images using Deep Learning model with Histogram Equalization and Lungs Segmentation. arXiv preprint arXiv:2112.02478.
- [18] DeGrave, A. J., Janizek, J. D., & Lee, S.-I. (2021). AI for radiographic COVID-19 detection selects shortcuts over signal. Nature Machine Intelligence, 3, 610-619. https://doi.org/10.1038/s42256-021-00338-7
- [19] Roberts, M., Driggs, D., Thorpe, M., et al. (2021). Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans. Nature Machine Intelligence, 3, 199-217. https://doi.org/10.1038/s42256-021-00307-0
- [20] Tabik, S., Gómez-Ríos, A., Martín-Rodríguez, J. L., Sevillano-García, I., Rey-Area, M., Charte, D., ... & Herrera, F. (2020). COVIDGR dataset and COVID-SDNet methodology for predicting COVID-19 based on chest X-ray images. IEEE Journal of Biomedical and Health Informatics, 24(12), 3595-3605.
- [21] Sadre, R., Sundaram, B., Majumdar, S., & Ushizima, D. (2021). Validating deep learning inference during chest X-ray classification for COVID-19 screening. Scientific Reports, 11(1), 1-10.
- [22] Fang, Z., Zhao, H., Ren, J., Maclellan, C., Xia, Y., Li, S., ... & Ren, K. (2022). SC2Net: A novel segmentation-based classification network for detection of COVID-19 in chest X-ray images. IEEE Journal of Biomedical and Health Informatics.
- [23] Yang, D., Martinez, C., Visuña, L., Khandhar, H., Bhatt, C., & Carretero, J. (2021). Detection and analysis of COVID-19 in medical images using deep learning techniques. Scientific Reports, 11(1), 1-13.
- [24] Tang, S., et al. (2021). EDL-COVID: Ensemble deep learning for COVID-19 case detection from chest X-ray images. IEEE Transactions on Industrial Informatics, 17(9), 6539-6549. doi:10.1109/TII.2021.3057683
- [25] Ucar, F., & Korkmaz, D. (2020). COVIDiagnosis-Net: Deep Bayes-SqueezeNet based diagnosis of the coronavirus disease 2019 (COVID-19) from X-ray images. Medical Hypotheses, 140, 109761.
- [26] Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 234-241). Springer, Cham.
- [27] Zhao, W., Zhong, Z., Xie, X., Yu, Q., & Liu, J. (2020). Relation between chest CT findings and clinical conditions of coronavirus disease (COVID-19) pneumonia: a multicenter study. AJR American Journal of Roentgenology, 214(5), 1072-1077.
- [28] Yasin, R., & Gouda, W. (2020). Chest X-ray findings monitoring COVID-19 disease course and severity. Egyptian Journal of Radiology and Nuclear Medicine, 51(1), 1-18.
- [29] He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770-778).
- [30] Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., ... & Rabinovich, A. (2015). Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1-9).
- [31] Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25.
- [32] Iandola, F. N., Han, S., Moskewicz, M. W., Ashraf, K., Dally, W. J., & Keutzer, K. (2016). SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size. arXiv preprint arXiv:1602.07360.
- [33] Deng, J., Dong, W., Socher, R., Li, L. J., Li, K., & Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition (pp. 248-255). IEEE.
- [34] Arrieta, A. B., Díaz-Rodríguez, N., Del Ser, J., Bennetot, A., Tabik, S., Barbado, A., ... & Herrera, F. (2020). Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion, 58, 82-115.