
arxiv: 2606.29715 · v1 · pith:G36VIKFBnew · submitted 2026-06-29 · 💻 cs.CV

Accurate Recognition of Pneumonia and COVID-19 by Geometric Shape Normalization of Lung Region using Automatic Landmark Detection and Piecewise Affine Warping

Pith reviewed 2026-06-30 06:47 UTC · model grok-4.3

classification 💻 cs.CV
keywords: chest X-ray, COVID-19 detection, pneumonia classification, landmark detection, geometric normalization, piecewise affine warping, lung alignment

The pith

Geometric normalization of lung shapes in chest X-rays yields COVID-19 and pneumonia classifiers that outperform artifact-masked, unaligned baselines and appear to depend less on acquisition artifacts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that automatically detecting 15 lung-contour landmarks and warping each image to a standard lung shape produces inputs that let a ResNet-18 classifier reach 98.60 percent accuracy on the COVID-19 Radiography Database. This approach outperforms simple artifact masking or cropping on both that database and a mixed adult-pediatric set. A sympathetic reader would care because the method aims to make disease recognition depend more on lung pathology and less on how the radiograph was acquired. The results indicate that the normalized representation is more controlled and artifact-resistant than unaligned alternatives.

Core claim

A ResNet-18 detector locates 15 lung-contour landmarks with 3.61-pixel mean error; these landmarks drive Generalized Procrustes Analysis, Delaunay triangulation, and piecewise affine warping that maps every lung region to one fixed shape. Classifiers trained on the resulting normalized images achieve 98.60 percent accuracy and 98.00 percent F1-macro on the COVID-19 Radiography Database under five-fold cross-validation and outperform artifact-masked unaligned images on both the original database (98.60 percent versus 96.24 percent) and a balanced adult-pediatric collection (94.67 percent versus 94.17 percent).
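The Generalized Procrustes Analysis step named above can be illustrated with a minimal sketch. It assumes each lung shape is a list of 2D landmarks stored as complex numbers (x + iy), which gives the optimal rotation a closed form. The function names, iteration count, and tolerance are illustrative choices, not taken from the paper.

```python
import cmath
import math

def center_and_scale(shape):
    """Remove translation and scale: zero centroid, unit norm."""
    centroid = sum(shape) / len(shape)
    centered = [z - centroid for z in shape]
    norm = math.sqrt(sum(abs(z) ** 2 for z in centered))
    return [z / norm for z in centered]

def rotate_to(shape, reference):
    """Rotate `shape` to best match `reference` in the least-squares sense."""
    # With complex coordinates, the optimal rotation is the unit complex
    # number pointing along sum(reference * conj(shape)).
    inner = sum(r * s.conjugate() for r, s in zip(reference, shape))
    rotation = inner / abs(inner)  # assumes the shapes are not degenerate
    return [rotation * z for z in shape]

def generalized_procrustes(shapes, iterations=10, tol=1e-10):
    """Iteratively align all shapes to their evolving mean shape."""
    aligned = [center_and_scale(s) for s in shapes]
    mean = aligned[0]  # arbitrary initial reference
    for _ in range(iterations):
        aligned = [rotate_to(s, mean) for s in aligned]
        averaged = [sum(points) / len(aligned) for points in zip(*aligned)]
        new_mean = center_and_scale(averaged)
        change = sum(abs(a - b) ** 2 for a, b in zip(new_mean, mean))
        mean = new_mean
        if change < tol:
            break
    return mean, aligned
```

The returned mean plays the role of the single fixed shape that every lung region is later warped onto; how the paper initializes or terminates the iteration is not stated in this summary.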

What carries the argument

Piecewise affine warping driven by 15 automatically detected lung-contour landmarks, which standardizes lung shape while preserving disease-relevant features for the downstream classifier.
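A rough sketch of how piecewise affine warping maps coordinates: each point of the standard shape is located inside one mesh triangle, expressed in barycentric coordinates, and the same weights are applied to the matching triangle in the source radiograph. The helper names and the `eps` tolerance are hypothetical. A full implementation would run this for every output pixel and interpolate source intensities.

```python
def barycentric(p, tri):
    """Barycentric weights of point p with respect to triangle (a, b, c)."""
    (ax, ay), (bx, by), (cx, cy) = tri
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    w_a = ((by - cy) * (p[0] - cx) + (cx - bx) * (p[1] - cy)) / det
    w_b = ((cy - ay) * (p[0] - cx) + (ax - cx) * (p[1] - cy)) / det
    return w_a, w_b, 1.0 - w_a - w_b

def map_point(p, dst_tri, src_tri):
    """Carry p from the destination triangle to the source triangle."""
    weights = barycentric(p, dst_tri)
    x = sum(w * v[0] for w, v in zip(weights, src_tri))
    y = sum(w * v[1] for w, v in zip(weights, src_tri))
    return x, y

def warp_point(p, dst_pts, src_pts, triangles, eps=1e-9):
    """Find the mesh triangle containing p and return its source location.

    dst_pts:   landmarks of the standard shape
    src_pts:   landmarks detected on the input radiograph
    triangles: index triples, e.g. from a Delaunay triangulation of dst_pts
    """
    for i, j, k in triangles:
        dst_tri = (dst_pts[i], dst_pts[j], dst_pts[k])
        if all(w >= -eps for w in barycentric(p, dst_tri)):
            src_tri = (src_pts[i], src_pts[j], src_pts[k])
            return map_point(p, dst_tri, src_tri)
    return None  # p lies outside the landmark mesh
```

Because the mapping is affine inside each triangle and shared vertices map identically, the warp is continuous across triangle edges, which is one reason the technique is a common choice for landmark-driven shape normalization.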

If this is right

  • Normalized images yield higher accuracy than artifact-masked or cropped unaligned versions on the COVID-19 Radiography Database.
  • The same advantage appears on a balanced dataset that mixes adult and pediatric cases.
  • Grad-CAM visualizations indicate the classifier relies less on acquisition artifacts when given normalized inputs.
  • The landmark detector reaches 3.61-pixel mean error via an ensemble of four ResNet-18 models with test-time augmentation.
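The ensemble with test-time augmentation mentioned in the last bullet might look roughly like the sketch below. The summary does not say which augmentations the paper uses, so horizontal flipping is an assumption here. `models` are hypothetical callables that take an image (a list of pixel rows) and return a list of (x, y) landmarks, and `flip_perm` is a hypothetical index table that swaps left-lung and right-lung landmarks after mirroring.

```python
def hflip_image(image):
    """Mirror an image (list of pixel rows) left to right."""
    return [row[::-1] for row in image]

def unflip_landmarks(points, width, flip_perm):
    """Map landmarks predicted on a mirrored image back to the original."""
    mirrored = [(width - 1 - x, y) for x, y in points]
    # Mirroring swaps anatomical sides, so landmark i of the original image
    # is found at index flip_perm[i] of the mirrored prediction.
    return [mirrored[flip_perm[i]] for i in range(len(points))]

def ensemble_tta(models, image, flip_perm):
    """Average landmark predictions over all models and both orientations."""
    width = len(image[0])
    predictions = []
    for model in models:
        predictions.append(model(image))
        flipped_prediction = model(hflip_image(image))
        predictions.append(unflip_landmarks(flipped_prediction, width, flip_perm))
    count = len(predictions)
    n_landmarks = len(predictions[0])
    return [
        (
            sum(pred[i][0] for pred in predictions) / count,
            sum(pred[i][1] for pred in predictions) / count,
        )
        for i in range(n_landmarks)
    ]
```

Averaging coordinates is only one way to combine predictions; a median or a confidence-weighted mean would be equally plausible, and the summary does not state which the authors chose.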

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same landmark-and-warp pipeline could be tested on other chest X-ray tasks such as tuberculosis or cardiomegaly detection to check whether shape standardization helps more broadly.
  • If landmark detection error grows on lower-quality scans, the performance gain from normalization may shrink or reverse.
  • Combining this geometric step with intensity-based augmentations might further stabilize classification across different scanners.

Load-bearing premise

The 15 predicted landmarks are accurate enough that the warping step removes shape variation without distorting the disease features the classifier needs.

What would settle it

A new test set in which all acquisition artifacts have already been removed by other means, showing that normalized images no longer outperform unaligned ones, would falsify the claim that normalization supplies an artifact-resistant representation.

Figures

Figures reproduced from arXiv: 2606.29715 by Aldrin Barreto-Flores, Lauro Reyes-Cocoletzi, Rafael Alejandro Cruz-Ovando, Salvador E. Ayala-Raggi.

Figure 1. Evaluation pipeline: input radiograph (299 …
Figure 2. Landmark prediction model architecture ( …
Figure 3. GPA process: (a) original unaligned shapes, (b) centered and scaled configurations, (c) rotation-aligned …
Figure 4. Delaunay triangulation on the standard shape. The 15 landmarks (red points) form a mesh of 16 triangles.
Figure 5. Geometric normalization examples. Top: original radiographs showing variations in size, position, and …
Figure 6. Prediction error by landmark. (a) Mean error for each point L1–L15; dashed line indicates global mean (3.61 …
Figure 7. Confusion matrix of the Warped + SAHS classifier obtained from the aggregated out-of-fold predictions of …
Figure 8. ROC curves (one-vs-rest): (a) COVID-19 vs. rest (AUC=0.995), (b) Normal vs. rest (AUC=0.993), (c) Viral …
Figure 9. Model learning curves, including the evolution of the loss function and accuracy during the training and …
Figure 10. Confusion matrix of the first repeated split of the Warped + SAHS balanced validation dataset (Accuracy: …
Figure 11. Confusion matrix of the first repeated split of the Warped + SAHS balanced test dataset (Accuracy: 94.67%, …
Figure 12. Grad-CAM visualization comparing original (left) and warped (right) images for COVID-19, Normal, and …
Figure 13. Representative failure cases showing characteristic misclassification patterns. Top: COVID-19 cases …
Original abstract

This paper presents an automatic system for recognizing pulmonary diseases in chest X-rays using geometric normalization of the lung region. The method combines three modules: (1) a ResNet-18 landmark detector with coordinate attention that predicts 15 lung-contour landmarks, achieving a mean localization error of 3.61 pixels through an ensemble of four models with test-time augmentation; (2) a geometric normalizer based on Generalized Procrustes Analysis, Delaunay triangulation, and piecewise affine warping to map each lung region to a standardized shape; and (3) a ResNet-18 classifier with transfer learning and SAHS contrast enhancement to classify images as COVID-19, Viral Pneumonia, or Normal. On the COVID-19 Radiography Database, the normalized-image classifier achieved 98.60 ± 0.26% accuracy and 98.00% F1-Macro using five-fold cross-validation. Although original images produced slightly higher raw accuracy, Grad-CAM and cropping experiments suggest that this advantage is partly influenced by acquisition artifacts. In contrast, geometrically normalized images outperformed artifact-masked/cropped unaligned images on both the COVID-19 Radiography Database (98.60% vs. 96.24%) and a balanced adult-pediatric mixed dataset including pediatric cases from the Kermany dataset (94.67% vs. 94.17%). These results suggest that anatomical alignment can provide a more controlled and artifact-resistant representation for pulmonary disease recognition.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript describes an automatic system for pulmonary disease recognition in chest X-rays that detects 15 lung-contour landmarks using an ensemble of ResNet-18 models with coordinate attention (mean error 3.61 pixels), applies Generalized Procrustes Analysis, Delaunay triangulation, and piecewise affine warping for geometric normalization of the lung region, and then uses a ResNet-18 classifier with SAHS contrast enhancement. On the COVID-19 Radiography Database, normalized images achieve 98.60% accuracy in 5-fold CV, outperforming artifact-masked unaligned images (96.24%), with similar gains on a mixed adult-pediatric dataset (94.67% vs 94.17%), suggesting improved artifact resistance through anatomical alignment.

Significance. If the central assumption holds, the work shows that landmark-driven geometric normalization can yield more controlled representations for CXR classification that are less influenced by acquisition artifacts and shape variation, with empirical support from public datasets and cross-validation that includes pediatric cases. This could inform future efforts in standardizing anatomical alignment for pulmonary disease detection.

major comments (1)
  1. [Abstract / geometric normalizer description] The central claim that normalized images provide a more artifact-resistant representation rests on the premise that 15 contour landmarks (mean error 3.61 pixels) plus piecewise affine warping preserve disease-relevant internal features such as opacities and consolidations. No quantitative check (e.g., sensitivity analysis on landmark error or pre/post-warping texture/feature comparison) is reported to substantiate that the residual localization error leaves these signals unchanged.
minor comments (1)
  1. [Results] Accuracy differences are given with 5-fold CV standard deviations but without statistical significance testing (e.g., paired tests on the per-fold results) to establish whether the reported gains over masked/cropped baselines are reliable.
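The paired test suggested in this comment could be sketched as follows, assuming per-fold accuracies are available for both pipelines on the same five folds. The function name is illustrative, the paper's per-fold scores are not given in this summary, and the numbers in the usage note are synthetic placeholders.

```python
import math
import statistics

# Two-sided 5% critical value of Student's t with 4 degrees of freedom,
# which is the relevant threshold for five paired folds.
T_CRIT_DF4 = 2.776

def paired_t_statistic(scores_a, scores_b):
    """t statistic of the per-fold differences between two pipelines."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    # Raises ZeroDivisionError if every fold shows the same difference.
    return mean_diff / (sd_diff / math.sqrt(len(diffs)))

def is_significant(scores_a, scores_b, critical_value=T_CRIT_DF4):
    """True when the absolute t statistic exceeds the critical value."""
    return abs(paired_t_statistic(scores_a, scores_b)) > critical_value
```

With synthetic scores such as `[0.90, 0.92, 0.91, 0.93, 0.92]` against `[0.88, 0.91, 0.90, 0.90, 0.91]`, the differences are compared with the df = 4 threshold. Cross-validation folds share training data, so the independence assumption behind the test holds only approximately, and a result near the threshold should be read cautiously.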

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive comment on our manuscript. We address the major point below and will strengthen the validation of feature preservation in the revision.

Point-by-point responses
  1. Referee: [Abstract / geometric normalizer description] The central claim that normalized images provide a more artifact-resistant representation rests on the premise that 15 contour landmarks (mean error 3.61 pixels) plus piecewise affine warping preserve disease-relevant internal features such as opacities and consolidations. No quantitative check (e.g., sensitivity analysis on landmark error or pre/post-warping texture/feature comparison) is reported to substantiate that the residual localization error leaves these signals unchanged.

    Authors: We agree that a direct quantitative check on internal feature preservation would strengthen the central claim. The manuscript currently provides indirect support via classification gains over artifact-masked baselines and Grad-CAM visualizations, but does not include sensitivity analysis on landmark error or pre/post-warping texture comparisons. In the revised manuscript we will add: (1) a sensitivity study perturbing the 15 landmarks within the observed 3.61-pixel mean error and measuring impact on classifier logits and Grad-CAM heatmaps; (2) quantitative texture/feature comparisons (e.g., local entropy, contrast, and Haralick features) computed on disease-relevant regions before and after warping. These additions will be reported in a new subsection of the experiments.

    Revision: yes
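The landmark perturbation in the proposed sensitivity study could be sketched as below. The noise model is an assumption: isotropic Gaussian jitter whose mean radial displacement matches the reported 3.61 pixels. For such noise the displacement length follows a Rayleigh distribution with mean sigma * sqrt(pi / 2), which fixes sigma. The function names are illustrative, and real detector errors probably differ between landmarks.

```python
import math
import random

def sigma_for_mean_radial_error(mean_error_px):
    """Per-axis Gaussian sigma giving the requested mean radial error."""
    # The mean of a Rayleigh distribution is sigma * sqrt(pi / 2).
    return mean_error_px / math.sqrt(math.pi / 2)

def perturb_landmarks(landmarks, mean_error_px, rng):
    """Return a jittered copy of the landmarks (assumed noise model)."""
    sigma = sigma_for_mean_radial_error(mean_error_px)
    return [
        (x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
        for x, y in landmarks
    ]

def mean_radial_error(predicted, reference):
    """Mean Euclidean distance between matching landmarks."""
    distances = [
        math.hypot(px - rx, py - ry)
        for (px, py), (rx, ry) in zip(predicted, reference)
    ]
    return sum(distances) / len(distances)
```

Each perturbed landmark set would then be passed through the warping step and the classifier, and the resulting logits compared with those from the unperturbed landmarks. Passing a seeded `random.Random` instance as `rng` keeps such a study reproducible.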

Circularity Check

0 steps flagged

No significant circularity; empirical comparisons are independent of inputs

The paper reports measured landmark localization error (3.61 px) from a ResNet-18 ensemble, applies standard GPA + Delaunay + piecewise affine warping, and evaluates classification accuracy on held-out public datasets via five-fold cross-validation. The central result (normalized images outperforming masked/cropped baselines at 98.60% vs 96.24% and 94.67% vs 94.17%) is a direct empirical comparison with no reduction of any claimed prediction to a fitted parameter, self-definition, or self-citation chain. All steps remain falsifiable against external data and do not collapse by construction.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The approach relies on standard computer-vision primitives and neural-network architectures without new postulated entities. The primary domain assumption is that a fixed set of 15 landmarks suffices to capture lung shape for normalization purposes.

free parameters (1)
  • Number of landmarks
    The choice of exactly 15 lung-contour landmarks is stated without derivation from first principles or exhaustive search; it is treated as given for the warping step.
axioms (1)
  • Domain assumption: A set of 15 landmarks on the lung contour is sufficient to represent shape variation for the purpose of piecewise affine normalization.
    Invoked when the landmark detector output is fed into Generalized Procrustes Analysis and Delaunay triangulation to produce the standardized image.
