arxiv: 2605.08265 · v1 · submitted 2026-05-08 · 🧮 math.GM


CrackMorph-XAI-Net: A Topology-Preserving and Explainable Framework for Automated Crack Morphology

Sri Surya Pravallika Ajjarapu, S. M. Mallikarjunaiah


Pith reviewed 2026-05-12 01:03 UTC · model grok-4.3

classification 🧮 math.GM

keywords: crack morphology, skeleton extraction, junction detection, topology preservation, explainable framework, infrastructure monitoring, CRACK500, image analysis


The pith

CrackMorph-XAI-Net converts crack images into measurable morphological features like centerlines, junctions, and topology via a four-stage pipeline.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces CrackMorph-XAI-Net as a framework that moves beyond binary crack segmentation masks to deliver structural details required for engineering interpretation. It processes images through four stages that produce topology-preserving skeletons, junction locations, quantitative descriptors, and severity screening. The authors extend the CRACK500 dataset with aligned annotations for skeletons, junctions, and topology to enable stage-wise testing. Experiments report a 0.991 mean Dice score for skeleton extraction with topology preserved in 98.5 percent of test cases, plus 0.964 recall and 0.887 F1 for junctions, and descriptor correlations above 0.95. This would let automated inspection systems report crack length, width, branching, and tortuosity in forms directly usable for maintenance decisions.
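To make the descriptor vocabulary concrete, here is a minimal sketch of how length, tortuosity, and orientation could be computed from an ordered centerline. The paper's exact definitions are not given in this summary, so the formulas below are assumptions: tortuosity is taken as arc length over endpoint distance, and orientation as the angle of the endpoint chord. The function names and the sample points are hypothetical.

```python
import math

def polyline_length(points):
    """Sum of Euclidean segment lengths along an ordered centerline."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def tortuosity(points):
    """Arc length divided by straight-line endpoint distance (assumed definition)."""
    chord = math.dist(points[0], points[-1])
    if chord == 0:
        raise ValueError("closed or degenerate path: chord length is zero")
    return polyline_length(points) / chord

def orientation_deg(points):
    """Angle of the endpoint-to-endpoint chord, folded into [0, 180)."""
    (x0, y0), (x1, y1) = points[0], points[-1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0)) % 180.0

# Hypothetical centerline in pixel coordinates (x, y): an L-shaped path.
centerline = [(0, 0), (3, 0), (3, 4)]
length = polyline_length(centerline)   # 3 + 4 = 7
tort = tortuosity(centerline)          # 7 / 5 = 1.4
angle = orientation_deg(centerline)    # chord direction in degrees
```

A straight crack gives a tortuosity of exactly 1 under this definition, so larger values indicate a more winding path.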

Core claim

CrackMorph-XAI-Net is an explainable morphology-aware framework that converts crack image and region-mask data into interpretable structural outputs through four stages: topology-preserving skeleton extraction, junction detection via Gaussian heatmap regression, morphology descriptor computation, and severity-oriented screening. On the extended CRACK500 benchmark the learned skeleton stage reaches a mean Dice coefficient of 0.991 with topology preserved in 98.5 percent of test images, junction detection attains 0.964 recall and 0.887 F1-score, and predicted morphology values correlate above 0.95 with reference values for length, width, orientation, junction count, and tortuosity.
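The two skeleton metrics measure different things, which a small sketch can illustrate. The Dice formula below is the standard one. The topology check is an assumption: this summary does not say how the paper decides that topology is "preserved", so the sketch uses the simplest proxy, an equal count of 8-connected components. The grids are hypothetical.

```python
from collections import deque

def dice(pred, ref):
    """Dice = 2 * |A and B| / (|A| + |B|) for two equal-size binary grids."""
    inter = sum(p & r for pr, rr in zip(pred, ref) for p, r in zip(pr, rr))
    total = sum(map(sum, pred)) + sum(map(sum, ref))
    return 1.0 if total == 0 else 2.0 * inter / total

def count_components(grid):
    """Number of 8-connected foreground components, found by breadth-first search."""
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    n = 0
    for r in range(h):
        for c in range(w):
            if grid[r][c] and not seen[r][c]:
                n += 1
                seen[r][c] = True
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and grid[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
    return n

# Hypothetical reference skeleton and a prediction with one missing pixel.
ref = [[0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1],
       [0, 0, 0, 0, 0]]
pred = [[0, 0, 0, 0, 0],
        [1, 1, 0, 1, 1],
        [0, 0, 0, 0, 0]]

score = dice(pred, ref)                                         # 2 * 4 / (4 + 5)
same_topology = count_components(pred) == count_components(ref)  # 2 vs 1
```

A single dropped pixel leaves the overlap score high but splits the crack into two pieces, which is why a topology figure is reported alongside Dice.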

What carries the argument

Four-stage pipeline that first learns a topology-preserving skeleton from the input mask, then regresses junction locations as Gaussian heatmaps, computes scalar descriptors, and applies severity screening.
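As an illustration of the junction stage's target format, the sketch below renders point annotations as a Gaussian heatmap and recovers them as local maxima. It is not the authors' implementation: the value of `sigma`, the max-combination of overlapping Gaussians, the 0.5 threshold, and the strict-maximum decoding rule are all assumptions chosen for the example.

```python
import math

def gaussian_heatmap(height, width, junctions, sigma=1.5):
    """Render (row, col) junction points as a max-combined Gaussian heatmap."""
    heat = [[0.0] * width for _ in range(height)]
    for jr, jc in junctions:
        for r in range(height):
            for c in range(width):
                d2 = (r - jr) ** 2 + (c - jc) ** 2
                heat[r][c] = max(heat[r][c], math.exp(-d2 / (2.0 * sigma ** 2)))
    return heat

def peaks(heat, threshold=0.5):
    """Return pixels at or above threshold that exceed all in-grid 8-neighbours."""
    h, w = len(heat), len(heat[0])
    found = []
    for r in range(h):
        for c in range(w):
            v = heat[r][c]
            if v < threshold:
                continue
            neighbours = [heat[r + dr][c + dc]
                          for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                          if (dr or dc) and 0 <= r + dr < h and 0 <= c + dc < w]
            if all(v > n for n in neighbours):
                found.append((r, c))
    return found

# Hypothetical junction annotations on a 9 x 10 image.
target = gaussian_heatmap(9, 10, [(2, 2), (6, 7)])
decoded = peaks(target)
```

Regressing a smooth heatmap rather than a single positive pixel gives the network a denser training signal for a sparse target, which is the usual motivation for this encoding.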

Load-bearing premise

The manually created skeleton, junction, and topology annotations added to CRACK500 accurately represent real-world crack variability and the four trained stages generalize without major domain shift.

What would settle it

Running the model on a fresh crack image collection from different materials, lighting, or cameras and observing skeleton Dice below 0.95 or topology preservation below 90 percent would falsify the generalization claim.
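The test above reduces to two threshold checks, sketched here. The thresholds 0.95 and 0.90 come from the text; the function name and the per-image numbers are hypothetical and are not results from the paper.

```python
def settles_against(per_image_dice, topology_ok, dice_floor=0.95, topo_floor=0.90):
    """Return (refuted, mean_dice, topo_rate) for a held-out image collection."""
    mean_dice = sum(per_image_dice) / len(per_image_dice)
    topo_rate = sum(topology_ok) / len(topology_ok)
    refuted = mean_dice < dice_floor or topo_rate < topo_floor
    return refuted, mean_dice, topo_rate

# Hypothetical per-image results on an external dataset (illustrative only).
dice_scores = [0.97, 0.96, 0.94, 0.98]
topology_ok = [True, True, False, True]
refuted, mean_dice, topo_rate = settles_against(dice_scores, topology_ok)
```

In this made-up case the mean Dice stays above its floor while the topology rate falls to 0.75, so the claim would count as refuted on the topology criterion alone.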

Figures

Figures reproduced from arXiv: 2605.08265 by S. M. Mallikarjunaiah, Sri Surya Pravallika Ajjarapu.

**Figure 2.** Overall architectural structure of the proposed morphology-aware crack analysis…

**Figure 3.** Deterministic computation of morphology descriptors leveraging the predicted skele…

**Figure 4.** Representative structural overlay and morphology summary for a test instance. The…

**Figure 5.** A comprehensive visualization of the sequential pipeline, transitioning from raw…

**Figure 6.** Qualitative failure case analysis. The transparent nature of the pipeline allows en…

Original abstract

Automated crack inspection is increasingly recognized as a critical component of infrastructure monitoring; however, cracks continue to be reported primarily as binary segmentation masks by many current vision-based systems. While localization is facilitated by such masks, limited structural information is provided for robust engineering interpretation. For practical crack assessment, measurable morphological features, including centerline geometry, branching behavior, junction locations, topology, and severity-related indicators, are required. In this work, CrackMorph-XAI-Net, an explainable morphology-aware framework for image-based crack analysis, is presented. Crack image and region-mask data are converted into a sequence of interpretable structural outputs through four distinct stages: topology-preserving skeleton extraction, junction detection via Gaussian heatmap regression, morphology descriptor computation, and severity-oriented screening. To support rigorous stage-wise evaluation, the standard CRACK500 benchmark is extended with aligned skeleton maps, junction heatmaps, and topology labels. Experimental validation demonstrates that a mean Dice coefficient of 0.991 is achieved by the learned skeleton extraction stage, with topology preserved in 98.5% of test images. Furthermore, a recall of 0.964 and an F1-score of 0.887 are obtained in the junction detection stage, highlighting the efficacy of heatmap regression for sparse structural targets. Strong agreement between predicted and reference morphology values is revealed by descriptor-level evaluation, with correlations exceeding 0.95 for length, width, orientation, junction count, and tortuosity.
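The abstract does not say which correlation coefficient is used for the descriptor comparison. Assuming it is Pearson's r, a minimal sketch looks like this; the crack lengths are hypothetical values, not data from the paper.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical reference and predicted crack lengths (pixels).
reference = [10.0, 20.0, 30.0, 40.0]
predicted = [11.0, 19.0, 31.0, 39.0]
r_close = pearson_r(reference, predicted)

# A prediction that is systematically doubled and offset still correlates perfectly.
biased = [2.0 * x + 5.0 for x in reference]
r_biased = pearson_r(reference, biased)
```

Because Pearson's r is unchanged by scaling and offset, a high correlation alone does not rule out systematic bias. An absolute error measure reported next to it would close that gap.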

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper gives a practical four-stage pipeline for turning crack images into morphology descriptors, but its headline metrics rest on unverified manual extensions to CRACK500.


The new piece here is the end-to-end setup: topology-preserving skeleton extraction, Gaussian heatmap regression for junctions, descriptor calculation, and severity screening, all trained on an extended CRACK500 set. That specific combination is not in the earlier crack papers the abstract cites, and it produces outputs that engineers can actually use—length, width, tortuosity, junction count—rather than just another binary mask.

Referee Report

3 major / 2 minor

Summary. The manuscript presents CrackMorph-XAI-Net, a four-stage explainable framework for automated crack morphology analysis from images. It converts crack images and region masks into interpretable outputs via topology-preserving skeleton extraction, junction detection through Gaussian heatmap regression, morphology descriptor computation (length, width, orientation, junction count, tortuosity), and severity-oriented screening. The CRACK500 benchmark is extended with aligned skeleton maps, junction heatmaps, and topology labels to support stage-wise evaluation. Reported results include a mean Dice coefficient of 0.991 for skeleton extraction (with topology preserved in 98.5% of test images), recall of 0.964 and F1-score of 0.887 for junction detection, and correlations exceeding 0.95 between predicted and reference morphology descriptors.
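The summary gives recall and F1 for junction detection but not precision. If F1 is the harmonic mean of a single pooled precision and recall (an assumption; the identity does not hold if the metrics were averaged per image), precision follows from rearranging F1 = 2PR / (P + R) into P = F1 · R / (2R − F1). The function name is hypothetical.

```python
def implied_precision(recall, f1):
    """Solve F1 = 2PR / (P + R) for P, given recall R and F1."""
    denom = 2.0 * recall - f1
    if denom <= 0:
        raise ValueError("recall and F1 are mutually inconsistent")
    return f1 * recall / denom

# Reported junction-detection figures.
precision = implied_precision(recall=0.964, f1=0.887)

# Round-trip check: recompute F1 from the implied precision.
f1_check = 2.0 * precision * 0.964 / (precision + 0.964)
```

Under that assumption the implied precision is roughly 0.82, so the detector would be producing noticeably more false positives than misses.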

Significance. If the results hold under rigorous validation, the work could meaningfully advance infrastructure monitoring by moving beyond binary crack segmentation to provide measurable, topology-aware structural features useful for engineering severity assessment. The emphasis on topology preservation and the multi-stage pipeline for interpretable outputs addresses a practical gap in current vision systems. The high descriptor correlations suggest potential utility if annotation quality and generalization are confirmed.

major comments (3)

[Abstract / Experimental validation] The central performance claims (Dice 0.991 for skeleton extraction, 98.5% topology preservation, junction recall 0.964 / F1 0.887, descriptor correlations >0.95) are evaluated exclusively against manually extended CRACK500 annotations for skeletons, junctions, and topology. No inter-annotator agreement metrics, annotation protocol details, or consistency checks are referenced, which is load-bearing because skeleton and junction tasks are sparse and boundary-sensitive; small labeling variations can inflate metrics when models are trained and tested on the same annotation style.
[Dataset extension and evaluation] The extension of CRACK500 is used to create the reference skeleton maps, junction heatmaps, and topology labels that underpin all quantitative results, yet no details on generation process, potential biases, or validation against real-world crack variability are provided. This directly affects the reliability of the reported stage-wise metrics and the assumption that the four-stage models will generalize without major domain shift.
[Results / Descriptor-level evaluation] Strong agreement (correlations >0.95) is claimed for morphology descriptors, but the manuscript summary provides no baseline comparisons to existing crack skeletonization or junction detection methods, no ablation studies on the pipeline stages, and no error analysis or standard deviations. This makes it difficult to determine whether the framework's contributions are incremental or if results are tied to the custom annotation process.
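The sensitivity raised in the first major comment can be illustrated with a tolerance-based matching of two junction sets, whether those are prediction versus reference or annotator versus annotator. This is a sketch under assumptions: the greedy nearest-pair matching, the pixel tolerances, and the coordinates are all hypothetical, and the paper's actual matching rule is not stated in this summary.

```python
import math

def match_junctions(pred, ref, tol=3.0):
    """Greedy one-to-one matching by ascending distance; returns (tp, fp, fn)."""
    pairs = sorted(
        (math.dist(p, r), i, j)
        for i, p in enumerate(pred)
        for j, r in enumerate(ref)
        if math.dist(p, r) <= tol
    )
    used_pred, used_ref = set(), set()
    for _, i, j in pairs:
        if i not in used_pred and j not in used_ref:
            used_pred.add(i)
            used_ref.add(j)
    tp = len(used_pred)
    return tp, len(pred) - tp, len(ref) - tp

def f1_score(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there are no true positives."""
    return 2.0 * tp / (2.0 * tp + fp + fn) if tp else 0.0

# Hypothetical junction sets from two annotators, as (row, col).
annotator_a = [(10, 10), (20, 30), (40, 40)]
annotator_b = [(11, 10), (20, 34), (40, 41)]

strict = match_junctions(annotator_a, annotator_b, tol=3.0)
loose = match_junctions(annotator_a, annotator_b, tol=5.0)
```

Moving the tolerance from 3 to 5 pixels lifts agreement from 2 of 3 junctions to 3 of 3 here, so any reported junction F1 is only interpretable together with its matching radius.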

minor comments (2)

[Title / Abstract] The title references 'XAI-Net' and the abstract describes an 'explainable' framework, but specific mechanisms for explainability (e.g., attention maps, feature attribution) are not detailed in the provided summary; expand this in the methods or discussion for clarity.
[Experimental validation] Ensure all reported metrics (e.g., Dice, F1, correlations) are accompanied by measures of variability such as standard deviation across test images or folds to better indicate robustness.
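The second minor comment amounts to reporting a spread next to each mean. A minimal sketch with Python's standard `statistics` module follows; the per-image scores are hypothetical and the sample (n − 1) standard deviation is one reasonable choice among several.

```python
import statistics

def summarize(scores):
    """Return (mean, sample standard deviation) of per-image scores."""
    return statistics.mean(scores), statistics.stdev(scores)

# Hypothetical per-image Dice scores (illustrative only).
per_image_dice = [0.99, 0.98, 1.00, 0.97, 0.99]
mean, std = summarize(per_image_dice)
report = f"{mean:.3f} +/- {std:.3f}"
```

With cross-validation folds instead of images, the same function applies to the per-fold means.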

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments on our manuscript. We address each major point below and will incorporate revisions to enhance the clarity, transparency, and rigor of the evaluation.


Referee: [Abstract / Experimental validation] The central performance claims (Dice 0.991 for skeleton extraction, 98.5% topology preservation, junction recall 0.964 / F1 0.887, descriptor correlations >0.95) are evaluated exclusively against manually extended CRACK500 annotations for skeletons, junctions, and topology. No inter-annotator agreement metrics, annotation protocol details, or consistency checks are referenced, which is load-bearing because skeleton and junction tasks are sparse and boundary-sensitive; small labeling variations can inflate metrics when models are trained and tested on the same annotation style.

Authors: We agree that inter-annotator agreement and annotation protocol details are essential for validating sparse, boundary-sensitive tasks such as skeleton extraction and junction detection. The revised manuscript will include a dedicated subsection detailing the annotation protocol for extending CRACK500 (including tools, guidelines, and verification steps). Where multiple annotators contributed to subsets of the data, we will report inter-annotator agreement metrics (e.g., Dice overlap for skeletons and F1 for junctions); for portions annotated by a single expert, we will explicitly note this limitation and discuss its implications for metric interpretation. revision: yes
Referee: [Dataset extension and evaluation] The extension of CRACK500 is used to create the reference skeleton maps, junction heatmaps, and topology labels that underpin all quantitative results, yet no details on generation process, potential biases, or validation against real-world crack variability are provided. This directly affects the reliability of the reported stage-wise metrics and the assumption that the four-stage models will generalize without major domain shift.

Authors: We acknowledge the need for full transparency on the dataset extension. The revision will expand the methods section with a step-by-step description of the generation process for skeleton maps, junction heatmaps, and topology labels, including any semi-automated procedures followed by manual correction. Potential sources of bias (e.g., annotation style or image selection) will be discussed, along with a qualitative comparison to real-world crack variability drawn from the original CRACK500 images and additional external samples. We will also add a brief analysis of domain-shift risks and note plans for future cross-dataset testing. revision: yes
Referee: [Results / Descriptor-level evaluation] Strong agreement (correlations >0.95) is claimed for morphology descriptors, but the manuscript summary provides no baseline comparisons to existing crack skeletonization or junction detection methods, no ablation studies on the pipeline stages, and no error analysis or standard deviations. This makes it difficult to determine whether the framework's contributions are incremental or if results are tied to the custom annotation process.

Authors: We agree that comparative baselines, ablations, and error analysis strengthen the evaluation. In the revised manuscript we will add (i) baseline results using established skeletonization methods (e.g., morphological thinning, Zhang-Suen) and junction detectors on the same extended annotations, (ii) ablation studies isolating each pipeline stage, and (iii) error analysis including standard deviations for all reported metrics and correlations. These additions will clarify the incremental value of the proposed framework relative to prior approaches. revision: yes
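For readers unfamiliar with the Zhang-Suen baseline named in the last response, here is a compact pure-Python sketch of the textbook two-sub-pass thinning rule. It is not the authors' baseline code. Treating out-of-bounds pixels as background and the test shape are assumptions made for the example.

```python
def zhang_suen(grid):
    """Thin a binary grid (list of 0/1 rows) with the Zhang-Suen two-pass rule."""
    img = [row[:] for row in grid]
    h, w = len(img), len(img[0])

    def px(r, c):
        # Pixels outside the image are treated as background (assumption).
        return img[r][c] if 0 <= r < h and 0 <= c < w else 0

    def neighbours(r, c):
        # P2..P9, clockwise, starting from the pixel directly above.
        return [px(r - 1, c), px(r - 1, c + 1), px(r, c + 1), px(r + 1, c + 1),
                px(r + 1, c), px(r + 1, c - 1), px(r, c - 1), px(r - 1, c - 1)]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for r in range(h):
                for c in range(w):
                    if not img[r][c]:
                        continue
                    p = neighbours(r, c)
                    b = sum(p)  # number of foreground neighbours
                    # a = number of 0 -> 1 transitions around the ring P2..P9, P2
                    a = sum(p[i] == 0 and p[(i + 1) % 8] == 1 for i in range(8))
                    p2, p4, p6, p8 = p[0], p[2], p[4], p[6]
                    if step == 0:
                        cond = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        cond = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= b <= 6 and a == 1 and cond:
                        to_clear.append((r, c))
            # Deletions are applied only after the whole sub-pass has been scanned.
            for r, c in to_clear:
                img[r][c] = 0
            changed = changed or bool(to_clear)
    return img

# Hypothetical mask: a 3-pixel-thick horizontal bar inside a 5 x 9 image.
mask = [[0] * 9] + [[0] + [1] * 7 + [0] for _ in range(3)] + [[0] * 9]
skeleton = zhang_suen(mask)
```

On simple shapes like this bar the rule shortens the line at its ends, one reason a learned skeleton stage and a classical baseline can differ in measured length even when both preserve connectivity.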

Circularity Check

0 steps flagged

No circularity: standard supervised evaluation on manually extended benchmark


The paper describes a four-stage pipeline (skeleton extraction, junction detection via heatmap regression, descriptor computation, severity screening) trained and evaluated on a manually extended version of the public CRACK500 dataset. Reported metrics (Dice 0.991, topology preservation 98.5%, correlations >0.95) are conventional test-set performance figures measured against the human-provided skeleton maps, heatmaps, and topology labels. No equations, self-referential definitions, fitted-parameter-as-prediction steps, or load-bearing self-citations appear in the abstract or described chain; the morphology outputs are computed from the learned stages rather than presupposed by them. The derivation therefore remains independent of its own outputs.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The claims rest on standard supervised deep-learning assumptions and the representativeness of the manually annotated CRACK500 extension; no new physical entities or unproven mathematical axioms are introduced beyond typical neural-network training.

free parameters (1)

Neural network weights and hyperparameters
Learned parameters of the skeleton extraction and junction regression networks are fitted to the extended training data.

axioms (1)

Domain assumption: supervised learning on annotated crack images can produce accurate topology-preserving skeletons and junction detections.
Invoked by the training of the first two stages on the extended CRACK500 labels.

