pith. machine review for the scientific record.

arxiv: 2604.05594 · v2 · submitted 2026-04-07 · 💻 cs.CV

Recognition: no theorem link

RABC-Net: Reliability-Aware Annotation-Free Skin Lesion Segmentation for Low-Resource Dermoscopy


Pith reviewed 2026-05-10 19:03 UTC · model grok-4.3

classification 💻 cs.CV
keywords skin lesion segmentation · annotation-free learning · pseudo-label reliability · boundary calibration · dermoscopy · low-resource medical imaging · uncertainty-aware training

The pith

RABC-Net achieves competitive skin lesion segmentation accuracy on dermoscopy images without any manual pixel-level annotations for training or adaptation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops an annotation-free system for segmenting skin lesions in dermoscopy images, which matters because creating manual masks is expensive and limits automated tools in low-resource settings. It combines pseudo-label reliability learning during training with restricted adaptation and a calibration step that refines boundaries using uncertainty and confidence at inference. The system reports macro-average DICE of 86.58 percent and JAC of 79.47 percent across ISIC-2017, ISIC-2018, and PH2 while using any available validation labels only for operating-point selection. If correct, the approach shows that reliable representations can be learned and deployed from unlabeled images alone.
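For reference, the two reported metrics can be computed as in the sketch below. It assumes binary masks stored as nested lists and assumes that "macro-average" means the unweighted mean of per-dataset scores; the function names and the toy masks are illustrative and do not come from the paper.

```python
def dice_and_jaccard(pred, truth):
    """Return (Dice, Jaccard) for two equally sized binary masks (nested lists of 0/1)."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    if tp + fp + fn == 0:
        # Both masks are empty: treat as perfect agreement (a convention, not from the paper).
        return 1.0, 1.0
    dice = 2 * tp / (2 * tp + fp + fn)
    jaccard = tp / (tp + fp + fn)
    return dice, jaccard


def macro_average(per_dataset_scores):
    """Unweighted mean over datasets, so each dataset counts equally regardless of size."""
    return sum(per_dataset_scores.values()) / len(per_dataset_scores)


# Toy 3x3 masks, purely illustrative.
pred = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]
truth = [[0, 1, 1], [0, 1, 0], [0, 1, 0]]
dice, jaccard = dice_and_jaccard(pred, truth)  # 3 TP, 1 FP, 1 FN
```

Under an unweighted macro-average, a small collection such as PH2 would weigh as much as either ISIC test set, which is one reason per-dataset scores are worth reporting alongside the average.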

Core claim

RABC-Net decouples reliability learning from deployment. During training, uncertainty-aware pseudo-label interaction shapes the representations. At inference, Reliability-Adaptive Boundary Calibration adjusts the output in logit space using boundary confidence, uncertainty, and foreground probability. No manual masks are used in training or adaptation, and the image-only inference path yields the reported scores at 87.4 FPS.

What carries the argument

Reliability-Adaptive Boundary Calibration (RABC), a local logit-space calibration step that uses boundary confidence, uncertainty estimates, and foreground probability to refine outputs after the base model processes an image.
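To make the idea of a local logit-space correction concrete, here is a hedged sketch. The paper's actual RABC head is learned and its functional form is not given on this page. The linear combination of cues, the gating by boundary confidence, and every weight value below are assumptions chosen only to illustrate the mechanism.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def calibrate_logit(logit, boundary_conf, uncertainty,
                    weights=(1.0, -1.0, 0.5), bias=0.0):
    """Hypothetical per-pixel correction; weights and bias stand in for learned values."""
    w_boundary, w_uncertainty, w_foreground = weights
    foreground_prob = sigmoid(logit)
    # Assumed linear combination of the three cues named in the paper.
    delta = (w_boundary * boundary_conf
             + w_uncertainty * uncertainty
             + w_foreground * (foreground_prob - 0.5)
             + bias)
    # Gate by boundary confidence so pixels far from any boundary keep their logit.
    return logit + boundary_conf * delta


def calibrated_mask(logits, boundary_conf, uncertainty):
    """Apply the correction pixel-wise, then threshold at logit 0 (probability 0.5)."""
    return [
        [1 if calibrate_logit(z, b, u) >= 0.0 else 0
         for z, b, u in zip(z_row, b_row, u_row)]
        for z_row, b_row, u_row in zip(logits, boundary_conf, uncertainty)
    ]
```

The gate is what keeps the update local in this sketch: where boundary confidence is zero the logit passes through unchanged, so the correction cannot move confident background or interior pixels.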

Load-bearing premise

Uncertainty-aware pseudo-label interaction during training produces robust representations that transfer to new images without accumulating confirmation bias or errors.

What would settle it

Retraining the model on one dataset with the exact unlabeled procedure, then testing on a separate held-out dermoscopy collection. A DICE substantially below the reported 86.58 percent would count against the claim.

Figures

Figures reproduced from arXiv: 2604.05594 by Jiangzhao Li, Junjie Huang, Wen Xiao, Xiaofan Li, Yan Qiao, Yuhaohang He, Yujie Yao, Yunsen Liang, Zhou Liu.

Figure 1: Overall method overview. Four unsupervised paths produce a 4-channel pseudo-label prior tensor for IPC/PIA interaction. The image branch extracts multi-scale features, reliability heads estimate pseudo-label reliability, and the deployed image path uses RABC to calibrate logits before thresholding.
Figure 2: Uncertainty-guided IPC/PIA correction. Image features and pseudo-label features interact through IPC and PIA attention, sigma heads produce reliability-aware weights, and the resulting weighted consensus probability map is binarized into a cleaner consensus mask for supervision.
… during training; (2) paired with the stronger ConvNeXt-Tiny backbone, the interaction attention can capture finer-grained semanti…
Figure 3: Training-stage view of RABC-Net. The figure shows the shared forward path, reliability-weighted pseudo supervision, restricted branch adaptation, and the loss groups used for annotation-free optimization.
… correction should be applied. To keep these updates local, we regularize RABC through boundary-consistency, far-background-preservation, and sparsity losses. With pseudo-consensus map 𝑃𝑐, Sobel boundary …
Figure 4: Inference-stage operating-point pipeline. RABC-calibrated logits are converted to a probability map, optional TTA and reference smoothing are applied only when selected, and lightweight morphology yields the final mask.
4. Experiments. 4.1. Experimental Protocol. We evaluate on three public dermoscopy datasets:
  • ISIC-2017 (Codella et al., 2018): 2,000 training / 150 validation / 600 test images.
  • ISIC-2018…
Figure 5: Qualitative segmentation comparison. Columns show image, GT, raw P0, final prediction, and error map (green=TP, red=FP, yellow=FN).
4.3. Ablation Studies. We first examine the training-side strategy on ISIC-2018 (…
Figure 6: Qualitative comparison on ISIC-2018. Columns show image, GT, reproduced RPI-Net anchor, prompted MedSAM reference, final prediction, and final-model error map.
Figure 7: Pseudo-label refinement example. IPC/PIA converts divergent pseudo-labels into a reliability-weighted consensus before final prediction.
For ISIC-2018 (test, 1000 samples), the staged inference results are listed in …
Figure 8: SEN–SPE frontiers for optional reference smoothing on the no-RABC backbone. RABC is evaluated separately as learned calibration.
Figure 9: Summary of the training and decoder ablations.
Figure 10: Per-sample metric distributions on the three test sets.
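The captions of Figures 2 and 7 describe turning divergent pseudo-labels into a reliability-weighted consensus that is then binarized. A minimal sketch of that step follows. In the paper the weights come from learned sigma heads after IPC/PIA attention. The fixed inverse-variance weighting used here is a stand-in assumption, and the function names, `eps`, and the 0.5 threshold are hypothetical.

```python
def consensus_pixel(probs, sigmas, eps=1e-6):
    """Weighted mean of per-source foreground probabilities for one pixel.

    Each source is weighted by 1 / (sigma^2 + eps), an assumed inverse-variance
    rule: the more uncertain a pseudo-label source, the less it contributes.
    """
    weights = [1.0 / (s * s + eps) for s in sigmas]
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, probs)) / total


def consensus_mask(prob_maps, sigma_maps, threshold=0.5):
    """Combine K pseudo-label maps (K x H x W nested lists) into one binary mask."""
    height = len(prob_maps[0])
    width = len(prob_maps[0][0])
    mask = []
    for i in range(height):
        row = []
        for j in range(width):
            probs = [m[i][j] for m in prob_maps]
            sigmas = [s[i][j] for s in sigma_maps]
            row.append(1 if consensus_pixel(probs, sigmas) >= threshold else 0)
        mask.append(row)
    return mask
```

With equal uncertainties this reduces to a plain average, so a single dissenting source is outvoted. A source with much lower sigma can override the other three, which is the behavior that makes the reliability estimates load-bearing.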
Original abstract

Pixel-level annotation is costly in low-resource dermoscopy. We present RABC-Net, a reliability-aware annotation-free segmentation system that combines pseudo-label reliability learning, restricted target-domain adaptation, and Reliability-Adaptive Boundary Calibration (RABC). The system decouples reliability learning from deployment: uncertainty-aware pseudo-label interaction shapes robust representations during training, while the image-only inference path is preserved and RABC performs local logit-space calibration from boundary confidence, uncertainty, and foreground probability. No manual masks are used for training or target-domain adaptation; validation labels, when available, are used only for final operating-point selection. Across ISIC-2017, ISIC-2018, and PH2, RABC-Net achieves macro-average DICE/JAC of 86.58%/79.47% and consistent matched-protocol results. Controlled within-study analyses show that RABC provides localized gains over nonlearned boundary correction, while the overall result comes from the full reliability-aware system. Adaptation updates only 3.50% of model parameters, image-only inference runs at 87.4 FPS, and the selected operating points use σ=0 on all three datasets, indicating that learned calibration avoids extra smoothing at deployment.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript presents RABC-Net, a reliability-aware annotation-free segmentation system for dermoscopy images that integrates uncertainty-aware pseudo-label learning, restricted target-domain adaptation (updating 3.5% of parameters), and Reliability-Adaptive Boundary Calibration (RABC) for local logit-space calibration. No manual masks are used for training or adaptation; validation labels serve only for final operating-point selection. The system reports macro-average Dice/Jaccard of 86.58%/79.47% across ISIC-2017, ISIC-2018, and PH2, with 87.4 FPS inference and consistent matched-protocol results, attributing gains to the full reliability-aware pipeline over nonlearned boundary correction.

Significance. If the annotation-free claim and performance numbers hold without selection bias, the work offers a practical advance for low-resource medical segmentation by minimizing annotation needs while preserving deployment efficiency and image-only inference. The decoupling of training-time reliability learning from inference and the uniform selection of σ=0 (indicating learned calibration suffices without extra smoothing) are concrete strengths that enhance applicability.

major comments (2)
  1. [Experimental results and operating-point selection] In the experimental protocol and results (as described in the abstract and quantitative evaluations), validation labels are used to select operating points (σ=0 uniformly across all three datasets). If this selection maximizes the reported metric on the validation split post-hoc without nested cross-validation or a pre-specified fixed rule, the headline macro DICE/JAC of 86.58%/79.47% risks optimistic bias, particularly on the smaller PH2 set. This selection step directly threatens the central 'no manual masks for training or target-domain adaptation' guarantee for the full pipeline whose performance is being claimed.
  2. [Results and within-study analyses] The abstract states that 'controlled within-study analyses show that RABC provides localized gains' and that the overall result comes from the full system, yet no detailed ablations, pseudo-label generation procedure, uncertainty estimation specifics, or error bars on the quantitative tables are referenced. Without these, it is difficult to isolate the contribution of reliability-aware pseudo-label interaction versus other components and to assess whether the reported gains are robust or sensitive to implementation choices.
minor comments (1)
  1. [Abstract] The abstract reports macro-averages but does not include per-dataset scores or standard deviations; adding these (e.g., in a results table) would strengthen claims of consistency across ISIC-2017/2018 and PH2.
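The concern in major comment 1 can be checked by reporting test scores under both protocols: an operating point chosen post hoc on validation labels, and a fixed rule set in advance. The sketch below shows the comparison. The score tables are made-up numbers, not results from the paper, and the function names are hypothetical.

```python
def select_on_validation(candidates, validation_score):
    """Post-hoc selection: return the candidate that maximizes the validation score."""
    return max(candidates, key=validation_score)


def report(test_score, selected, fixed):
    """Return test scores under both protocols and the gap between them."""
    selected_score = test_score(selected)
    fixed_score = test_score(fixed)
    return {"selected": selected_score, "fixed": fixed_score,
            "gap": selected_score - fixed_score}


# Made-up scores keyed by a smoothing parameter sigma; NOT results from the paper.
toy_validation = {0.0: 0.80, 1.0: 0.82, 2.0: 0.78}
toy_test = {0.0: 0.79, 1.0: 0.77, 2.0: 0.76}

selected_sigma = select_on_validation(sorted(toy_validation), toy_validation.get)
summary = report(toy_test.get, selected=selected_sigma, fixed=0.0)
```

If the two protocols pick the same operating point on every dataset, as the abstract's uniform σ=0 suggests, the gap is zero and the headline numbers do not depend on the labeled selection step.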

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our work. Below we provide detailed responses to the major comments, clarifying our experimental protocol and committing to revisions that enhance transparency and robustness of the presented results.

Point-by-point responses
  1. Referee: In the experimental protocol and results (as described in the abstract and quantitative evaluations), validation labels are used to select operating points (σ=0 uniformly across all three datasets). If this selection maximizes the reported metric on the validation split post-hoc without nested cross-validation or a pre-specified fixed rule, the headline macro DICE/JAC of 86.58%/79.47% risks optimistic bias, particularly on the smaller PH2 set. This selection step directly threatens the central 'no manual masks for training or target-domain adaptation' guarantee for the full pipeline whose performance is being claimed.

    Authors: We agree that post-hoc selection of the operating point using validation labels can introduce optimistic bias in the reported metrics. The manuscript is transparent that validation labels are used only for this final selection step and not for any training or adaptation. The fact that σ=0 was selected uniformly suggests that the RABC mechanism provides effective calibration without requiring dataset-specific tuning. To strengthen the claim and address the bias concern, we will revise the paper in three ways. First, we will adopt σ=0 as a fixed operating point determined by the system design, on the grounds that learned calibration suffices. Second, we will present the results under this fixed rule. Third, we will explicitly discuss how this maintains the annotation-free nature of the training and adaptation phases, while noting that operating-point selection is standard practice for deployment. revision: partial

  2. Referee: The abstract states that 'controlled within-study analyses show that RABC provides localized gains' and that the overall result comes from the full system, yet no detailed ablations, pseudo-label generation procedure, uncertainty estimation specifics, or error bars on the quantitative tables are referenced. Without these, it is difficult to isolate the contribution of reliability-aware pseudo-label interaction versus other components and to assess whether the reported gains are robust or sensitive to implementation choices.

    Authors: While the manuscript includes descriptions of the pseudo-label generation, uncertainty estimation, and within-study comparisons demonstrating RABC's localized gains over nonlearned methods, we recognize that these may not be sufficiently detailed or prominently referenced for full reproducibility and assessment. In the revised version, we will expand the relevant sections with more specifics on the procedures, add error bars to the tables, include additional ablation studies if necessary, and ensure clear cross-references from the abstract and results to these analyses. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical method with explicit validation protocol

full rationale

The paper presents an empirical ML segmentation pipeline rather than a mathematical derivation chain. Core claims rest on training without manual masks (pseudo-labels and restricted adaptation) and reporting macro DICE/JAC on public benchmarks (ISIC-2017/2018, PH2). The explicit statement that validation labels are used only for final operating-point selection (with σ=0 chosen uniformly) does not reduce any prediction or result to a fitted input by construction; it is a transparent post-training step for metric reporting. No equations, uniqueness theorems, or self-citations are invoked in a load-bearing way that collapses the central result to its own inputs. The method is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

Review limited to abstract; exact free parameters and implementation assumptions cannot be audited in detail. The approach rests on standard deep-learning assumptions plus the paper-specific claim that reliability signals from pseudo-labels can be decoupled from inference.

free parameters (2)
  • sigma
    Operating-point parameter selected as 0 on all three datasets after validation; indicates learned calibration replaces extra smoothing.
  • adaptation fraction
    3.5% of model parameters updated during restricted target-domain adaptation.
axioms (2)
  • domain assumption Uncertainty-aware pseudo-label interaction shapes robust representations without manual supervision.
    Central premise of the reliability learning stage.
  • domain assumption RABC local logit-space calibration improves boundary accuracy using confidence, uncertainty, and foreground probability.
    Assumed to deliver the reported localized gains.

pith-pipeline@v0.9.0 · 5546 in / 1349 out tokens · 195231 ms · 2026-05-10T19:03:33.285537+00:00 · methodology


Reference graph

Works this paper leans on

29 extracted references · 25 canonical work pages

  1. Ahn, E., Feng, D., Kim, J. (2021). A Spatial Guided Self-Supervised Clustering Network for Medical Image Segmentation. In: Medical Image Computing and Computer Assisted Intervention – MICCAI 2021, pp. 379–388. doi:10.1007/978-3-030-87193-2_36

  2. Azad, R., Asadi-Aghbolaghi, M., Fathy, M., Escalera, S. (2020). Attention Deeplabv3+: Multi-level Context Attention Mechanism for Skin Lesion Segmentation. In: Computer Vision – ECCV 2020 Workshops, pp. 251–266. doi:10.1007/978-3-030-65414-6_19

  3. Bateson, M., Kervadec, H., Dolz, J., Lombaert, H., Ben Ayed, I. (2022). Source-free Domain Adaptation for Image Segmentation. Medical Image Analysis 82, 102617. doi:10.1016/j.media.2022.102617

  4. Chen, C., Dou, Q., Chen, H., Qin, J., Heng, P.A. (2019). Synergistic Image and Feature Adaptation: Towards Cross-Modality Domain Adaptation for Medical Image Segmentation. In: Proc. AAAI, pp. 865–872. doi:10.1609/aaai.v33i01.3301865

  5. Cheng, J., Ye, J., Deng, Z., et al. (2023). SAM-Med2D. arXiv preprint arXiv:2308.16184. doi:10.48550/arXiv.2308.16184

  6. Codella, N.C.F., Gutman, D., Celebi, M.E., et al. (2018). Skin Lesion Analysis Toward Melanoma Detection: A Challenge at the 2017 International Symposium on Biomedical Imaging (ISBI), Hosted by the International Skin Imaging Collaboration (ISIC). In: Proc. IEEE ISBI, pp. 168–172. doi:10.1109/ISBI.2018.8363547

  7. Codella, N.C.F., Rotemberg, V., Tschandl, P., et al. (2019). Skin Lesion Analysis Toward Melanoma Detection 2018: A Challenge Hosted by the International Skin Imaging Collaboration (ISIC). arXiv preprint arXiv:1902.03368. doi:10.48550/arXiv.1902.03368

  8. He, K., Sun, J., Tang, X. (2013). Guided Image Filtering. IEEE Transactions on Pattern Analysis and Machine Intelligence 35, 1397–1409. doi:10.1109/TPAMI.2012.213

  9. Karani, N., Erdil, E., Chaitanya, K., Konukoglu, E. (2021). Test-time Adaptable Neural Networks for Robust Medical Image Segmentation. Medical Image Analysis 68, 101907. doi:10.1016/j.media.2020.101907

  10. Kirillov, A., Mintun, E., Ravi, N., et al. (2023). Segment Anything. In: Proc. IEEE/CVF ICCV, pp. 4015–4026.

  11. Krähenbühl, P., Koltun, V. (2011). Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials. In: Advances in Neural Information Processing Systems, pp. 109–117.

  12. Li, X., Peng, B., Hu, J., Ma, C., Yang, D., Xie, Z. (2024). USL-Net: Uncertainty Self-Learning Network for Unsupervised Skin Lesion Segmentation. Biomedical Signal Processing and Control 89, 105769. doi:10.1016/j.bspc.2023.105769

  13. Li, X., Peng, B., Zhang, J., Zhang, Z., Xie, Z. (2026). Reliable Multi-Source Contrastive Pseudo-Labels Interaction Network for unsupervised skin lesion segmentation. Biomedical Signal Processing and Control 112, 108433. doi:10.1016/j.bspc.2025.108433

  14. Liu, Q., Dou, Q., Heng, P.A. (2020). Shape-Aware Meta-learning for Generalizing Prostate MRI Segmentation to Unseen Domains. In: Proc. MICCAI, pp. 475–485. doi:10.1007/978-3-030-59713-9_46

  15. Liu, Z., Mao, H., Wu, C.Y., et al. (2022). A ConvNet for the 2020s. In: Proc. IEEE/CVF CVPR, pp. 11966–11976. doi:10.1109/CVPR52688.2022.01167

  16. Ma, J., He, Y., Li, F., et al. (2024). Segment Anything in Medical Images. Nature Communications 15, 654. doi:10.1038/s41467-024-44824-z

  17. MacQueen, J. (1967). Some Methods for Classification and Analysis of Multivariate Observations. In: Proc. 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297.

  18. Mendonça, T., Ferreira, P.M., Marques, J.S., Marcal, A.R.S., Rozeira, J. (2013). PH2 - A Dermoscopic Image Database for Research and Benchmarking. In: Proc. IEEE EMBC, pp. 5437–5440. doi:10.1109/EMBC.2013.6610779

  19. Patiño, D., Avendaño, J., Branch, J.W. (2018). Automatic Skin Lesion Segmentation on Dermoscopic Images by the Means of Superpixel Merging. arXiv preprint arXiv:1808.06759. doi:10.48550/arXiv.1808.06759

  20. Ronneberger, O., Fischer, P., Brox, T. (2015). U-Net: Convolutional Networks for Biomedical Image Segmentation. In: Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, pp. 234–241. doi:10.1007/978-3-319-24574-4_28

  21. Ruan, J., Xiang, S., Xie, M., Liu, T., Fu, Y. (2022). MALUNet: A Multi-Attention and Light-Weight UNet for Skin Lesion Segmentation. In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 1150–1156. doi:10.1109/BIBM55620.2022.9995040. See also arXiv:2211.01784.

  22. Shi, J., Malik, J. (2000). Normalized Cuts and Image Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 22, 888–905.

  23. Wu, H., Chen, S., Chen, G., et al. (2022). FAT-Net: Feature Adaptive Transformers for Automated Skin Lesion Segmentation. Medical Image Analysis 76, 102327. doi:10.1016/j.media.2021.102327

  24. Zeng, G., Peng, H., Li, A., Liu, Z., Liu, C., Yu, P.S., He, L. (2023). Unsupervised Skin Lesion Segmentation via Structural Entropy Minimization on Multi-Scale Superpixel Graphs. In: 2023 IEEE International Conference on Data Mining (ICDM), pp. 768–777. doi:10.1109/ICDM585...

  25. Zhang, K., Liu, D. (2023). Customized Segment Anything Model for Medical Image Segmentation. arXiv preprint arXiv:2304.13785. doi:10.48550/arXiv.2304.13785

  26. Zhang, Y., Liu, H., Hu, Q. (2021). TransFuse: Fusing Transformers and CNNs for Medical Image Segmentation. In: Medical Image Computing and Computer-Assisted Intervention – MICCAI 2021, pp. 14–24. doi:10.1007/978-3-030-87193-2_2

  27. Zhou, H., Qiao, B., Yang, L., Lai, J., Xie, X. (2023a). Texture-Guided Saliency Distilling for Unsupervised Salient Object Detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7257–7267. doi:10.1109/CVPR52729.2023.00701

  28. Zhou, X., Tong, T., Zhong, Z., Fan, H., Li, Z. (2023b). Saliency-CCE: Exploiting Colour Contextual Extractor and Saliency-Based Biomedical Image Segmentation. Computers in Biology and Medicine 154, 106551. doi:10.1016/j.compbiomed.2023.106551

  29. Zhou, Z., Siddiquee, M.M.R., Tajbakhsh, N., Liang, J. (2018). UNet++: A Nested U-Net Architecture for Medical Image Segmentation. In: Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, pp. 3–11. doi:10.1007/978-3-030-00889-5_1