
arxiv: 2606.30577 · v1 · pith:W2LU2DU3new · submitted 2026-06-29 · 💻 cs.CV

APRIL-MedSeg: A Modular Medical Image Segmentation Toolbox Embracing Modern Paradigms

Pith reviewed 2026-06-30 06:07 UTC · model grok-4.3

classification 💻 cs.CV
keywords medical image segmentation · modular framework · semi-supervised learning · domain adaptation · knowledge distillation · foundation models · YAML configuration · registry system

The pith

APRIL-MedSeg provides a modular, YAML-configured framework for medical image segmentation that integrates multiple advanced learning paradigms.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents APRIL-MedSeg as a framework designed to make medical image segmentation research more organized and flexible. It decomposes segmentation networks into reusable components so that different parts can be mixed and matched. The framework brings together techniques like semi-supervised learning, domain adaptation, knowledge distillation, weakly supervised learning, text-guided segmentation, and foundation model support. A registry-based configuration system using YAML files with inheritance allows users to manage experiments by switching models, datasets, and strategies easily. This is intended to bridge new algorithmic ideas with practical implementation and deployment in medical imaging.

Core claim

APRIL-MedSeg is a YAML-driven modular framework for 2D medical image segmentation. It decomposes segmentation networks into reusable components and integrates a broad spectrum of advanced paradigms including semi-supervised learning, domain adaptation, knowledge distillation, weakly supervised learning, text-guided segmentation and foundation model support. A registry-based configuration system with inheritance enables flexible and reproducible experiment management, supporting seamless switching across models, datasets, and training strategies. It also provides unified interfaces for medical datasets, augmentation pipelines, deployment utilities and model ensembling.

What carries the argument

The registry-based configuration system with inheritance for managing experiments across different models and paradigms.
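
The paper does not show its registration mechanics, so what follows is only a minimal sketch of how a registry of this kind commonly works, not APRIL-MedSeg's actual API. The names `Registry`, `LOSSES`, `DiceLoss`, and the `type` config key are all hypothetical, chosen for illustration. A component is registered under a string name and later built from a config dictionary of the sort a YAML file would parse into.

```python
from typing import Any, Callable, Dict


class Registry:
    """Maps string names to component classes or factory functions (hypothetical)."""

    def __init__(self, kind: str) -> None:
        self.kind = kind
        self._items: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        """Decorator that stores a component under `name`."""
        def decorator(obj: Callable[..., Any]) -> Callable[..., Any]:
            if name in self._items:
                raise KeyError(f"{self.kind} '{name}' is already registered")
            self._items[name] = obj
            return obj
        return decorator

    def build(self, cfg: Dict[str, Any]) -> Any:
        """Instantiate the component named by cfg['type'] with the remaining keys."""
        kwargs = dict(cfg)  # copy so the caller's config is not mutated
        name = kwargs.pop("type")
        if name not in self._items:
            raise KeyError(f"unknown {self.kind} '{name}'")
        return self._items[name](**kwargs)


LOSSES = Registry("loss")


@LOSSES.register("dice")
class DiceLoss:
    """Soft Dice loss over flat sequences of per-pixel values in [0, 1]."""

    def __init__(self, smooth: float = 1.0) -> None:
        self.smooth = smooth

    def __call__(self, pred, target) -> float:
        inter = sum(p * t for p, t in zip(pred, target))
        denom = sum(pred) + sum(target)
        return 1.0 - (2.0 * inter + self.smooth) / (denom + self.smooth)


# The dict below stands in for a parsed YAML fragment such as `loss: {type: dice, smooth: 1.0}`.
loss_fn = LOSSES.build({"type": "dice", "smooth": 1.0})
print(loss_fn([1.0, 0.0, 1.0], [1.0, 0.0, 1.0]))  # prints 0.0 for a perfect match
```

Under this pattern, adding a new component means writing one decorated class and naming it in a config file. Whether the toolbox's own registry is this lightweight is exactly what the review asks the authors to demonstrate.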

If this is right

  • Experiment management becomes more reproducible by inheriting and modifying configuration files for different setups.
  • New segmentation methods and paradigms can be integrated into the existing ecosystem with reduced development effort.
  • Researchers gain access to a unified interface for handling datasets, augmentations, and model deployment.
  • Model ensembling and support for foundation models become standard features within the same platform.
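
As an illustration of the first point, configuration inheritance can be sketched as a recursive merge of a child config over its parent. This is an assumption about the general technique, not the toolbox's implementation: the `_base_` key, the helper names `merge_config` and `resolve`, and the model, dataset, and trainer names are all placeholders, and plain dictionaries stand in for parsed YAML files.

```python
import copy
from typing import Any, Dict


def merge_config(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new config with `override` layered on top of `base`.

    Nested dicts are merged key by key; any other value replaces the base value.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def resolve(name: str, configs: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Resolve a config by following its hypothetical `_base_` key to its parent.

    Cyclic inheritance is not detected in this sketch.
    """
    cfg = dict(configs[name])  # shallow copy so popping `_base_` leaves the input intact
    parent = cfg.pop("_base_", None)
    if parent is None:
        return copy.deepcopy(cfg)
    return merge_config(resolve(parent, configs), cfg)


# Dicts standing in for two YAML files; all names are placeholders.
CONFIGS = {
    "base": {
        "model": {"type": "unet", "in_channels": 3},
        "dataset": {"type": "example_dataset"},
        "trainer": {"type": "supervised", "epochs": 100},
    },
    "semi": {
        "_base_": "base",
        "trainer": {"type": "semi_supervised", "labeled_ratio": 0.1},
    },
}

semi = resolve("semi", CONFIGS)
print(semi["trainer"])
```

Here the child config changes only the training strategy and inherits the model, the dataset, and the epoch count unchanged, which is the kind of switching the reproducibility claim depends on.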

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Adoption of this framework could lead to more consistent benchmarking of segmentation techniques across studies.
  • It may enable faster prototyping of hybrid approaches combining multiple paradigms like knowledge distillation with domain adaptation.
  • Community contributions could extend the registry to include additional modalities or 3D segmentation tasks.

Load-bearing premise

The YAML-driven registry system with inheritance will enable seamless switching across models, datasets, and training strategies without requiring substantial custom code or leading to configuration complexity.

What would settle it

The premise would be undermined if implementing or switching to a new paradigm, such as a novel text-guided method, turned out to require extensive custom code outside the provided registry and configuration mechanisms.

Original abstract

We present APRIL-MedSeg, a YAML-driven modular framework for 2D medical image segmentation. It provides a unified and extensible ecosystem that decomposes segmentation networks into reusable components. Also, the framework integrates a broad spectrum of advanced paradigms, including semi-supervised learning, domain adaptation, knowledge distillation, weakly supervised learning, and text-guided segmentation as well as foundation model support. A registry-based configuration system with inheritance enables flexible and reproducible experiment management, supporting seamless switching across models, datasets, and training strategies. In addition, the framework provides a unified interface for medical datasets, augmentation pipelines, deployment utilities and model ensembling. Overall, APRIL-MedSeg is designed as a general-purpose research and development platform that bridges algorithmic innovation and practical deployment, while also serving as a structured ecosystem for systematically organizing and reproducing advances in medical image segmentation. The code is available at https://github.com/juntaoJianggavin/APRIL-MedSeg under an Apache 2.0 license.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper presents APRIL-MedSeg, a YAML-driven modular framework for 2D medical image segmentation. It claims to decompose segmentation networks into reusable components, integrate a broad spectrum of advanced paradigms including semi-supervised learning, domain adaptation, knowledge distillation, weakly supervised learning, text-guided segmentation and foundation model support, and employ a registry-based configuration system with inheritance for flexible and reproducible experiment management. The framework also provides unified interfaces for medical datasets, augmentation pipelines, deployment utilities and model ensembling, positioning itself as a general-purpose research and development platform. The code is released at https://github.com/juntaoJianggavin/APRIL-MedSeg under Apache 2.0.

Significance. If the implementation details confirm the claimed seamless extensibility and paradigm integration, APRIL-MedSeg could serve as a practical ecosystem for organizing and reproducing medical image segmentation advances, with the open-source release providing a concrete strength for community use and reproducibility.

major comments (2)
  1. [Abstract] The central claim that the registry-based configuration system with inheritance enables 'seamless switching across models, datasets, and training strategies' while remaining extensible without substantial custom code is load-bearing but unsupported. The manuscript gives only a high-level description, with no concrete registration mechanics, code examples, or walkthroughs for adding a novel component (e.g., a new text-guided loss or domain-adaptation module).
  2. [Abstract] The assertion of integrating 'a broad spectrum of advanced paradigms' into a unified ecosystem lacks implementation details, benchmarks, or any verification that the claimed decomposition and integration have been achieved, so the soundness of the toolbox description cannot be verified from the manuscript.
minor comments (2)
  1. The manuscript would benefit from explicit section headings and a dedicated 'Framework Architecture' or 'Configuration System' section to organize the high-level claims.
  2. Consider adding a table comparing APRIL-MedSeg features against existing toolboxes (e.g., MONAI, nnU-Net) to clarify unique contributions.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed feedback. We agree that the abstract claims require concrete supporting material in the manuscript and will revise to include implementation details, code examples, and verification of the claimed features.

Point-by-point responses
  1. Referee: [Abstract] The central claim that the registry-based configuration system with inheritance enables 'seamless switching across models, datasets, and training strategies' while remaining extensible without substantial custom code is load-bearing but unsupported. The manuscript gives only a high-level description, with no concrete registration mechanics, code examples, or walkthroughs for adding a novel component (e.g., a new text-guided loss or domain-adaptation module).

    Authors: We accept this criticism. The manuscript currently emphasizes the high-level design. In revision we will add a dedicated subsection with (1) explicit registration code for components, (2) a worked example of registering and using a new text-guided loss or domain-adaptation module, and (3) YAML configuration snippets that demonstrate inheritance-based switching across models, datasets, and training strategies. These additions will directly substantiate the extensibility claim.

    Revision: yes

  2. Referee: [Abstract] The assertion of integrating 'a broad spectrum of advanced paradigms' into a unified ecosystem lacks implementation details, benchmarks, or any verification that the claimed decomposition and integration have been achieved, so the soundness of the toolbox description cannot be verified from the manuscript.

    Authors: We agree that the current text is insufficiently specific. The revised manuscript will include (a) explicit descriptions of how each listed paradigm (semi-supervised learning, domain adaptation, knowledge distillation, weakly supervised learning, text-guided segmentation, foundation-model support) maps onto the modular component registry, (b) code-level illustrations of the decomposition, and (c) brief verification examples drawn from the released codebase. As this is a toolbox paper rather than an algorithmic benchmark study, we will focus on integration verification rather than new quantitative benchmarks, but the added material will allow readers to confirm the claimed unification.

    Revision: yes
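
To make concrete what one of the listed paradigms might contribute as a pluggable component, here is a generic temperature-scaled knowledge distillation loss in plain Python. It follows the standard softened-logits formulation and is not taken from the APRIL-MedSeg codebase. The function names and the default temperature are assumptions for illustration, and a real implementation would operate on per-pixel tensors rather than a single list of class logits.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    student_logits: Sequence[float],
    teacher_logits: Sequence[float],
    temperature: float = 2.0,
) -> float:
    """KL(teacher || student) on softened distributions, scaled by temperature squared."""
    p = softmax(teacher_logits, temperature)  # teacher distribution
    q = softmax(student_logits, temperature)  # student distribution
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return kl * temperature ** 2


# Class logits for one pixel; the student is penalized for disagreeing with the teacher.
teacher = [2.0, 0.5, -1.0]
print(distillation_loss(teacher, teacher))
print(distillation_loss([0.0, 0.0, 0.0], teacher))
```

A loss of this shape is self-contained, so in a registry-based design it could in principle be selected by name from a config file alongside a teacher and a student model. The revised manuscript would need to show that the toolbox's own paradigms integrate this cleanly.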

Circularity Check

0 steps flagged

No circularity: software architecture description with no derivations or predictions

Full rationale

The paper is a descriptive account of a modular software toolbox for medical image segmentation. It contains no mathematical derivations, equations, predictions, fitted parameters, or first-principles results that could reduce to their own inputs. Claims about the YAML registry, component decomposition, and paradigm integration are presented as design features rather than derived quantities. No self-citation chains or uniqueness theorems are invoked in a load-bearing way. The absence of any derivation chain makes circularity analysis inapplicable, consistent with the default expectation that most papers are not circular.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are required because the contribution is a software engineering artifact rather than a mathematical or empirical derivation.

pith-pipeline@v0.9.1-grok · 5711 in / 1148 out tokens · 31385 ms · 2026-06-30T06:07:36.095539+00:00 · methodology


Reference graph

Works this paper leans on

300 extracted references · 86 canonical work pages · 29 internal anchors

  1. [1]

    Variational Information Distillation for Knowledge Transfer

    Sungsoo Ahn, Shell Xu Hu, Andreas Damianou, Neil D. Lawrence, and Zhenwen Dai. Variational information distillation for knowledge transfer, 2019. URLhttps://arxiv.org/abs/1904.05835. 12

  2. [2]

    Dataset of breast ultrasound images

    Walid Al-Dhabyani, Mohammed Gomaa, Hussien Khaled, and Aly Fahmy. Dataset of breast ultrasound images. Data in brief, 28:104863, 2020. 13

  3. [3]

    Test-time adaptation with salip: A cascade of sam and clip for zero-shot medical image segmentation

    Sidra Aleem, Fangyijie Wang, Mayug Maniparambil, Eric Arazo, Julia Dietlmeier, Kathleen Curran, Noel EO’ Connor, and Suzanne Little. Test-time adaptation with salip: A cascade of sam and clip for zero-shot medical image segmentation. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5184–5193, 2024. 7, 12

  4. [4]

    Recurrent Residual Convolutional Neural Network based on U-Net (R2U-Net) for Medical Image Segmentation

    Md Zahangir Alom, Mahmudul Hasan, Chris Yakopcic, Tarek M Taha, and Vijayan K Asari. Recurrent residual convolutional neural network based on u-net (r2u-net) for medical image segmentation.arXiv preprint arXiv:1802.06955, 2018. 7 14

  5. [5]

    Vim-unet: Vision mamba for biomedical segmentation.arXiv preprint arXiv:2404.07705, 2024

    Anwai Archit and Constantin Pape. Vim-unet: Vision mamba for biomedical segmentation.arXiv preprint arXiv:2404.07705, 2024. 7

  6. [6]

    Dae-former: Dual attention-guided efficient transformer for medical image segmentation

    Reza Azad, René Arimond, Ehsan Khodapanah Aghdam, Amirhossein Kazerouni, and Dorit Merhof. Dae-former: Dual attention-guided efficient transformer for medical image segmentation. InInternational workshop on predictive intelligence in medicine, pages 83–95. Springer, 2023. 7, 9

  7. [7]

    Qwen3-VL Technical Report

    Shuai Bai, Yuxuan Cai, Ruizhe Chen, Keqin Chen, Xionghui Chen, Zesen Cheng, Lianghao Deng, Wei Ding, Chang Gao, Chunjiang Ge, et al. Qwen3-vl technical report.arXiv preprint arXiv:2511.21631, 2025. 3, 8

  8. [8]

    Qwen2.5-VL Technical Report

    Shuai Bai, Keqin Chen, Xuejing Liu, Jialin Wang, Wenbin Ge, Sibo Song, Kai Dang, Peng Wang, Shijie Wang, Jun Tang, Humen Zhong, Yuanzhi Zhu, Mingkun Yang, Zhaohai Li, Jianqiang Wan, Pengfei Wang, Wei Ding, Zheren Fu, Yiheng Xu, Jiabo Ye, Xi Zhang, Tianbao Xie, Zesen Cheng, Hang Zhang, Zhibo Yang, Haiyang Xu, and Junyang Lin. Qwen2.5-vl technical report.ar...

  9. [9]

    Learning to exploit temporal structure for biomedical vision-language processing

    Shruthi Bannur, Stephanie Hyland, Qianchu Liu, Fernando Perez-Garcia, Maximilian Ilse, Daniel C Castro, Benedikt Boecking, Harshita Sharma, Kenza Bouzid, Anja Thieme, et al. Learning to exploit temporal structure for biomedical vision-language processing. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 15016–1502...

  10. [10]

    Endovit: pretraining vision transformers on a large collection of endoscopic images.International Journal of Computer Assisted Radiology and Surgery, 19(6):1085–1091, 2024

    Dominik Batić, Felix Holm, Ege Özsoy, Tobias Czempiel, and Nassir Navab. Endovit: pretraining vision transformers on a large collection of endoscopic images.International Journal of Computer Assisted Radiology and Surgery, 19(6):1085–1091, 2024. 8

  11. [11]

    What’s the point: Semantic segmentation with point supervision

    Amy Bearman, Olga Russakovsky, Vittorio Ferrari, and Li Fei-Fei. What’s the point: Semantic segmentation with point supervision. InEuropean conference on computer vision, pages 549–565. Springer, 2016. 4, 11

  12. [12]

    Wm-dova maps for accurate polyp highlighting in colonoscopy: Validation vs

    Jorge Bernal, F Javier Sánchez, Gloria Fernández-Esparrach, Debora Gil, Cristina Rodríguez, and Fernando Vilariño. Wm-dova maps for accurate polyp highlighting in colonoscopy: Validation vs. saliency maps from physicians.Computerized Medical Imaging and Graphics, 43:99–111, 2015. 13

  13. [13]

    Deep learning techniques for automatic MRI cardiac multi-structures segmentation and diagnosis: Is the problem solved?IEEE Transactions on Medical Imaging, 37(11):2514–2525, 2018

    Olivier Bernard, Alain Lalande, Clement Zotti, Frederick Cervenansky, Xin Yang, Pheng-Ann Heng, Irem Cetin, Karim Lekadir, Oscar Camara, Miguel Angel Gonzalez Ballester, et al. Deep learning techniques for automatic MRI cardiac multi-structures segmentation and diagnosis: Is the problem solved?IEEE Transactions on Medical Imaging, 37(11):2514–2525, 2018. 13

  14. [14]

    YOLOv4: Optimal Speed and Accuracy of Object Detection

    Alexey Bochkovskiy, Chien-Yao Wang, and Hong-Yuan Mark Liao. Yolov4: Optimal speed and accuracy of object detection.arXiv preprint arXiv:2004.10934, 2020. 13

  15. [15]

    Robustvesselsegmentation in fundus images.International Journal of Biomedical Imaging, 2013:154860, 2013

    AttilaBudai, RüdigerBock, AndreasMaier, JoachimHornegger, andGeorgMichelson. Robustvesselsegmentation in fundus images.International Journal of Biomedical Imaging, 2013:154860, 2013. 13

  16. [16]

    U-rwkv: Accurate and efficient volumetric medical image segmentation via rwkv.IEEE Transactions on Image Processing, 2026

    Hongyu Cai, Yifan Wang, Liu Wang, Jian Zhao, and Zhejun Kuang. U-rwkv: Accurate and efficient volumetric medical image segmentation via rwkv.IEEE Transactions on Image Processing, 2026. 7

  17. [17]

    Swin-unet: Unet-like pure transformer for medical image segmentation

    Hu Cao, Yueyue Wang, Joy Chen, Dongsheng Jiang, Xiaopeng Zhang, Qi Tian, and Manning Wang. Swin-unet: Unet-like pure transformer for medical image segmentation. InECCV, pages 205–218. Springer, 2022. 3, 7, 9, 10

  18. [18]

    Denseunet: densely connected unet for electron microscopy image segmentation.IET Image Processing, 14(12):2682–2689, 2020

    Yue Cao, Shigang Liu, Yali Peng, and Jun Li. Denseunet: densely connected unet for electron microscopy image segmentation.IET Image Processing, 14(12):2682–2689, 2020. 7

  19. [19]

    MONAI: An open-source framework for deep learning in healthcare

    M Jorge Cardoso, Wenqi Li, Richard Brown, et al. Monai: An open-source framework for deep learning in healthcare.arXiv preprint arXiv:2211.02701, 2022. 1, 3, 4

  20. [20]

    Emerging properties in self-supervised vision transformers

    Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, and Armand Joulin. Emerging properties in self-supervised vision transformers. InICCV, pages 9650–9660, 2021. 3, 8

  21. [21]

    Mist: A simple and scalable end-to-end 3d medical imaging segmentation framework.arXiv preprint arXiv:2407.21343, 2024

    Adrian Celaya et al. Mist: A simple and scalable end-to-end 3d medical imaging segmentation framework.arXiv preprint arXiv:2407.21343, 2024. 1, 3, 4

  22. [22]

    Esfpnet: efficient deep learning architecture for real-time lesion segmentation in autofluorescence bronchoscopic video

    Qi Chang, Danish Ahmad, Jennifer Toth, Rebecca Bascom, and William E Higgins. Esfpnet: efficient deep learning architecture for real-time lesion segmentation in autofluorescence bronchoscopic video. InMedical Imaging 2023: Biomedical Applications in Molecular, Structural, and Functional Imaging, volume 12468, page 1246803. SPIE, 2023. 7

  23. [23]

    OphMAE: Bridging Volumetric and Planar Imaging with a Foundation Model for Adaptive Ophthalmological Diagnosis

    Tienyu Chang, Zhen Chen, Renjie Liang, Jinyu Ding, Jie Xu, Sunu Mathew, Amir Reza Hajrasouliha, Andrew J Saykin, Ruogu Fang, Yu Huang, et al. Ophmae: Bridging volumetric and planar imaging with a foundation model for adaptive ophthalmological diagnosis.arXiv preprint arXiv:2605.02714, 2026. 8 15

  24. [24]

    Linknet: Exploiting encoder representations for efficient semantic segmentation

    Abhishek Chaurasia and Eugenio Culurciello. Linknet: Exploiting encoder representations for efficient semantic segmentation. In2017 IEEE visual communications and image processing (VCIP), pages 1–4. IEEE, 2017. 7

  25. [25]

    Bingzhi Chen, Yishu Liu, Zheng Zhang, Guangming Lu, and Adams Wai Kin Kong. Transattunet: Multi-level attention-guided u-net with transformer for medical image segmentation.IEEE Transactions on Emerging Topics in Computational Intelligence, 8(1):55–68, 2023. 7

  26. [26]

    Source-free domain adaptive fundus image segmentation with denoised pseudo-labeling, 2021

    Cheng Chen, Quande Liu, Yueming Jin, Qi Dou, and Pheng-Ann Heng. Source-free domain adaptive fundus image segmentation with denoised pseudo-labeling, 2021. URLhttps://arxiv.org/abs/2109.09735. 12

  27. [27]

    Knowledge distillation with the reused teacher classifier, 2022

    Defang Chen, Jian-Ping Mei, Hailin Zhang, Can Wang, Yan Feng, and Chun Chen. Knowledge distillation with the reused teacher classifier, 2022. URLhttps://arxiv.org/abs/2203.14001. 12

  28. [28]

    Aau-net: an adaptive attention u-net for breast lesions segmentation in ultrasound images.IEEE Transactions on Medical Imaging, 42(5):1289–1300,

    Gongping Chen, Lei Li, Yu Dai, Jianxun Zhang, and Moi Hoon Yap. Aau-net: an adaptive attention u-net for breast lesions segmentation in ultrasound images.IEEE Transactions on Medical Imaging, 42(5):1289–1300,

  29. [29]

    Softmatch: Addressing the quantity-quality trade-off in semi-supervised learning.arXiv preprint arXiv:2301.10921, 2023

    Hao Chen, Ran Tao, Yue Fan, Yidong Wang, Jindong Wang, Bernt Schiele, Xing Xie, Bhiksha Raj, and Marios Savvides. Softmatch: Addressing the quantity-quality trade-off in semi-supervised learning.arXiv preprint arXiv:2301.10921, 2023. 11

  30. [30]

    TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation

    Jieneng Chen, Yongyi Lu, Qihang Yu, Xiangde Luo, Ehsan Adeli, Yan Wang, Le Lu, Alan L Yuille, and Yuyin Zhou. Transunet: Transformers make strong encoders for medical image segmentation.arXiv preprint arXiv:2102.04306, 2021. 3, 7, 9, 10

  31. [31]

    Towards injecting medical visual knowledge into multimodal llms at scale

    Junying Chen, Chi Gui, Ruyi Ouyang, Anningzhe Gao, Shunian Chen, Guiming Hardy Chen, Xidong Wang, Zhenyang Cai, Ke Ji, Xiang Wan, et al. Towards injecting medical visual knowledge into multimodal llms at scale. InProceedings of the 2024 conference on empirical methods in natural language processing, pages 7346–7370,

  32. [32]

    Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, and Alan L Yuille. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs.IEEE transactions on pattern analysis and machine intelligence, 40(4):834–848, 2017. 9

  33. [33]

    Rethinking Atrous Convolution for Semantic Image Segmentation

    Liang-Chieh Chen, George Papandreou, Florian Schroff, and Hartwig Adam. Rethinking atrous convolution for semantic image segmentation.arXiv preprint arXiv:1706.05587, 2017. 9

  34. [34]

    Deliberated domain bridging for domain adaptive semantic segmentation, 2022

    Lin Chen, Zhixiang Wei, Xin Jin, Huaian Chen, Miao Zheng, Kai Chen, and Yi Jin. Deliberated domain bridging for domain adaptive semantic segmentation, 2022. URLhttps://arxiv.org/abs/2209.07695. 12

  35. [35]

    Sca-cnn: Spatial and channel-wise attention in convolutional networks for image captioning

    Long Chen, Hanwang Zhang, Jun Xiao, Liqiang Nie, Jian Shao, Wei Liu, and Tat-Seng Chua. Sca-cnn: Spatial and channel-wise attention in convolutional networks for image captioning. InProceedings of the IEEE conference on computer vision and pattern recognition, pages 5659–5667, 2017. 9

  36. [36]

    Pipa: Pixel-and patch-wise self-supervised learning for domain adaptative semantic segmentation

    Mu Chen, Zhedong Zheng, Yi Yang, and Tat-Seng Chua. Pipa: Pixel-and patch-wise self-supervised learning for domain adaptative semantic segmentation. InProceedings of the 31st ACM International Conference on Multimedia, pages 1905–1914, 2023. 12

  37. [37]

    Gridmask data augmentation

    Pengguang Chen, Shu Liu, Hengshuang Zhao, Xingquan Wang, and Jiaya Jia. Gridmask data augmentation. arXiv preprint arXiv:2001.04086, 2020. 13

  38. [38]

    Distilling knowledge via knowledge review, 2021

    Pengguang Chen, Shu Liu, Hengshuang Zhao, and Jiaya Jia. Distilling knowledge via knowledge review, 2021. URLhttps://arxiv.org/abs/2104.09044. 12

  39. [39]

    Towards a general-purpose foundation model for computational pathology.Nature medicine, 30(3):850–862, 2024

    Richard J Chen, Tong Ding, Ming Y Lu, Drew FK Williamson, Guillaume Jaume, Andrew H Song, Bowen Chen, Andrew Zhang, Daniel Shao, Muhammad Shaban, et al. Towards a general-purpose foundation model for computational pathology.Nature medicine, 30(3):850–862, 2024. 8

  40. [40]

    xlstm-unet can be an effective backbone for 2d & 3d biomedical image segmentation better than its mamba counterparts

    Tianrun Chen, Chaotao Ding, Lanyun Zhu, et al. xlstm-unet can be an effective backbone for 2d & 3d biomedical image segmentation better than its mamba counterparts. InIEEE BHI, pages 1–8. IEEE, 2024. 3, 7

  41. [41]

    Zig-rir: Zigzag rwkv-in-rwkv for efficient medical image segmentation.IEEE Transactions on Medical Imaging, 2025

    Tianxiang Chen, Xudong Zhou, Zhentao Tan, Yue Wu, Ziyang Wang, Zi Ye, Tao Gong, Qi Chu, Nenghai Yu, and Le Lu. Zig-rir: Zigzag rwkv-in-rwkv for efficient medical image segmentation.IEEE Transactions on Medical Imaging, 2025. 7

  42. [42]

    Boundary-Aware Network for Fast and High-Accuracy Portrait Segmentation

    Xi Chen, Donglian Qi, and Jianxin Shen. Boundary-aware network for fast and high-accuracy portrait segmenta- tion.arXiv preprint arXiv:1901.03814, 2019. 9 16

  43. [43]

    Semi-supervised semantic segmentation with cross pseudo supervision, 2021

    Xiaokang Chen, Yuhui Yuan, Gang Zeng, and Jingdong Wang. Semi-supervised semantic segmentation with cross pseudo supervision, 2021. URLhttps://arxiv.org/abs/2106.01226. 11

  44. [44]

    Causalclipseg: Unlocking clip’s potential in referring medical image segmentation with causal intervention

    Yaxiong Chen, Minghong Wei, Zixuan Zheng, Jingliang Hu, Yilei Shi, Shengwu Xiong, Xiao Xiang Zhu, and Lichao Mou. Causalclipseg: Unlocking clip’s potential in referring medical image segmentation with causal intervention. InInternational Conference on Medical Image Computing and Computer-Assisted Intervention, pages 77–87. Springer, 2024. 7, 12

  45. [45]

    Extracting class activation maps from non-discriminative features as well,

    Zhaozheng Chen and Qianru Sun. Extracting class activation maps from non-discriminative features as well,

  46. [46]

    URLhttps://arxiv.org/abs/2303.10334. 11

  47. [47]

    Class re- activation maps for weakly-supervised semantic segmentation, 2022

    Zhaozheng Chen, Tan Wang, Xiongwei Wu, Xian-Sheng Hua, Hanwang Zhang, and Qianru Sun. Class re- activation maps for weakly-supervised semantic segmentation, 2022. URLhttps://arxiv.org/abs/2203.00962. 11

  48. [48]

    CoRRabs/2308.16184 (2023)

    Junlong Cheng, Jin Ye, Zhongying Deng, Jianpin Chen, Tianbin Li, Haoyu Wang, Yanzhou Su, Ziyan Huang, Jilong Chen, Lei Jiang, et al. Sam-med2d.arXiv preprint arXiv:2308.16184, 2023. 7

  49. [49]

    Normkd: Normalized logits for knowledge distillation, 2023

    Zhihao Chi, Tu Zheng, Hengjia Li, Zheng Yang, Boxi Wu, Binbin Lin, and Deng Cai. Normkd: Normalized logits for knowledge distillation, 2023. URLhttps://arxiv.org/abs/2308.00520. 12

  50. [50]

    Learning phrase representations using rnn encoder–decoder for statistical machine translation

    Kyunghyun Cho, Bart Van Merriënboer, Çağlar Gulçehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. Learning phrase representations using rnn encoder–decoder for statistical machine translation. InProceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pages 1724–1734, 2014. 10

  51. [51]

    Skin Lesion Analysis Toward Melanoma Detection 2018: A Challenge Hosted by the International Skin Imaging Collaboration (ISIC)

    Noel Codella, Veronica Rotemberg, Philipp Tschandl, M Emre Celebi, Stephen Dusza, David Gutman, Brian Helba, Aadi Kalloo, Konstantinos Liopyris, Michael Marchetti, et al. Skin lesion analysis toward melanoma detection 2018: A challenge hosted by the international skin imaging collaboration (isic).arXiv preprint arXiv:1902.03368, 2019. 13

  52. [52]

    Noel CF Codella, David Gutman, M Emre Celebi, Brian Helba, Michael A Marchetti, Stephen W Dusza, Aadi Kalloo, Konstantinos Liopyris, Nabin Mishra, Harald Kittler, et al. Skin lesion analysis toward melanoma detection: A challenge at the 2017 international symposium on biomedical imaging (isbi), hosted by the international skin imaging collaboration (isic)...

  53. [53]

    MMSegmentation: Openmmlab semantic segmentation toolbox and benchmark

    MMSegmentation Contributors. MMSegmentation: Openmmlab semantic segmentation toolbox and benchmark. https://github.com/open-mmlab/mmsegmentation, 2020. 1, 3, 4

  54. [54]

    Boxsup: Exploiting bounding boxes to supervise convolutional networks for semantic segmentation

    Jifeng Dai, Kaiming He, and Jian Sun. Boxsup: Exploiting bounding boxes to supervise convolutional networks for semantic segmentation. InProceedings of the IEEE international conference on computer vision, pages 1635–1643, 2015. 11

  55. [55]

    Deformable convolutional networks

    Jifeng Dai, Haozhi Qi, Yuwen Xiong, Yi Li, Guodong Zhang, Han Hu, and Yichen Wei. Deformable convolutional networks. InProceedings of the IEEE international conference on computer vision, pages 764–773, 2017. 10

  56. [56]

    Log-vmamba: local-global vision mamba for medical image segmentation

    Trung Dinh Quoc Dang, Huy Hoang Nguyen, and Aleksei Tiulpin. Log-vmamba: local-global vision mamba for medical image segmentation. InProceedings of the Asian Conference on Computer Vision, pages 548–565, 2024. 7

  57. [57]

    Osegnet: Operational segmentation network for covid-19 detection using chest x-ray images

    Aysen Degerli, Serkan Kiranyaz, Muhammad EH Chowdhury, and Moncef Gabbouj. Osegnet: Operational segmentation network for covid-19 detection using chest x-ray images. In2022 IEEE International Conference on Image Processing (ICIP), pages 2306–2310. IEEE, 2022. 13

  58. [58]

    Improved Regularization of Convolutional Neural Networks with Cutout

    Terrance DeVries and Graham W Taylor. Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552, 2017. 13

  59. [59]

    Resunet-a: A deep learning framework for semantic segmentation of remotely sensed data.ISPRS Journal of Photogrammetry and Remote Sensing, 162: 94–114, 2020

    Foivos I Diakogiannis, François Waldner, Peter Caccetta, and Chen Wu. Resunet-a: A deep learning framework for semantic segmentation of remotely sensed data.ISPRS Journal of Photogrammetry and Remote Sensing, 162: 94–114, 2020. 7

  60. [60]

    1m parameters are enough? a lightweight cnn-based model for medical image segmentation

    Binh-Duong Dinh, Thanh-Thu Nguyen, Thi-Thao Tran, and Van-Truong Pham. 1m parameters are enough? a lightweight cnn-based model for medical image segmentation. In2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pages 1279–1284. IEEE, 2023. 7 17

  61. [61]

    Polyp-pvt: Polyp segmentation with pyramid vision transformers.arXiv preprint arXiv:2108.06932, 2021

    Bo Dong, Wenhai Wang, Deng-Ping Fan, Jinpeng Li, Huazhu Fu, and Ling Shao. Polyp-pvt: Polyp segmentation with pyramid vision transformers.arXiv preprint arXiv:2108.06932, 2021. 7, 9

  62. [62]

    An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

    Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et al. An image is worth 16x16 words: Transformers for image recognition at scale.arXiv preprint arXiv:2010.11929, 2020. 3

  63. [63]

    Are vision xlstm embedded unet more reliable in medical 3d image segmentation?arXiv preprint arXiv:2406.16993, 2024

    Pallabi Dutta, Soham Bose, Swalpa Kumar Roy, and Sushmita Mitra. Are vision xlstm embedded unet more reliable in medical 3d image segmentation?arXiv preprint arXiv:2406.16993, 2024. 7

  64. [64]

    Ultralbm-unet: Ultralight bidirectional mamba-based model for skin lesion segmentation

    Linxuan Fan, Juntao Jiang, Weixuan Liu, Zhucun Xue, Jiajun Lv, Jiangning Zhang, and Yong Liu. Ultralbm-unet: Ultralight bidirectional mamba-based model for skin lesion segmentation. arXiv preprint arXiv:2512.21584, 2025. 7

  65. [65]

    Md-rwkv-unet: Scale-aware anatomical encoding with cross-stage fusion for multi-organ segmentation

    Zhuoyi Fang. Md-rwkv-unet: Scale-aware anatomical encoding with cross-stage fusion for multi-organ segmentation. arXiv preprint arXiv:2603.27261, 2026. 7

  66. [66]

    Enhancement of blood vessels in digital fundus photographs via the application of multiscale line operators

    Damian JJ Farnell, Fraser N Hatfield, Paul Knox, Michael Reakes, Stan Spencer, David Parry, and Simon P Harding. Enhancement of blood vessels in digital fundus photographs via the application of multiscale line operators. Journal of the Franklin Institute, 345(7):748–765, 2008. 13

  67. [67]

    Scaling self-supervised learning for histopathology with masked image modeling

    Alexandre Filiot, Ridouane Ghermi, Antoine Olivier, Paul Jacob, Lucas Fidon, Axel Camara, Alice Mac Kain, Charlie Saillard, and Jean-Baptiste Schiratti. Scaling self-supervised learning for histopathology with masked image modeling. medRxiv, pages 2023–07, 2023. 8

  68. [68]

    Phikon-v2, a large and public feature extractor for biomarker prediction

    Alexandre Filiot, Paul Jacob, Alice Mac Kain, and Charlie Saillard. Phikon-v2, a large and public feature extractor for biomarker prediction. arXiv preprint arXiv:2409.09173, 2024. 8

  69. [69]

    An ensemble classification-based approach applied to retinal blood vessel segmentation

    Muhammad Moazam Fraz, Paolo Remagnino, Andreas Hoppe, Bunyarit Uyyanonvara, Alicja R Rudnicka, Christopher G Owen, and Sarah A Barman. An ensemble classification-based approach applied to retinal blood vessel segmentation. IEEE Transactions on Biomedical Engineering, 59(9):2538–2548, 2012. 13

  70. [70]

    Dual attention network for scene segmentation

    Jun Fu, Jing Liu, Haijie Tian, Yong Li, Yongjun Bao, Zhiwei Fang, and Hanqing Lu. Dual attention network for scene segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3146–3154, 2019. 9

  71. [71]

    Pannuke: an open pan-cancer histology dataset for nuclei instance segmentation and classification

    Jevgenij Gamper, Navid Alemi Koohbanani, Ksenija Benet, Ali Khuram, and Nasir Rajpoot. Pannuke: an open pan-cancer histology dataset for nuclei instance segmentation and classification. In European congress on digital pathology, pages 11–19. Springer, 2019. 13

  72. [72]

    Domain-adversarial training of neural networks

    Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, and Victor Lempitsky. Domain-adversarial training of neural networks. Journal of Machine Learning Research, 17(59):1–35, 2016. 4, 12

  73. [73]

    Simple copy-paste is a strong data augmentation method for instance segmentation

    Golnaz Ghiasi, Yin Cui, Aravind Srinivas, Rui Qian, Tsung-Yi Lin, Ekin D Cubuk, Quoc V Le, and Barret Zoph. Simple copy-paste is a strong data augmentation method for instance segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 2918–2928, 2021. 13

  74. [74]

    nnmamba: 3d biomedical image segmentation, classification and landmark detection with state space model

    Haifan Gong, Luoyao Kang, Yitao Wang, Yihan Wang, Xiang Wan, Xusheng Wu, and Haofeng Li. nnmamba: 3d biomedical image segmentation, classification and landmark detection with state space model. In 2025 IEEE 22nd International Symposium on Biomedical Imaging (ISBI), pages 1–5. IEEE, 2025. 7

  75. [75]

    Boundary-aware geometric encoding for semantic segmentation of point clouds

    Jingyu Gong, Jiachen Xu, Xin Tan, Jie Zhou, Yanyun Qu, Yuan Xie, and Lizhuang Ma. Boundary-aware geometric encoding for semantic segmentation of point clouds. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 1424–1432, 2021. 12

  76. [76]

    Hover-net: Simultaneous segmentation and classification of nuclei in multi-tissue histology images

    Simon Graham, Quoc Dang Vu, Shan E Ahmed Raza, Ayesha Azam, Yee Wah Tsang, Jin Tae Kwak, and Nasir Rajpoot. Hover-net: Simultaneous segmentation and classification of nuclei in multi-tissue histology images. Medical image analysis, 58:101563, 2019. 7

  77. [77]

    The Llama 3 Herd of Models

    Aaron Grattafiori, Abhimanyu Dubey, et al. The llama 3 herd of models, 2024. URL https://arxiv.org/abs/2407.21783. 8

  78. [78]

    Mamba: Linear-Time Sequence Modeling with Selective State Spaces

    Albert Gu and Tri Dao. Mamba: Linear-time sequence modeling with selective state spaces. arXiv preprint arXiv:2312.00752, 2023. 3, 18

  79. [79]

    Transdiffseg: Transformer-based conditional diffusion segmentation model for abdominal multi-objective

    WenWen Gu, GuoDong Zhang, RongHui Ju, SuRan Wang, YanLin Li, TingYu Liang, Wei Guo, and ZhaoXuan Gong. Transdiffseg: Transformer-based conditional diffusion segmentation model for abdominal multi-objective. Journal of Imaging Informatics in Medicine, 38(1):262–280, 2025. 9

  80. [80]

    Sa-unet: Spatial attention u-net for retinal vessel segmentation

    Changlu Guo, Márton Szemenyei, Yugen Yi, Wenle Wang, Buer Chen, and Changqi Fan. Sa-unet: Spatial attention u-net for retinal vessel segmentation. In 2020 25th international conference on pattern recognition (ICPR), pages 1236–1242. IEEE, 2021. 7
