MS-DKC: A Dataset Knowledge Card Framework for Designing and Adapting Medical Image Segmentation Models

Hamid Alinejad-Rokny; Imran Razzak; Mohammad AU Khan; Shahzaib Iqbal; Syed Saud Naqvi; Tariq M. Khan; Thantrira Porntaveetus

arxiv: 2606.06103 · v1 · pith:Y64OW6YCnew · submitted 2026-06-04 · 💻 cs.CV

Pith reviewed 2026-06-28 01:52 UTC · model grok-4.3

classification 💻 cs.CV

keywords medical image segmentation · dataset knowledge card · model design · dataset-conditioned design · DRIVE · ISIC2018 · ACDC

The pith

Dataset knowledge cards make medical segmentation design start from data requirements rather than architecture search

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that medical image segmentation design should begin by making explicit what a given dataset requires from a model, including factors like foreground occupancy, morphology, boundary ambiguity, and acquisition variation. It introduces the MS-DKC framework to record this evidence through five descriptor categories and map them directly to failure modes, design priors, and evaluation criteria. When applied to DRIVE for thin vessels, ISIC2018 for variable lesions, and ACDC for cardiac structures, the framework produces different recommendations for models, loss functions, and metrics on each dataset. The abstract reports scores for the tailored selections on DRIVE (Dice 0.8141) and ISIC2018 (Dice 0.8872), and recommendations without scores for ACDC. A sympathetic reader would care because this shifts focus from competing on benchmarks to traceable, dataset-specific choices that may better suit clinical needs.

Core claim

MS-DKC records dataset evidence through image/acquisition, morphology, supervision, context-dependence, and deployment-risk descriptors. These descriptors are mapped to failure modes, design priors, and risk-aligned criteria, making segmentation design more traceable than architecture-first comparison. Evaluation on DRIVE, ISIC2018, and ACDC shows that this produces dataset-conditioned recommendations, such as detail-preserving models and topology-aware metrics for DRIVE or class-balanced supervision for ACDC, supporting that different datasets require different priors, operating points, and evidence before a model can be judged appropriate.

What carries the argument

The MS-DKC framework, which records dataset evidence in five descriptor categories and maps them to failure modes, design priors, and risk-aligned criteria for model selection and adaptation.
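As a rough illustration of what such a record might look like in code, the sketch below stores evidence under the five descriptor categories and keeps the derived failure modes, design priors, and evaluation criteria next to it. The class name `KnowledgeCard`, its field names, and the category keys are assumptions made for illustration; this summary does not give the paper's actual card schema.

```python
from dataclasses import dataclass, field

# The five descriptor categories named in the paper (key spelling is ours).
CATEGORIES = (
    "image_acquisition",
    "morphology",
    "supervision",
    "context_dependence",
    "deployment_risk",
)


@dataclass
class KnowledgeCard:
    """Hypothetical container for one dataset's MS-DKC entry."""

    dataset: str
    # category name -> list of short evidence notes
    descriptors: dict = field(default_factory=dict)
    failure_modes: list = field(default_factory=list)
    design_priors: list = field(default_factory=list)
    evaluation_criteria: list = field(default_factory=list)

    def missing_categories(self):
        """Return the categories that have no recorded evidence yet."""
        return [c for c in CATEGORIES if not self.descriptors.get(c)]


# Example entry paraphrasing what the abstract says about DRIVE.
drive = KnowledgeCard(
    dataset="DRIVE",
    descriptors={"morphology": ["sparse, thin, branching vessels"]},
    failure_modes=[],  # not stated in this summary, so left empty
    design_priors=["detail-preserving model", "sensitivity-aware optimization"],
    evaluation_criteria=["threshold analysis", "topology-aware metrics"],
)

print(drive.missing_categories())
```

Listing the unfilled categories is one simple way to make gaps in the dataset evidence visible before any model is chosen.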

If this is right

Vessel datasets like DRIVE favor detail-preserving models, sensitivity-aware optimization, and topology-aware metrics over standard Dice-only training.
Lesion datasets like ISIC2018 benefit from validation-constrained score-function selection that avoids augmentation when it harms boundary or risk profiles.
Multi-class cardiac datasets like ACDC call for four-class softmax, class-balanced losses, and class-wise surface distance evaluation.
Model comparisons become valid only after dataset descriptors have fixed the operating point and evidence requirements.
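The ACDC recommendation (four-class softmax with class-balanced Dice/CE supervision) can be made concrete with a small sketch. The version below adds pixel-averaged cross-entropy to a soft Dice term in which every class carries equal weight. That equal weighting is only one possible reading of "class-balanced"; the exact formulation used in the paper is not given here. The function names and the toy inputs are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dice_ce_loss(logits, labels, num_classes=4, eps=1e-6):
    """Cross-entropy plus (1 - mean per-class soft Dice).

    logits: list of per-pixel logit vectors, each of length num_classes
    labels: list of integer class labels, one per pixel
    """
    probs = [softmax(z) for z in logits]

    # Cross-entropy, averaged over pixels.
    ce = -sum(math.log(p[y] + eps) for p, y in zip(probs, labels)) / len(labels)

    # Soft Dice per class; averaging over classes gives each class equal
    # weight regardless of how many pixels it covers (our assumption).
    dice_scores = []
    for k in range(num_classes):
        intersection = sum(p[k] for p, y in zip(probs, labels) if y == k)
        denominator = sum(p[k] for p in probs) + sum(1 for y in labels if y == k)
        dice_scores.append((2 * intersection + eps) / (denominator + eps))
    dice_loss = 1 - sum(dice_scores) / num_classes

    return ce + dice_loss


# Toy check: four pixels, one per class.
labels = [0, 1, 2, 3]
good_logits = [[10, 0, 0, 0], [0, 10, 0, 0], [0, 0, 10, 0], [0, 0, 0, 10]]
bad_logits = [[0, 10, 0, 0], [0, 0, 10, 0], [0, 0, 0, 10], [10, 0, 0, 0]]
loss_good = dice_ce_loss(good_logits, labels)
loss_bad = dice_ce_loss(bad_logits, labels)
print(loss_good < loss_bad)
```

A real training loop would compute the same quantities on tensors with a deep learning framework; the pure-Python form is only meant to show the arithmetic.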

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same descriptor categories could be applied to non-segmentation tasks such as detection or registration to test consistency of the mappings.
Automated dataset scanners might generate MS-DKC entries from raw images, reducing the manual effort needed before model selection.
Clinical deployment pipelines could require an MS-DKC review step before approving a segmentation model for a new site or scanner.
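To make the automated-scanner idea more tangible, here is a minimal sketch of two morphology measurements that could be computed from a binary ground-truth mask: foreground occupancy and the number of 4-connected foreground components. Both function names and the choice of these two measurements are assumptions for illustration; this summary does not say which statistics the paper actually measures.

```python
def foreground_occupancy(mask):
    """Fraction of pixels labelled foreground in a binary mask (list of rows)."""
    total = sum(len(row) for row in mask)
    foreground = sum(sum(row) for row in mask)
    return foreground / total


def count_components(mask):
    """Count 4-connected foreground components with an iterative flood fill."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    count = 0
    for r in range(height):
        for c in range(width):
            if mask[r][c] and not seen[r][c]:
                count += 1
                seen[r][c] = True
                stack = [(r, c)]
                while stack:
                    y, x = stack.pop()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        inside = 0 <= ny < height and 0 <= nx < width
                        if inside and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return count


# Toy mask with two thin, separate structures.
toy_mask = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
print(foreground_occupancy(toy_mask), count_components(toy_mask))
```

Low occupancy combined with many small components would be one signal pointing toward the sparse, thin-structure regime that the paper associates with DRIVE.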

Load-bearing premise

The five descriptor categories and their mappings to failure modes and design priors are sufficient and accurate for determining appropriate models across medical segmentation tasks.

What would settle it

A controlled test on a held-out medical segmentation dataset in which every model selected via MS-DKC mappings underperforms a model chosen by standard architecture search or random selection on the risk-aligned metrics.

Figures

Figures reproduced from arXiv: 2606.06103 by Hamid Alinejad-Rokny, Imran Razzak, Mohammad AU Khan, Shahzaib Iqbal, Syed Saud Naqvi, Tariq M. Khan, Thantrira Porntaveetus.

**Figure 1.** MS-DKC dataset-conditioned segmentation design workflow. Measured dataset structure is translated into descriptor profiles, anticipated risks, ranked [...]

**Figure 2.** U-Net as a conditional design prior in the MS-DKC framework. The dataset profile determines whether a standard U-Net prior should be accepted, [...]

**Figure 3.** Dataset-conditioned reasoning for capacity, pretraining, and transfer. Model scale and transfer strategy are selected according to measured dataset demands [...]

**Figure 4.** Risk-aligned evaluation beyond Dice. Dominant dataset risks are mapped to evaluation emphases that better reflect structural, statistical, and deployment [...]

Abstract

Medical image segmentation is often framed as a search for stronger architectures, but this can obscure a more fundamental question: what does the dataset require from the model? In medical imaging, this requirement is shaped by foreground occupancy, morphology, boundary ambiguity, topology sensitivity, annotation quality, acquisition variation, and operating point. This paper introduces the Medical Segmentation Dataset Knowledge Card (MS-DKC), a framework for making these factors explicit. MS-DKC records dataset evidence through image/acquisition, morphology, supervision, context-dependence, and deployment-risk descriptors. These descriptors are mapped to failure modes, design priors, and risk-aligned criteria, making segmentation design more traceable than architecture-first comparison. We evaluate MS-DKC on DRIVE, ISIC2018, and ACDC, representing distinct regimes. DRIVE contains sparse, thin, branching vessels, favoring detail-preserving models, sensitivity-aware optimization, threshold analysis, and topology-aware metrics. DKC-TNet-v2 achieved Dice 0.8044 and IoU 0.6730 with 35103 parameters, while SA-UNetv2-DKC-AmbRef reached Dice 0.8141, IoU 0.6865, sensitivity 0.8265, specificity 0.9804, and AUC 0.9853. ISIC2018 involves compact but appearance-variable lesions; validation-constrained score-function selection on Att-Next-Topo/ATTNext produced MS-DKC-AttNextTopo-VCSF-NoAug with Dice 0.8872, IoU 0.8214, precision 0.9173, Boundary F1 0.4878, and ASSD 4.13, while plausible additions failed to improve the risk-aligned profile. ACDC provides a multi-class cardiac case, where MS-DKC recommends four-class softmax segmentation, class-balanced Dice/CE supervision, and class-wise surface evaluation. Overall, the results support dataset-conditioned design: different datasets require different priors, operating points, and evidence before a model can be judged appropriate.
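For readers who want to relate the reported numbers to one another, the sketch below computes Dice, IoU, sensitivity, and specificity from the four confusion counts of a binary segmentation, and uses the standard identity Dice = 2·IoU / (1 + IoU), which holds exactly for a single confusion matrix. The function names and toy counts are illustrative. How the paper aggregates metrics across images is not stated in the abstract, so averaged values need not satisfy the identity exactly.

```python
def overlap_metrics(tp, fp, fn, tn):
    """Binary segmentation metrics from pixel-level confusion counts."""
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "iou": tp / (tp + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def dice_from_iou(iou):
    """Dice implied by an IoU value computed on the same confusion matrix."""
    return 2 * iou / (1 + iou)


# Toy counts (not taken from the paper).
m = overlap_metrics(tp=8, fp=2, fn=2, tn=88)
print(m)

# Dice implied by the reported DRIVE IoU of 0.6865.
print(round(dice_from_iou(0.6865), 4))
```

Applied to the reported DRIVE IoU of 0.6865, the identity gives approximately 0.8141, which matches the reported Dice. For ISIC2018 the reported Dice (0.8872) is lower than the value implied by the reported IoU of 0.8214 (about 0.902). One possible explanation is per-image averaging, but that is a guess.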

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

MS-DKC gives a structured way to record dataset properties for segmentation design, but the three case studies do not test whether the mappings actually change outcomes compared to standard choices.

The paper's main contribution is the MS-DKC framework itself. It defines five descriptor categories—image/acquisition, morphology, supervision, context-dependence, and deployment-risk—and maps them to failure modes, design priors, and evaluation criteria. This is not described in the cited prior work, so the named structure and explicit mapping process count as new.

It applies the framework to DRIVE (thin vessels), ISIC2018 (variable lesions), and ACDC (multi-class cardiac). For each it selects models, losses, and metrics that match the descriptors and reports Dice, IoU, boundary, and sensitivity numbers. The approach makes the reasoning from dataset to model choice more traceable than pure architecture search.

The experiments stop short of the needed controls. The abstract shows results after applying the mappings but does not report what happens with a fixed architecture, default loss, or generic threshold on the same datasets. Without those runs it is hard to tell whether the descriptor-to-prior steps are necessary or whether any reasonable model would produce similar profiles. No error bars or ablation details appear in the summary either.
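One way to picture the missing controls is as a small experiment grid in which architecture, loss, and operating point are each either left at a generic default or set from the MS-DKC mappings. The sketch below enumerates such a grid. The factor names, the level labels, and the full-factorial layout are assumptions for illustration, not a design proposed in the paper.

```python
from itertools import product

DATASETS = ("DRIVE", "ISIC2018", "ACDC")

# Each factor is either a generic default or the MS-DKC-derived choice.
FACTORS = {
    "architecture": ("fixed", "ms-dkc"),
    "loss": ("default", "ms-dkc"),
    "operating_point": ("generic", "ms-dkc"),
}


def control_grid():
    """Enumerate every dataset x factor-level combination as a run config."""
    names = list(FACTORS)
    runs = []
    for dataset in DATASETS:
        for levels in product(*(FACTORS[name] for name in names)):
            run = {"dataset": dataset}
            run.update(zip(names, levels))
            runs.append(run)
    return runs


runs = control_grid()
print(len(runs))
print(runs[0])
```

Comparing the all-default run with the all-MS-DKC run on each dataset would address the basic question; the intermediate runs would show which individual mapping, if any, accounts for a difference.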

This is useful reading for researchers who already work on medical segmentation and want a checklist for dataset factors. It is less useful for readers looking for new architectures or clinical validation. The central idea is coherent and the authors engage with real dataset differences, so the paper deserves a serious referee even though the current evidence is preliminary.

Referee Report

2 major / 1 minor

Summary. The paper introduces the Medical Segmentation Dataset Knowledge Card (MS-DKC) framework, which uses five descriptor categories (image/acquisition, morphology, supervision, context-dependence, deployment-risk) to explicitly record dataset evidence and map it to failure modes, design priors, and risk-aligned evaluation criteria. It applies the framework to three datasets representing distinct regimes: DRIVE (sparse thin vessels), ISIC2018 (compact variable lesions), and ACDC (multi-class cardiac structures). It reports tailored model results such as DKC-TNet-v2 achieving Dice 0.8044 on DRIVE and MS-DKC-AttNextTopo-VCSF-NoAug (built on Att-Next-Topo) achieving Dice 0.8872 on ISIC2018, and concludes that these support dataset-conditioned design over architecture-first approaches.

Significance. If the descriptor-to-prior mappings hold under controlled testing, the framework could shift medical segmentation research toward more traceable, dataset-aware design choices that reduce inappropriate model selection. The manuscript earns credit for grounding claims in public datasets (DRIVE, ISIC2018, ACDC) and standard metrics while providing concrete examples of how descriptors translate into choices like sensitivity-aware optimization or class-balanced losses. However, the significance remains provisional without evidence that the mappings are necessary rather than incidental.

major comments (2)

[Evaluation sections on DRIVE, ISIC2018, and ACDC] The reported metric profiles (e.g., Dice 0.8044 / IoU 0.6730 for DKC-TNet-v2 on DRIVE; Dice 0.8872 / Boundary F1 0.4878 for MS-DKC-AttNextTopo-VCSF-NoAug on ISIC2018; four-class softmax on ACDC) are produced after applying the five-descriptor mappings, yet no baseline runs with a fixed architecture, default loss, or generic operating point are presented on the same datasets. Without these controls, it is not possible to determine whether the observed outcomes require the MS-DKC mappings or would arise from any reasonable model choice, directly undermining the central claim that the results support dataset-conditioned design.
[Framework description and abstract] The claim that the five descriptor categories are sufficient to determine appropriate models across medical segmentation tasks rests on the mappings to failure modes and priors, but no ablation, sensitivity analysis, or comparison against alternative categorizations is provided to test whether these categories capture the necessary factors or whether different groupings would produce equivalent design recommendations.

minor comments (1)

[Abstract] The statement that 'plausible additions failed to improve the risk-aligned profile' on ISIC2018 lacks detail on what the additions were or how failure was quantified, reducing reproducibility of that specific claim.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments that identify opportunities to strengthen the empirical support for the MS-DKC framework. We address each major comment below.

Referee: Evaluation sections on DRIVE, ISIC2018, and ACDC: the reported metric profiles (e.g., Dice 0.8044 / IoU 0.6730 for DKC-TNet-v2 on DRIVE; Dice 0.8872 / Boundary F1 0.4878 for MS-DKC-AttNextTopo-VCSF-NoAug on ISIC2018; four-class softmax on ACDC) are produced after applying the five-descriptor mappings, yet no baseline runs with a fixed architecture, default loss, or generic operating point are presented on the same datasets. Without these controls, it is not possible to determine whether the observed outcomes require the MS-DKC mappings or would arise from any reasonable model choice, directly undermining the central claim that the results support dataset-conditioned design.

Authors: We agree that the lack of controlled baseline comparisons limits the ability to attribute performance gains specifically to the MS-DKC mappings. In the revised manuscript we will add experiments on all three datasets using fixed standard architectures (e.g., U-Net) with default losses and operating points to provide the necessary controls and clarify whether the reported results depend on the dataset-specific priors. (Revision: yes)
Referee: Framework description and abstract: the claim that the five descriptor categories are sufficient to determine appropriate models across medical segmentation tasks rests on the mappings to failure modes and priors, but no ablation, sensitivity analysis, or comparison against alternative categorizations is provided to test whether these categories capture the necessary factors or whether different groupings would produce equivalent design recommendations.

Authors: The five categories were chosen to reflect recurring challenges documented in the medical segmentation literature. The manuscript presents the framework as an initial structured approach rather than asserting that the categories are provably minimal or optimal. We will add a dedicated discussion section in the revision that justifies the chosen categories, notes potential alternative groupings, and illustrates how different categorizations could alter design recommendations. (Revision: partial)

Circularity Check

0 steps flagged

No circularity: framework is descriptive and evaluations are empirical applications on public data.

The paper introduces MS-DKC as a set of five descriptor categories (image/acquisition, morphology, supervision, context-dependence, deployment-risk) that are mapped to failure modes and design priors. These mappings are presented as explicit records rather than derived quantities. The case studies on DRIVE, ISIC2018, and ACDC apply the descriptors to select models, losses, and metrics, but report standard public metrics without any equations, fitted parameters, or self-citations that reduce the central claim to its own inputs by construction. The derivation chain consists of dataset description followed by application; no step equates a reported outcome to a quantity defined inside the paper itself.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The framework rests on the domain assumption that the listed descriptors capture the factors that determine model suitability; no free parameters or invented physical entities are introduced.

axioms (1)

Domain assumption: The five descriptor categories capture the key factors that shape what a dataset requires from a segmentation model.
This assumption underpins the mapping from dataset evidence to failure modes and design priors.


[1] [1]

Ronneberger, P

O. Ronneberger, P. Fischer, T. Brox, U-Net: Convolu- tional networks for biomedical image segmentation, in: Medical Image Computing and Computer-Assisted Inter- vention – MICCAI 2015, volume 9351 ofLecture Notes in Computer Science, Springer, 2015, pp. 234–241. doi:10. 1007/978-3-319-24574-4_28

2015

[2] [2]

Milletari, N

F. Milletari, N. Navab, S.-A. Ahmadi, V-Net: Fully con- volutional neural networks for volumetric medical im- age segmentation, in: 2016 Fourth International Con- ference on 3D Vision (3DV), IEEE, 2016, pp. 565–571. doi:10.1109/3DV.2016.79

work page doi:10.1109/3dv.2016.79 2016

[3] [3]

Çiçek, A

Ö. Çiçek, A. Abdulkadir, S. S. Lienkamp, T. Brox, O. Ronneberger, 3D U-Net: Learning dense volumet- ric segmentation from sparse annotation, in: Medi- cal Image Computing and Computer-Assisted Interven- tion – MICCAI 2016, volume 9901 ofLecture Notes in Computer Science, Springer, 2016, pp. 424–432. doi:10. 1007/978-3-319-46723-8_49

2016

[4] [4]

Oktay, J

O. Oktay, J. Schlemper, L. Le Folgoc, M. Lee, M. Hein- rich, K. Misawa, K. Mori, S. McDonagh, N. Y . Hammerla, B. Kainz, B. Glocker, D. Rueckert, Attention U-Net: Learning where to look for the pancreas, arXiv preprint arXiv:1804.03999 (2018).arXiv:1804.03999

Pith/arXiv arXiv 2018

[5] [5]

Z. Zhou, M. M. R. Siddiquee, N. Tajbakhsh, J. Liang, UNet++: Redesigning skip connections to exploit multi- scale features in image segmentation, IEEE Transactions on Medical Imaging 39 (2020) 1856–1867. doi:10.1109/ TMI.2019.2959609

arXiv 2020

[6] [6]

Isensee, P

F. Isensee, P. F. Jäger, S. A. A. Kohl, J. Petersen, K. H. Maier-Hein, nnU-Net: A self-configuring method for deep learning-based biomedical image segmenta- tion, Nature Methods 18 (2021) 203–211. doi:10.1038/ s41592-020-01008-z

2021

[7] [7]

MM-ViT: Multi-Modal Video Transformer for Compressed Video Action Recognition

A. Hatamizadeh, Y . Tang, V . Nath, D. Yang, A. Myro- nenko, B. Landman, H. R. Roth, D. Xu, UNETR: Trans- formers for 3d medical image segmentation, in: Proceed- ings of the IEEE/CVF Winter Conference on Applica- tions of Computer Vision (W ACV), 2022, pp. 1748–1758. doi:10.1109/WACV51458.2022.00181

work page doi:10.1109/wacv51458.2022.00181 2022

[8] [8]

H. Cao, Y . Wang, J. Chen, D. Jiang, X. Zhang, Q. Tian, M. Wang, Swin-Unet: Unet-like pure transformer for medical image segmentation, in: Computer Vision – ECCV 2022 Workshops, volume 13803 ofLecture Notes in Computer Science, Springer, 2023, pp. 205–218. doi:10.1007/978-3-031-25066-8_9. 39

work page doi:10.1007/978-3-031-25066-8_9 2022

[9] [9]

H.-Y . Zhou, J. Guo, Y . Zhang, L. Yu, L. Wang, Y . Yu, nnFormer: Interleaved transformer for volumetric seg- mentation, arXiv preprint arXiv:2109.03201 (2021). arXiv:2109.03201

[10]

W. Wang, C. Chen, M. Ding, H. Yu, S. Zha, J. Li, TransBTS: Multimodal brain tumor segmentation using transformer, in: Medical Image Computing and Computer-Assisted Intervention – MICCAI 2021, volume 12901 of Lecture Notes in Computer Science, Springer, 2021, pp. 109–119. doi:10.1007/978-3-030-87193-2_11

[11]

Y. Zhang, H. Liu, Q. Hu, TransFuse: Fusing transformers and CNNs for medical image segmentation, in: Medical Image Computing and Computer-Assisted Intervention – MICCAI 2021, volume 12901 of Lecture Notes in Computer Science, Springer, 2021, pp. 14–24. doi:10.1007/978-3-030-87193-2_2

[12]

J. Chen, J. Mei, X. Li, Y. Lu, Q. Yu, Q. Wei, X. Luo, Y. Xie, E. Adeli, Y. Wang, M. P. Lungren, S. Zhang, L. Xing, L. Lu, A. Yuille, Y. Zhou, TransUNet: Rethinking the U-Net architecture design for medical image segmentation through the lens of transformers, Medical Image Analysis 97 (2024) 103280. doi:10.1016/j.media.2024.103280

[13]

Z. Xing, T. Ye, Y. Yang, G. Liu, L. Zhu, SegMamba: Long-range sequential modeling Mamba for 3d medical image segmentation, in: Medical Image Computing and Computer-Assisted Intervention – MICCAI 2024, Springer, 2024

[14]

T. D. Q. Dang, H. H. Nguyen, A. Tiulpin, LoG-VMamba: Local-global vision Mamba for medical image segmentation, in: Proceedings of the Asian Conference on Computer Vision (ACCV), 2024, pp. 548–565

[15]

J. Ma, Y. He, F. Li, L. Han, C. You, B. Wang, Segment anything in medical images, Nature Communications 15 (2024) 654. doi:10.1038/s41467-024-44824-z

[16]

J. Zhu, A. Hamdi, Y. Qi, Y. Jin, J. Wu, Medical SAM 2: Segment medical images as video via segment anything model 2, arXiv preprint arXiv:2408.00874 (2024). doi:10.48550/arXiv.2408.00874. arXiv:2408.00874

[17]

T. M. Khan, S. S. Naqvi, E. Meijering, Leveraging image complexity in macro-level neural network design for medical image segmentation, Scientific Reports 12 (2022) 22286

[18]

T. M. Khan, M. Arsalan, A. Robles-Kelly, E. Meijering, Mkis-net: a light-weight multi-kernel network for medical image segmentation, in: International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2022, pp. 1–8. doi:10.1109/DICTA56598.2022.10034573

[19]

S. Iqbal, A. N. Qureshi, M. Alhussein, I. A. Choudhry, K. Aurangzeb, T. M. Khan, Fusion of textural and visual information for medical image modality retrieval using deep learning-based feature engineering, IEEE Access 11 (2023) 93238–93253

[20]

S. Iqbal, T. M. Khan, S. S. Naqvi, A. Naveed, M. Usman, H. A. Khan, I. Razzak, Ldmres-net: A lightweight neural network for efficient medical image segmentation on iot and edge devices, IEEE Journal of Biomedical and Health Informatics (2023)

[21]

A. Qayyum, I. Razzak, M. Mazher, T. Khan, W. Ding, S. Niederer, Two-stage self-supervised contrastive learning aided transformer for real-time medical image segmentation, IEEE Journal of Biomedical and Health Informatics (2023)

[22]

S. Javed, T. M. Khan, A. Qayyum, H. Alinejad-Rokny, A. Sowmya, I. Razzak, Advancing medical image segmentation with mini-net: A lightweight solution tailored for efficient segmentation of medical images, arXiv preprint arXiv:2405.17520 (2024)

[23]

S. Iqbal, T. M. Khan, S. S. Naqvi, A. Naveed, E. Meijering, Tbconvl-net: A hybrid deep learning architecture for robust medical image segmentation, Pattern Recognition 158 (2025) 111028

[24]

T. M. Khan, S. S. Naqvi, E. Meijering, Esdmr-net: A lightweight network with expand-squeeze and dual multiscale residual connections for medical image segmentation, Engineering Applications of Artificial Intelligence 133 (2024) 107995

[25]

Y. Xu, T. M. Khan, Y. Song, E. Meijering, Edge deep learning in computer vision and medical diagnostics: a comprehensive survey, Artificial Intelligence Review 58 (2025) 93

[26]

M. Safdar, S. Iqbal, M. Mehmood, M. Ghafoor, T. M. Khan, I. Razzak, Focal modulation and bidirectional feature fusion network for medical image segmentation, arXiv preprint arXiv:2510.20933 (2025)

[27]

T. M. Khan, Q. E. U. Haq, S. Iqbal, T. A. Soomro, Edge-based artificial intelligence: Understanding the evolution of hardware and software and future trends, Engineering Applications of Artificial Intelligence 174 (2026) 114526

[28]

J. Ma, F. Li, S. Kim, R. Asakereh, B.-H. Le, D.-K. Nguyen-Vu, A. Pfefferle, M. Wei, R. Gao, D. Lyu, S. Yang, L. Purucker, Z. Marinov, M. Staring, H. Lu, T. T. Dao, X. Ye, Z. Li, G. Brugnara, P. Vollmuth, M. Foltyn-Dumitru, J. Cho, M. A. Mahmutoglu, M. Bendszus, I. Pflüger, A. Rastogi, D. Ni, X. Yang, G.-Q. Zhou, K. Wang, N. Heller, N. Papanikolopoulos,...

work page doi:10.48550/arxiv.2412 2024

[29]

H. Peiris, M. Hayat, Z. Chen, G. Egan, M. Harandi, VT-UNet: A robust volumetric transformer for accurate 3d tumor segmentation 13435 (2022) 162–172. doi:10.1007/978-3-031-16443-9_16

[30]

F. Isensee, P. F. Jäger, S. A. A. Kohl, J. Petersen, K. H. Maier-Hein, No new-net, arXiv preprint arXiv:1809.10486 (2018). arXiv:1809.10486

[31]

E. Gibson, W. Li, C. Sudre, L. Fidon, D. I. Shakir, G. Wang, Z. Eaton-Rosen, R. Gray, T. Doel, Y. Hu, T. Whyntie, T. Vercauteren, M. J. Cardoso, M. Modat, D. C. Barratt, S. Ourselin, NiftyNet: A deep-learning platform for medical imaging, Computer Methods and Programs in Biomedicine 158 (2018) 113–122. doi:10.1016/j.cmpb.2018.01.025

[32]

H. Huang, L. Lin, R. Tong, H. Hu, Q. Zhang, Y. Iwamoto, X. Han, Y.-W. Chen, J. Wu, UNet 3+: A full-scale connected UNet for medical image segmentation, in: 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 2020, pp. 1055–1059. doi:10.1109/ICASSP40776.2020.9053405

[33]

J. M. J. Valanarasu, V. A. Sindagi, I. Hacihaliloglu, V. M. Patel, KiU-Net: Towards accurate segmentation of biomedical images using over-complete representations, in: Medical Image Computing and Computer-Assisted Intervention – MICCAI 2020, volume 12264 of Lecture Notes in Computer Science, Springer, 2020, pp. 363–373. doi:10.1007/978-3-030-59719-1_36

[34]

H. Kervadec, J. Bouchtiba, C. Desrosiers, E. Granger, J. Dolz, I. Ben Ayed, Boundary loss for highly unbalanced segmentation, Medical Image Analysis 67 (2021) 101851. doi:10.1016/j.media.2020.101851

[35]

S. Shit, J. C. Paetzold, A. Sekuboyina, I. Ezhov, A. Unger, A. Zhylka, J. P. W. Pluim, U. Bauer, B. H. Menze, clDice: A novel topology-preserving loss function for tubular structure segmentation, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 16560–16569. arXiv:2003.07311

[36]

A. Myronenko, 3D MRI brain tumor segmentation using autoencoder regularization, in: Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries, volume 11384 of Lecture Notes in Computer Science, Springer, 2019, pp. 311–320. doi:10.1007/978-3-030-11726-9_28

[37]

A. Hatamizadeh, V. Nath, Y. Tang, D. Yang, H. R. Roth, D. Xu, Swin UNETR: Swin transformers for semantic segmentation of brain tumors in MRI images, arXiv preprint arXiv:2201.01266 (2022). arXiv:2201.01266

[38]

J. Ma, F. Li, B. Wang, U-Mamba: Enhancing long-range dependency for biomedical image segmentation, arXiv preprint arXiv:2401.04722 (2024). arXiv:2401.04722

[39]

J. Wasserthal, H.-C. Breit, M. T. Meyer, M. Pradella, D. Hinck, A. W. Sauter, T. Heye, D. Boll, J. Cyriac, S. Yang, M. Bach, M. Segeroth, TotalSegmentator: Robust segmentation of 104 anatomic structures in CT images, Radiology: Artificial Intelligence 5 (2023) e230024. doi:10.1148/ryai.230024

[40]

J. Chen, Y. Lu, Q. Yu, X. Luo, E. Adeli, Y. Wang, L. Lu, A. L. Yuille, Y. Zhou, TransUNet: Transformers make strong encoders for medical image segmentation, arXiv preprint arXiv:2102.04306 (2021). arXiv:2102.04306

[41]

Z. Deng, X. Huang, D. Li, X. Yuan, MISSFormer: An effective medical image segmentation transformer, arXiv preprint arXiv:2109.07162 (2021). arXiv:2109.07162

[42]

J. Cheng, J. Ye, Z. Deng, J. Chen, T. Li, H. Wang, Y. Su, Z. Huang, J. Chen, L. Jiang, H. Sun, J. He, S. Zhang, M. Zhu, Y. Qiao, SAM-Med2D, arXiv preprint arXiv:2308.16184 (2023). arXiv:2308.16184

[43]

J. Wu, Z. Wang, M. Hong, W. Ji, H. Fu, Y. Xu, M. Xu, Y. Jin, Medical SAM adapter: Adapting segment anything model for medical image segmentation, Medical Image Analysis (2025) 103547. doi:10.1016/j.media.2025.103547, also available as arXiv:2304.12620

[44]

J. Ma, Z. Yang, S. Kim, B. Chen, M. Baharoon, A. Fallahpour, R. Asakereh, H. Lyu, B. Wang, MedSAM2: Segment anything in 3d medical images and videos, arXiv preprint arXiv:2504.03600 (2025). doi:10.48550/arXiv.2504.03600. arXiv:2504.03600

[45]

GLAD-Net, GLAD-Net: Global-local adaptive fusion and cross-stage distillation for cross-level multi-scale medical image segmentation, n.d. Bibliographic details to be verified; citation supplied by the experiment implementation record

[46]

C. Katar, O. B. Eryilmaz, E. M. Eksioglu, Att-Next for skin lesion segmentation with topological awareness, Expert Systems with Applications 282 (2025) 127637. doi:10.1016/j.eswa.2025.127637
