MS-DKC: A Dataset Knowledge Card Framework for Designing and Adapting Medical Image Segmentation Models
Pith reviewed 2026-06-28 01:52 UTC · model grok-4.3
The pith
Dataset knowledge cards make medical segmentation design start from data requirements rather than architecture search
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
MS-DKC records dataset evidence through image/acquisition, morphology, supervision, context-dependence, and deployment-risk descriptors. These descriptors are mapped to failure modes, design priors, and risk-aligned criteria, making segmentation design more traceable than architecture-first comparison. Applied to DRIVE, ISIC2018, and ACDC, the framework yields dataset-conditioned recommendations, such as detail-preserving models and topology-aware metrics for DRIVE or class-balanced supervision for ACDC. The paper takes this as support for the view that different datasets require different priors, operating points, and evidence before a model can be judged appropriate.
What carries the argument
The MS-DKC framework, which records dataset evidence in five descriptor categories and maps them to failure modes, design priors, and risk-aligned criteria for model selection and adaptation.
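One way to picture this record-then-map structure is as a small data structure plus a rule table. The sketch below is illustrative only: the class `DatasetKnowledgeCard`, the descriptor keys and values, the `RULES` table, and the failure-mode wording are hypothetical and are not taken from the paper. The design priors and criteria in the rules paraphrase the recommendations the abstract reports for DRIVE and ACDC.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetKnowledgeCard:
    """Hypothetical record of dataset evidence, one dict per descriptor category."""
    name: str
    image_acquisition: dict = field(default_factory=dict)
    morphology: dict = field(default_factory=dict)
    supervision: dict = field(default_factory=dict)
    context_dependence: dict = field(default_factory=dict)
    deployment_risk: dict = field(default_factory=dict)


# Each rule maps (category, key, value) to a failure mode, design priors, and criteria.
# Keys, values, and failure-mode text are assumptions made for illustration.
RULES = {
    ("morphology", "structure", "thin_branching"): {
        "failure_mode": "thin branches missed or disconnected",
        "design_priors": ["detail-preserving model", "sensitivity-aware optimization"],
        "criteria": ["threshold analysis", "topology-aware metrics"],
    },
    ("supervision", "label_type", "multi_class"): {
        "failure_mode": "small classes dominated by large ones",
        "design_priors": ["softmax over all classes", "class-balanced Dice/CE loss"],
        "criteria": ["class-wise surface distance"],
    },
}


def recommend(card):
    """Collect every rule whose (category, key, value) matches the card."""
    matches = []
    for (category, key, value), rule in RULES.items():
        if getattr(card, category).get(key) == value:
            matches.append(rule)
    return matches


drive = DatasetKnowledgeCard(name="DRIVE", morphology={"structure": "thin_branching"})
for rule in recommend(drive):
    print(rule["failure_mode"], "->", rule["design_priors"], rule["criteria"])
```

Keeping the rules in a table rather than in code branches is what would make a recommendation traceable: each design choice can be followed back to the descriptor that triggered it.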
If this is right
- Vessel datasets like DRIVE favor detail-preserving models, sensitivity-aware optimization, and topology-aware metrics over standard Dice-only training.
- Lesion datasets like ISIC2018 benefit from validation-constrained score-function selection that avoids augmentation when it harms boundary or risk profiles.
- Multi-class cardiac datasets like ACDC call for four-class softmax, class-balanced losses, and class-wise surface distance evaluation.
- Model comparisons become valid only after dataset descriptors have fixed the operating point and evidence requirements.
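The ACDC item above (four-class softmax with class-balanced losses) can be made concrete with a small NumPy sketch. This is not the paper's loss: the macro-averaged Dice term, the inverse-frequency weighting of the cross-entropy term, and the function name `dice_ce_loss` are assumptions chosen to illustrate what "class-balanced" could mean for a single 2D slice.

```python
import numpy as np


def softmax(logits, axis=0):
    """Numerically stable softmax along the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def dice_ce_loss(logits, labels, num_classes=4, eps=1e-6):
    """Macro Dice loss plus inverse-frequency weighted cross-entropy.

    logits: float array of shape (C, H, W); labels: int array of shape (H, W).
    """
    probs = softmax(logits, axis=0)
    onehot = np.eye(num_classes)[labels].transpose(2, 0, 1)  # (C, H, W)

    # Dice per class, then an unweighted mean so every class counts equally.
    inter = (probs * onehot).sum(axis=(1, 2))
    denom = probs.sum(axis=(1, 2)) + onehot.sum(axis=(1, 2))
    dice_loss = 1.0 - ((2.0 * inter + eps) / (denom + eps)).mean()

    # Cross-entropy with per-class weights proportional to 1 / class frequency.
    # Classes absent from this slice get weight 0.
    freq = onehot.mean(axis=(1, 2))
    weights = np.where(freq > 0, 1.0 / np.maximum(freq, eps), 0.0)
    weights = weights / weights.sum()
    pixel_w = weights[labels]
    log_p_true = (onehot * np.log(np.clip(probs, eps, 1.0))).sum(axis=0)
    ce_loss = -(pixel_w * log_p_true).sum() / pixel_w.sum()

    return float(dice_loss + ce_loss)


labels = np.array([[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 3, 3], [2, 2, 3, 3]])
confident = 20.0 * np.eye(4)[labels].transpose(2, 0, 1)
print(dice_ce_loss(confident, labels), dice_ce_loss(np.zeros((4, 4, 4)), labels))
```

Both terms are normalized so that a rare class contributes as much as a common one, which is the property the recommendation is after. Note that in this sketch a class absent from the slice still pulls the macro Dice term down, so a production loss would need a policy for empty classes.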
Where Pith is reading between the lines
- The same descriptor categories could be applied to non-segmentation tasks such as detection or registration to test consistency of the mappings.
- Automated dataset scanners might generate MS-DKC entries from raw images, reducing the manual effort needed before model selection.
- Clinical deployment pipelines could require an MS-DKC review step before approving a segmentation model for a new site or scanner.
Load-bearing premise
The five descriptor categories and their mappings to failure modes and design priors are sufficient and accurate for determining appropriate models across medical segmentation tasks.
What would settle it
A controlled test on a held-out medical segmentation dataset in which every model selected via MS-DKC mappings underperforms a model chosen by standard architecture search or random selection on the risk-aligned metrics.
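The decision rule in that test can be written down explicitly. The sketch below encodes one assumed reading of "underperforms": a model is strictly worse than the baseline on every risk-aligned metric, with the direction of each metric taken into account. The metric names come from the ISIC2018 profile in the abstract; the function names and the strict-dominance reading are hypothetical.

```python
# Direction of each metric: True if higher is better, False if lower is better.
HIGHER_IS_BETTER = {"dice": True, "boundary_f1": True, "assd": False}


def worse_on_all(candidate, reference, directions=HIGHER_IS_BETTER):
    """True if candidate is strictly worse than reference on every metric."""
    for metric, higher in directions.items():
        c, r = candidate[metric], reference[metric]
        not_worse = (c >= r) if higher else (c <= r)
        if not_worse:
            return False
    return True


def framework_refuted(dkc_models, baseline):
    """Assumed reading of the test: every DKC-selected model loses on every metric."""
    return bool(dkc_models) and all(worse_on_all(m, baseline) for m in dkc_models)
```

A weaker reading, such as losing on a majority of metrics or on a pre-registered primary metric, would change only `worse_on_all`. Whichever reading is used should be fixed before the held-out dataset is examined.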
Original abstract
Medical image segmentation is often framed as a search for stronger architectures, but this can obscure a more fundamental question: what does the dataset require from the model? In medical imaging, this requirement is shaped by foreground occupancy, morphology, boundary ambiguity, topology sensitivity, annotation quality, acquisition variation, and operating point. This paper introduces the Medical Segmentation Dataset Knowledge Card (MS-DKC), a framework for making these factors explicit. MS-DKC records dataset evidence through image/acquisition, morphology, supervision, context-dependence, and deployment-risk descriptors. These descriptors are mapped to failure modes, design priors, and risk-aligned criteria, making segmentation design more traceable than architecture-first comparison. We evaluate MS-DKC on DRIVE, ISIC2018, and ACDC, representing distinct regimes. DRIVE contains sparse, thin, branching vessels, favoring detail-preserving models, sensitivity-aware optimization, threshold analysis, and topology-aware metrics. DKC-TNet-v2 achieved Dice 0.8044 and IoU 0.6730 with 35103 parameters, while SA-UNetv2-DKC-AmbRef reached Dice 0.8141, IoU 0.6865, sensitivity 0.8265, specificity 0.9804, and AUC 0.9853. ISIC2018 involves compact but appearance-variable lesions; validation-constrained score-function selection on Att-Next-Topo/ATTNext produced MS-DKC-AttNextTopo-VCSF-NoAug with Dice 0.8872, IoU 0.8214, precision 0.9173, Boundary F1 0.4878, and ASSD 4.13, while plausible additions failed to improve the risk-aligned profile. ACDC provides a multi-class cardiac case, where MS-DKC recommends four-class softmax segmentation, class-balanced Dice/CE supervision, and class-wise surface evaluation. Overall, the results support dataset-conditioned design: different datasets require different priors, operating points, and evidence before a model can be judged appropriate.
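A quick arithmetic check on the reported numbers: for a single binary mask (or for pooled pixel counts), Dice and IoU are tied by IoU = Dice / (2 - Dice). The sketch below applies that identity to the Dice/IoU pairs quoted in the abstract. It assumes nothing about how the paper aggregates its metrics, which the abstract does not state; the helper name `iou_from_dice` is illustrative.

```python
def iou_from_dice(dice):
    """IoU implied by a Dice score for a single mask or pooled counts."""
    if not 0.0 <= dice <= 1.0:
        raise ValueError("Dice must lie in [0, 1]")
    return dice / (2.0 - dice)


# (Dice, IoU) pairs as quoted in the abstract.
REPORTED = {
    "DKC-TNet-v2 (DRIVE)": (0.8044, 0.6730),
    "SA-UNetv2-DKC-AmbRef (DRIVE)": (0.8141, 0.6865),
    "MS-DKC-AttNextTopo-VCSF-NoAug (ISIC2018)": (0.8872, 0.8214),
}

for name, (dice, iou) in REPORTED.items():
    implied = iou_from_dice(dice)
    print(f"{name}: implied IoU {implied:.4f}, reported {iou:.4f}, gap {iou - implied:+.4f}")
```

The two DRIVE pairs match the identity to within about 0.0002, while the ISIC2018 IoU sits roughly 0.02 above the implied value. That gap is not by itself an inconsistency: the map from Dice to IoU is convex, so averaging per-image IoU yields a value at or above the identity applied to the mean Dice, and lesion datasets with variable per-image scores can show exactly this pattern.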
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces the Medical Segmentation Dataset Knowledge Card (MS-DKC) framework, which uses five descriptor categories (image/acquisition, morphology, supervision, context-dependence, deployment-risk) to explicitly record dataset evidence and map it to failure modes, design priors, and risk-aligned evaluation criteria. It applies the framework to three datasets representing distinct regimes: DRIVE (sparse thin vessels), ISIC2018 (compact variable lesions), and ACDC (multi-class cardiac structures). It reports tailored model results, such as DKC-TNet-v2 achieving Dice 0.8044 on DRIVE and MS-DKC-AttNextTopo-VCSF-NoAug (built on Att-Next-Topo) achieving Dice 0.8872 on ISIC2018, and concludes that these support dataset-conditioned design over architecture-first approaches.
Significance. If the descriptor-to-prior mappings hold under controlled testing, the framework could shift medical segmentation research toward more traceable, dataset-aware design choices that reduce inappropriate model selection. The manuscript earns credit for grounding claims in public datasets (DRIVE, ISIC2018, ACDC) and standard metrics while providing concrete examples of how descriptors translate into choices like sensitivity-aware optimization or class-balanced losses. However, the significance remains provisional without evidence that the mappings are necessary rather than incidental.
major comments (2)
- [Evaluation sections on DRIVE, ISIC2018, and ACDC] The reported metric profiles (e.g., Dice 0.8044 / IoU 0.6730 for DKC-TNet-v2 on DRIVE; Dice 0.8872 / Boundary F1 0.4878 for MS-DKC-AttNextTopo-VCSF-NoAug on ISIC2018; four-class softmax on ACDC) are produced after applying the five-descriptor mappings, yet no baseline runs with a fixed architecture, default loss, or generic operating point are presented on the same datasets. Without these controls, it is not possible to determine whether the observed outcomes require the MS-DKC mappings or would arise from any reasonable model choice, directly undermining the central claim that the results support dataset-conditioned design.
- [Framework description and abstract] The claim that the five descriptor categories are sufficient to determine appropriate models across medical segmentation tasks rests on the mappings to failure modes and priors, but no ablation, sensitivity analysis, or comparison against alternative categorizations is provided to test whether these categories capture the necessary factors or whether different groupings would produce equivalent design recommendations.
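The control requested in the first major comment amounts to a paired comparison: the MS-DKC-selected model and a fixed baseline are scored on the same test images, and the per-image differences are examined. A minimal sketch of one such analysis, a paired bootstrap confidence interval on the mean Dice difference, is given below. The function name, the default of 2000 resamples, and the choice of a percentile interval are assumptions for illustration, not a procedure from the paper or the report.

```python
import numpy as np


def paired_bootstrap_ci(scores_a, scores_b, n_boot=2000, alpha=0.05, seed=0):
    """Mean per-image difference (a - b) with a percentile bootstrap interval.

    scores_a and scores_b must be per-image scores on the same images, same order.
    """
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    if a.ndim != 1 or a.shape != b.shape:
        raise ValueError("expected two 1-D arrays of equal length")

    diffs = a - b
    rng = np.random.default_rng(seed)
    # Resample image indices with replacement; pairs stay together.
    idx = rng.integers(0, len(diffs), size=(n_boot, len(diffs)))
    boot_means = diffs[idx].mean(axis=1)
    lo, hi = np.quantile(boot_means, [alpha / 2.0, 1.0 - alpha / 2.0])
    return float(diffs.mean()), float(lo), float(hi)
```

If the interval for "MS-DKC model minus baseline" excludes zero on the risk-aligned metric, the gain is unlikely to be resampling noise on that test set. It would still not show that the descriptor mappings, rather than extra tuning effort, produced the gain, which is why the baseline needs a comparable tuning budget.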
minor comments (1)
- [Abstract] The statement that 'plausible additions failed to improve the risk-aligned profile' on ISIC2018 lacks detail on what the additions were or how failure was quantified, which reduces the reproducibility of that specific claim.
Simulated Author's Rebuttal
We thank the referee for the constructive comments that identify opportunities to strengthen the empirical support for the MS-DKC framework. We address each major comment below.
- Referee: Evaluation sections on DRIVE, ISIC2018, and ACDC: the reported metric profiles (e.g., Dice 0.8044 / IoU 0.6730 for DKC-TNet-v2 on DRIVE; Dice 0.8872 / Boundary F1 0.4878 for MS-DKC-AttNextTopo-VCSF-NoAug on ISIC2018; four-class softmax on ACDC) are produced after applying the five-descriptor mappings, yet no baseline runs with a fixed architecture, default loss, or generic operating point are presented on the same datasets. Without these controls, it is not possible to determine whether the observed outcomes require the MS-DKC mappings or would arise from any reasonable model choice, directly undermining the central claim that the results support dataset-conditioned design.
  Authors: We agree that the lack of controlled baseline comparisons limits the ability to attribute performance gains specifically to the MS-DKC mappings. In the revised manuscript we will add experiments on all three datasets using fixed standard architectures (e.g., U-Net) with default losses and operating points to provide the necessary controls and clarify whether the reported results depend on the dataset-specific priors.
  Revision: yes
- Referee: Framework description and abstract: the claim that the five descriptor categories are sufficient to determine appropriate models across medical segmentation tasks rests on the mappings to failure modes and priors, but no ablation, sensitivity analysis, or comparison against alternative categorizations is provided to test whether these categories capture the necessary factors or whether different groupings would produce equivalent design recommendations.
  Authors: The five categories were chosen to reflect recurring challenges documented in the medical segmentation literature. The manuscript presents the framework as an initial structured approach rather than asserting that the categories are provably minimal or optimal. We will add a dedicated discussion section in the revision that justifies the chosen categories, notes potential alternative groupings, and illustrates how different categorizations could alter design recommendations.
  Revision: partial
Circularity Check
No circularity: framework is descriptive and evaluations are empirical applications on public data.
The paper introduces MS-DKC as a set of five descriptor categories (image/acquisition, morphology, supervision, context-dependence, deployment-risk) that are mapped to failure modes and design priors. These mappings are presented as explicit records rather than derived quantities. The case studies on DRIVE, ISIC2018, and ACDC apply the descriptors to select models, losses, and metrics, but report standard public metrics without any equations, fitted parameters, or self-citations that reduce the central claim to its own inputs by construction. The derivation chain consists of dataset description followed by application; no step equates a reported outcome to a quantity defined inside the paper itself.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: the five descriptor categories capture the key factors that shape what a dataset requires from a segmentation model.