Deep Learning-Based Segmentation of Peritoneal Cancer Index Regions from CT Imaging
Pith reviewed 2026-05-07 07:53 UTC · model grok-4.3
The pith
A deep learning model using nnU-Net segments the radiological Peritoneal Cancer Index regions on CT scans with an overall Dice score of 0.82, approaching interobserver agreement of 0.88.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We propose a deep learning-based approach to automatically segment the rPCI regions on CT. We evaluate nnU-Net and Swin UNETR on 62 CT scans with rPCI regions manually annotated by three clinical researchers and validated by two expert radiologists. nnU-Net achieved an overall Dice of 0.82, approaching interobserver agreement (0.88) and outperforming Swin UNETR (0.76), with remaining challenges primarily in right flank and small-bowel regions. These results demonstrate feasibility of automated rPCI segmentation, laying the foundation for non-invasive, imaging-based assessment.
What carries the argument
nnU-Net architecture trained for semantic segmentation of the 13 consensus-defined 3D rPCI regions on CT, with performance measured by Dice similarity coefficient, 95th percentile Hausdorff distance, and average surface distance in five-fold cross-validation.
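As an illustration of the headline metric, the sketch below computes a per-region Dice similarity coefficient from two integer label maps and averages it into a single overall score. The function names `dice_per_region` and `overall_dice` are hypothetical. The abstract does not say how the overall Dice of 0.82 was aggregated, so the unweighted mean across regions used here is an assumption, as is the choice to score a region absent from both maps as NaN.

```python
import numpy as np

def dice_per_region(pred, ref, labels):
    """Dice similarity coefficient for each integer label in two label maps."""
    scores = {}
    for lab in labels:
        p = pred == lab
        r = ref == lab
        denom = p.sum() + r.sum()
        if denom == 0:
            # Assumed convention: region missing from both maps is undefined.
            scores[lab] = float("nan")
        else:
            scores[lab] = float(2.0 * np.logical_and(p, r).sum() / denom)
    return scores

def overall_dice(scores):
    """Unweighted mean over regions, ignoring undefined (NaN) regions."""
    return float(np.nanmean(list(scores.values())))

# Toy 2D example with two regions (0 is background).
pred = np.array([[1, 1, 0],
                 [2, 2, 0]])
ref = np.array([[1, 0, 0],
                [2, 2, 2]])
scores = dice_per_region(pred, ref, labels=[1, 2])
print(scores, overall_dice(scores))
```

For a real rPCI label map, `labels` would be the 13 region identifiers, and the same function applies unchanged to 3D arrays.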
If this is right
- Automated rPCI segmentation makes non-invasive assessment of peritoneal metastases feasible as an alternative to diagnostic laparoscopy.
- The method can support standardized imaging-based peritoneal cancer index evaluation in clinical workflows.
- nnU-Net outperforms Swin UNETR, indicating its suitability as the primary architecture for this task.
- Further refinement is needed for the right flank and small-bowel regions where accuracy remains lower.
- This segmentation step provides the structural foundation for future automated calculation of full rPCI scores.
Where Pith is reading between the lines
- Embedding the model in radiology viewing software could allow radiologists to obtain rPCI region outlines in seconds during routine CT review.
- Pairing the segmentation output with separate tumor-burden classifiers would enable end-to-end automated rPCI scoring without manual region tracing.
- Multi-center validation studies would be required to confirm whether the 0.82 Dice holds across scanner vendors and patient demographics not represented in the original 62 scans.
- Widespread adoption could reduce the number of diagnostic laparoscopies performed solely for PCI assessment, lowering procedural risks and healthcare costs.
Load-bearing premise
The manually annotated rPCI regions by three researchers and validated by two radiologists constitute reliable ground truth, and the 62 CT scans are representative of the broader clinical population and scanner variability.
What would settle it
Applying the trained nnU-Net to an independent set of at least 50 CT scans acquired on different scanners or from other institutions and obtaining an overall Dice below 0.70 would show that the reported performance does not generalize.
Original abstract
Peritoneal metastases are currently assessed using diagnostic laparoscopy to determine Sugarbaker's Peritoneal Cancer Index (sPCI), which works by dividing the abdomen into 13 regions and scoring each region based on tumor size. A recent consensus study defined 3D regions to facilitate a radiological PCI (rPCI), providing standardized anatomical regions for imaging-based assessment. Despite its clinical value, sPCI is invasive and lacks a standardized imaging counterpart. In this study, we propose a deep learning-based approach to automatically segment the rPCI regions on CT. We evaluate nnU-Net and Swin UNETR on 62 CT scans with rPCI regions manually annotated by three clinical researchers and validated by two expert radiologists. Performance was assessed using five-fold cross-validation with the Dice Similarity Coefficient (Dice), 95th percentile Hausdorff distance and Average Surface Distance. nnU-Net achieved an overall Dice of 0.82, approaching interobserver agreement (0.88) and outperforming Swin UNETR (0.76), with remaining challenges primarily in right flank and small-bowel regions. These results demonstrate feasibility of automated rPCI segmentation, laying the foundation for non-invasive, imaging-based assessment.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims to demonstrate the feasibility of deep learning-based automatic segmentation of the 13 rPCI regions from CT scans. Using nnU-Net and Swin UNETR on 62 manually annotated CT scans evaluated via 5-fold cross-validation, nnU-Net achieves an overall Dice of 0.82, approaching the reported interobserver agreement of 0.88 and outperforming Swin UNETR at 0.76. Challenges remain in the right flank and small-bowel regions. The work positions this as a step toward non-invasive, imaging-based peritoneal cancer assessment.
Significance. If validated on external data, this could significantly impact clinical practice by providing a non-invasive alternative to diagnostic laparoscopy for determining the Peritoneal Cancer Index. The proximity to interobserver variability is encouraging and provides a relevant benchmark. The study employs standard practices in medical image segmentation (nnU-Net, 5-fold CV, Dice/Hausdorff metrics), which is a strength. However, the limited dataset size and lack of diversity testing temper the immediate significance.
major comments (2)
- [Abstract and Methods] The claim that nnU-Net 'approaches interobserver agreement (0.88)' is difficult to assess without details on how the interobserver Dice was calculated. Specifically, it is unclear whether it was computed on the same 5-fold splits or on a separate held-out set, and whether the three annotators' labels were used consistently for both model training and interobserver evaluation. This detail is critical because it directly affects the interpretation of the 0.82 vs. 0.88 gap.
- [Results and Discussion] The evaluation relies exclusively on internal 5-fold cross-validation within a 62-scan dataset. No external validation set, multi-center data, or assessment across different scanner protocols is provided. This is load-bearing for the feasibility claim, because CT segmentation performance often degrades with variations in acquisition parameters not represented in the training distribution.
minor comments (3)
- [Abstract] Details on patient demographics (age, sex, tumor types), CT scanner models, and acquisition parameters (slice thickness, contrast use) are missing; these are important for assessing how representative the 62 scans are.
- [Results] It would be helpful to report per-region Dice scores in a table, to show which of the 13 rPCI regions contribute most to the overall score and to quantify the difficulties in the right flank and small-bowel regions.
- [Methods] Provide more detail on the exact number of cases per fold, the handling of class imbalance across the 13 regions, and any statistical significance testing of the Dice differences between models.
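The question about cases per fold can be made concrete with a short sketch. It assumes a plain shuffled split of case indices; the paper's actual split procedure is not described here, and the helper name `five_fold_splits` and the seed are hypothetical. With 62 scans and five folds, any balanced split yields two validation folds of 13 cases and three of 12.

```python
import numpy as np

def five_fold_splits(n_cases, n_folds=5, seed=0):
    """Return (train_indices, val_indices) pairs for a shuffled k-fold split."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_cases)
    # array_split gives the first (n_cases % n_folds) folds one extra case.
    folds = np.array_split(order, n_folds)
    splits = []
    for k in range(n_folds):
        val = np.sort(folds[k])
        train = np.sort(np.concatenate(
            [folds[j] for j in range(n_folds) if j != k]))
        splits.append((train, val))
    return splits

splits = five_fold_splits(62)
print([len(val) for _, val in splits])    # [13, 13, 12, 12, 12]
print([len(train) for train, _ in splits])  # [49, 49, 50, 50, 50]
```

Each scan appears in exactly one validation fold, so every case contributes one out-of-fold prediction to the reported metrics.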
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments on our manuscript. We have addressed each major comment below and will revise the paper to improve clarity and transparency regarding our methods and limitations.
Point-by-point responses
- Referee: [Abstract and Methods] The claim that nnU-Net 'approaches interobserver agreement (0.88)' is difficult to assess without details on how the interobserver Dice was calculated. Specifically, it is unclear whether it was computed on the same 5-fold splits or on a separate held-out set, and whether the three annotators' labels were used consistently for both model training and interobserver evaluation. This detail is critical because it directly affects the interpretation of the 0.82 vs. 0.88 gap.
Authors: We agree that the Methods section should have provided more explicit details on this point. The interobserver agreement of 0.88 was computed separately from the 5-fold cross-validation: the three clinical researchers independently annotated all 62 scans, and the value represents the average pairwise Dice score across all annotator pairs for the 13 regions. This evaluation used the full dataset and was not performed on the CV splits. Model training and testing in the 5-fold CV used the expert-validated consensus annotations as ground truth. We will revise the Methods to add a clear subsection describing the interobserver protocol, its independence from the CV procedure, and how annotator labels were handled for training versus evaluation.
Revision: yes
- Referee: [Results and Discussion] The evaluation relies exclusively on internal 5-fold cross-validation within a 62-scan dataset. No external validation set, multi-center data, or assessment across different scanner protocols is provided. This is load-bearing for the feasibility claim, because CT segmentation performance often degrades with variations in acquisition parameters not represented in the training distribution.
Authors: We acknowledge that the lack of external validation is a genuine limitation for claiming broad feasibility. Our experiments are confined to a single-center cohort of 62 scans acquired under consistent protocols, and we do not have external or multi-scanner data available to conduct additional validation experiments at present. In the revised manuscript we will expand the Discussion to explicitly highlight this constraint, discuss the risks of domain shift in CT imaging, and outline the need for future multi-center studies. The internal 5-fold CV results, particularly the proximity to interobserver agreement, still provide a meaningful initial demonstration of feasibility within comparable data distributions.
Revision: partial
- We do not have access to external multi-center or multi-scanner datasets and therefore cannot perform or report new external validation experiments in the current revision.
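The interobserver protocol described in the first response (average pairwise Dice across all annotator pairs for the 13 regions) can be sketched as follows. This is a minimal illustration, not the authors' code: the names `dice` and `mean_pairwise_dice` are hypothetical, and pooling all pairs and regions into one unweighted mean is an assumption about how the 0.88 figure was aggregated.

```python
from itertools import combinations
import numpy as np

def dice(a, b):
    """Dice between two boolean masks; NaN if both are empty (assumed convention)."""
    denom = a.sum() + b.sum()
    if denom == 0:
        return float("nan")
    return float(2.0 * np.logical_and(a, b).sum() / denom)

def mean_pairwise_dice(annotations, labels):
    """Mean Dice over every annotator pair and every region label for one scan."""
    values = []
    for a, b in combinations(annotations, 2):
        for lab in labels:
            values.append(dice(a == lab, b == lab))
    return float(np.nanmean(values))

# Toy example: three annotators labelling the same four voxels.
ann1 = np.array([1, 1, 2, 2])
ann2 = np.array([1, 1, 2, 2])
ann3 = np.array([1, 2, 2, 2])
print(mean_pairwise_dice([ann1, ann2, ann3], labels=[1, 2]))
```

With three annotators there are three pairs per scan; averaging the per-scan result over all 62 scans would give a dataset-level agreement figure comparable in spirit to the one reported.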
Circularity Check
No significant circularity: empirical ML evaluation against independent annotations
Full rationale
The paper is a standard empirical machine-learning study that trains nnU-Net and Swin UNETR on 62 CT scans whose rPCI regions were manually annotated by three researchers and validated by radiologists. Performance is measured via 5-fold cross-validation using Dice, Hausdorff, and surface-distance metrics computed directly against those external human labels. No mathematical derivations, self-referential equations, fitted parameters renamed as predictions, or load-bearing self-citations appear in the reported pipeline. The interobserver agreement figure (0.88) is an independent human benchmark, not derived from the models. The evaluation therefore remains self-contained against external ground truth and does not reduce to its own inputs by construction.
Axiom & Free-Parameter Ledger
free parameters (1)
- nnU-Net and Swin UNETR training hyperparameters
axioms (2)
- domain assumption: The 13 rPCI regions defined by recent consensus can be reliably and consistently annotated on contrast-enhanced CT by trained observers.
- standard math: Five-fold cross-validation on 62 scans yields an unbiased estimate of generalization performance.