DosimeTron: Automating Personalized Monte Carlo Radiation Dosimetry in PET/CT with Agentic AI
Pith reviewed 2026-05-10 18:55 UTC · model grok-4.3
The pith
An agentic AI system automates the full pipeline of patient-specific Monte Carlo radiation dosimetry from PET/CT data.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
DosimeTron autonomously executed complex dosimetry pipelines across diverse prompt configurations and achieved high dosimetric agreement with OpenDose3D at clinically acceptable processing times, demonstrating the feasibility of agentic AI for patient-specific Monte Carlo dosimetry in PET/CT.
What carries the argument
An agentic AI framework that pairs GPT-5.2 as its reasoning engine with 23 tools exposed via four Model Context Protocol servers. Through natural-language interaction, it handles DICOM metadata extraction, image preprocessing, Monte Carlo simulation, organ segmentation, and dosimetric reporting.
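To make the architecture concrete, here is a minimal sketch of how tools grouped into named servers might be registered and dispatched. It is an illustration only, not the authors' implementation and not the Model Context Protocol SDK: the classes `ToolServer` and `ToolRegistry`, the server names `dicom` and `report`, the tool names, and the header fields are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class ToolServer:
    """A named group of tools, standing in for one tool server (hypothetical)."""
    name: str
    tools: Dict[str, Callable[..., Any]] = field(default_factory=dict)

    def tool(self, fn: Callable[..., Any]) -> Callable[..., Any]:
        """Decorator that registers fn under its own function name."""
        self.tools[fn.__name__] = fn
        return fn


class ToolRegistry:
    """Routes a (server, tool) request to the registered Python callable."""

    def __init__(self, servers: List[ToolServer]) -> None:
        self.servers = {s.name: s for s in servers}

    def call(self, server: str, tool: str, **kwargs: Any) -> Any:
        try:
            fn = self.servers[server].tools[tool]
        except KeyError:
            # Fail loudly instead of guessing when a tool name is unknown.
            raise LookupError(f"unknown tool: {server}.{tool}") from None
        return fn(**kwargs)


# Hypothetical servers and tools; the names are illustrative only.
dicom = ToolServer("dicom")
report = ToolServer("report")


@dicom.tool
def extract_metadata(header: dict) -> dict:
    """Keep only the fields a later step would need from a header dict."""
    return {"tracer": header["tracer"], "activity_mbq": header["activity_mbq"]}


@report.tool
def summarize(metadata: dict) -> str:
    """Render a one-line text summary of the extracted metadata."""
    return f"{metadata['tracer']} study, {metadata['activity_mbq']} MBq injected"


registry = ToolRegistry([dicom, report])
meta = registry.call(
    "dicom", "extract_metadata",
    header={"tracer": "18F", "activity_mbq": 250.0, "rows": 512},
)
print(registry.call("report", "summarize", metadata=meta))
```

In a setup like this, the reasoning engine would only choose which `(server, tool)` pair to call and with which arguments. The explicit `LookupError` shows one way a mis-called tool could surface as a traceable failure rather than a silent one.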
Load-bearing premise
The system and its tools will continue to perform with zero errors and matching accuracy when applied to live clinical data and arbitrary user prompts outside the controlled retrospective dataset.
What would settle it
Execution failures, pipeline errors, or mean absolute percentage differences above 5 percent for most organs when DosimeTron is run on a fresh collection of patient PET/CT scans from additional scanner models or with live hospital data.
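The criterion above can be written down as a small check. This is a sketch under stated assumptions: "most organs" is read as a strict majority, the 5 percent threshold is taken from the text, and the function name `claim_is_undermined` and the organ values in the usage lines are hypothetical, not results from the paper.

```python
from typing import Dict


def claim_is_undermined(mape_by_organ: Dict[str, float],
                        threshold_percent: float = 5.0,
                        had_failures: bool = False) -> bool:
    """Return True if a fresh-data run meets the 'would settle it' criterion.

    mape_by_organ maps organ name to mean absolute percentage difference (%).
    had_failures flags any execution failure or pipeline error in the run.
    Assumption: 'most organs' means strictly more than half of them.
    """
    over = [organ for organ, mape in mape_by_organ.items()
            if mape > threshold_percent]
    most_over = len(over) > len(mape_by_organ) / 2
    return had_failures or most_over


# Hypothetical values for illustration only.
fresh_run = {"liver": 2.0, "kidneys": 6.0, "spleen": 7.5}
print(claim_is_undermined(fresh_run))
```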
Original abstract
Purpose: To develop and evaluate DosimeTron, an agentic AI system for automated patient-specific MC internal radiation dosimetry in PET/CT examinations.
Materials and Methods: In this retrospective study, DosimeTron was evaluated on a publicly available PSMA-PET/CT dataset comprising 597 studies from 378 male patients acquired on three scanner models (18-F, n = 369; 68-Ga, n = 228). The system uses GPT-5.2 as its reasoning engine and 23 tools exposed via four Model Context Protocol servers, automating DICOM metadata extraction, image preprocessing, MC simulation, organ segmentation, and dosimetric reporting through natural-language interaction. Agentic performance was assessed using diverse prompt templates spanning single-turn instructions of varying specificity and multi-turn conversational exchanges, monitored via OpenTelemetry traces. Dosimetric accuracy was validated against OpenDose3D across 114 cases and 22 organs using Pearson's r, Lin's concordance correlation coefficient (CCC), and Bland-Altman analysis.
Results: Across all prompt templates and all runs, no execution failures, pipeline errors, or hallucinated outputs were observed. Pearson's r ranged from 0.965 to 1.000 (median 0.997; all p < 0.001) and CCC from 0.963 to 1.000 (median 0.996). Mean absolute percentage difference was below 5% for 19 of 22 organs (median 2.5%). Total per-study processing time (SD) was 32.3 (6.0) minutes.
Conclusion: DosimeTron autonomously executed complex dosimetry pipelines across diverse prompt configurations and achieved high dosimetric agreement with OpenDose3D at clinically acceptable processing times, demonstrating the feasibility of agentic AI for patient-specific Monte Carlo dosimetry in PET/CT.
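For readers unfamiliar with the agreement statistics named in the abstract, the sketch below computes them for one organ from paired dose estimates. It is not the authors' analysis code. The function name `agreement_metrics` is hypothetical, the percentage difference is assumed to be taken relative to the reference method (the abstract does not state the denominator), and the Bland-Altman limits use the conventional bias ± 1.96 SD.

```python
import numpy as np


def agreement_metrics(test, ref):
    """Agreement between paired estimates from a test and a reference method."""
    test = np.asarray(test, dtype=float)
    ref = np.asarray(ref, dtype=float)

    # Pearson's r: strength of the linear relationship only.
    r = np.corrcoef(test, ref)[0, 1]

    # Lin's CCC: penalizes shifts in location and scale as well as scatter.
    # Lin's definition uses population (1/n) variances and covariance.
    mean_t, mean_r = test.mean(), ref.mean()
    cov = ((test - mean_t) * (ref - mean_r)).mean()
    ccc = 2 * cov / (test.var() + ref.var() + (mean_t - mean_r) ** 2)

    # Bland-Altman: mean difference (bias) and 95% limits of agreement.
    diff = test - ref
    bias = diff.mean()
    sd = diff.std(ddof=1)
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)

    # Mean absolute percentage difference, relative to the reference (assumed).
    mape = 100.0 * np.mean(np.abs(diff) / np.abs(ref))

    return {
        "pearson_r": float(r),
        "ccc": float(ccc),
        "bias": float(bias),
        "limits_of_agreement": (float(limits[0]), float(limits[1])),
        "mape_percent": float(mape),
    }


# Hypothetical paired values: the test method reads 10% high everywhere.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
scaled = [1.1 * x for x in reference]
print(agreement_metrics(scaled, reference))
```

The usage lines show why both r and CCC are reported. A uniform 10% overestimate leaves Pearson's r at 1 but lowers the CCC and gives a 10% mean absolute percentage difference.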
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents DosimeTron, an agentic AI system using GPT-5.2 as the reasoning engine and 23 tools via four Model Context Protocol servers to automate the full pipeline of DICOM handling, image preprocessing, organ segmentation, Monte Carlo simulation, and dosimetric reporting for patient-specific internal radiation dosimetry in PET/CT. In a retrospective evaluation on a public PSMA-PET/CT dataset of 597 studies, the system recorded zero execution failures or hallucinations across single- and multi-turn prompt templates; dosimetric outputs on 114 cases and 22 organs showed Pearson r from 0.965 to 1.000 (median 0.997) and CCC from 0.963 to 1.000 (median 0.996) versus OpenDose3D, with MAPE below 5% for 19 organs and mean processing time of 32.3 minutes.
Significance. If the reported zero-failure rate and high concordance metrics hold under broader conditions, the work establishes technical feasibility for agentic AI to autonomously manage complex, multi-step Monte Carlo dosimetry workflows on sizable public datasets. The independent validation against OpenDose3D, absence of observed pipeline errors, and clinically plausible processing times constitute concrete strengths that could support future automation efforts in personalized radiation dosimetry.
major comments (2)
- [Results] Results section: dosimetric agreement is reported only on a subset of 114 cases drawn from the 597-study cohort; without explicit selection criteria or analysis of whether these cases are representative with respect to scanner model, tracer (18-F vs 68-Ga), or organ coverage, it is unclear whether the median r = 0.997 and CCC = 0.996 generalize to the full dataset or to more challenging anatomies.
- [Conclusion] Conclusion: the claim that DosimeTron demonstrates feasibility of agentic AI for patient-specific Monte Carlo dosimetry rests on retrospective, pre-curated data and scripted prompts. No evidence is provided that the same zero-error performance and <5% MAPE persist under live scanner variability, motion artifacts, atypical anatomy, or open-ended clinician prompts that could trigger tool mis-calls or segmentation drift; this gap directly limits the strength of the clinical-feasibility assertion.
minor comments (2)
- [Materials and Methods] Materials and Methods: the description of the 23 tools and four MCP servers is high-level; expanding the list of tool functions (especially DICOM metadata extraction and MC parameter handling) would improve reproducibility.
- The manuscript would benefit from an explicit limitations paragraph addressing the retrospective design, controlled prompt conditions, and absence of prospective or multi-center testing.
Simulated Author's Rebuttal
We thank the referee for their thoughtful and constructive review. The comments highlight important aspects of generalizability and scope that we address below with proposed revisions to strengthen the manuscript.
Point-by-point responses
- Referee: [Results] Results section: dosimetric agreement is reported only on a subset of 114 cases drawn from the 597-study cohort; without explicit selection criteria or analysis of whether these cases are representative with respect to scanner model, tracer (18-F vs 68-Ga), or organ coverage, it is unclear whether the median r = 0.997 and CCC = 0.996 generalize to the full dataset or to more challenging anatomies.
Authors: We agree that the manuscript did not explicitly describe the selection criteria for the 114-case validation subset. The subset was obtained via stratified random sampling to maintain proportional representation of the three scanner models and both tracers (18-F, n=369; 68-Ga, n=228) while ensuring at least one case per major organ segmentation category present in the full cohort. In the revised manuscript we will insert a dedicated paragraph in the Materials and Methods section detailing the sampling procedure, together with a supplementary table comparing scanner distribution, tracer type, patient age/weight, and organ coverage statistics between the 114 cases and the full 597 studies. This addition will allow readers to evaluate representativeness directly.
Revision: yes
- Referee: [Conclusion] Conclusion: the claim that DosimeTron demonstrates feasibility of agentic AI for patient-specific Monte Carlo dosimetry rests on retrospective, pre-curated data and scripted prompts. No evidence is provided that the same zero-error performance and <5% MAPE persist under live scanner variability, motion artifacts, atypical anatomy, or open-ended clinician prompts that could trigger tool mis-calls or segmentation drift; this gap directly limits the strength of the clinical-feasibility assertion.
Authors: We concur that the present study is confined to retrospective, pre-curated data and structured prompts, and therefore cannot furnish direct evidence of performance under live clinical conditions. The reported zero-failure rate and dosimetric concordance establish technical feasibility within this controlled retrospective setting, which constitutes a necessary first step. In revision we will (i) rephrase the Conclusion to state that the work demonstrates feasibility for retrospective automated Monte Carlo dosimetry pipelines and (ii) add a new Limitations paragraph that explicitly enumerates the untested scenarios (scanner variability, motion, atypical anatomy, open-ended prompts) and calls for prospective validation studies. These changes will align the strength of the claims with the evidence provided.
Revision: partial
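The stratified random sampling described in the first response can be sketched as proportional allocation followed by a random draw within each stratum. This is one plausible reading, not the authors' procedure. The function names, the largest-remainder rounding rule, and the record layout are assumptions. The sketch stratifies on tracer only and does not implement the scanner-model strata or the "at least one case per organ category" constraint.

```python
import random
from collections import defaultdict
from typing import Callable, Dict, Hashable, List


def proportional_allocation(stratum_sizes: Dict[Hashable, int],
                            n_total: int) -> Dict[Hashable, int]:
    """Split n_total across strata in proportion to size (largest remainder)."""
    total = sum(stratum_sizes.values())
    exact = {k: n_total * size / total for k, size in stratum_sizes.items()}
    alloc = {k: int(q) for k, q in exact.items()}  # round down first
    leftover = n_total - sum(alloc.values())
    # Hand the remaining slots to the strata with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc


def stratified_sample(records: List[dict],
                      key: Callable[[dict], Hashable],
                      n_total: int,
                      seed: int = 0) -> List[dict]:
    """Draw n_total records, sampling without replacement within each stratum."""
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)
    alloc = proportional_allocation({k: len(v) for k, v in strata.items()}, n_total)
    sample = []
    for k, members in strata.items():
        sample.extend(rng.sample(members, alloc[k]))
    return sample


# Cohort sizes from the paper; the record layout itself is hypothetical.
cohort = ([{"id": i, "tracer": "18F"} for i in range(369)]
          + [{"id": 369 + i, "tracer": "68Ga"} for i in range(228)])
subset = stratified_sample(cohort, key=lambda rec: rec["tracer"], n_total=114)
print(proportional_allocation({"18F": 369, "68Ga": 228}, 114))
```

With the tracer counts reported in the paper, proportional allocation of 114 cases yields 70 18F studies and 44 68Ga studies. Passing a key such as `lambda rec: (rec["scanner"], rec["tracer"])` would stratify on both factors, assuming the records carry a scanner field.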
Circularity Check
No circularity: empirical validation against external reference
Full rationale
The paper describes an agentic AI dosimetry pipeline and reports its performance on a public retrospective PSMA-PET/CT dataset of 597 studies. Dosimetric outputs are compared directly to the independent external tool OpenDose3D using Pearson r, CCC, and MAPE on 114 cases; no equations, parameters, or predictions are fitted to the target metrics or defined in terms of themselves. All load-bearing claims (zero execution failures, r/CCC >0.96, <5% MAPE, 32.3 min runtime) are statistical summaries of observed runs under controlled conditions, not derivations that reduce to the inputs by construction. Self-citations are absent from the central evaluation chain.