Towards Virtual Clinical Trials of Radiology AI with Conditional Generative Modeling

Aditya V. Kulkarni; Benjamin D. Killeen; Bohua Wan; Mathias Unberath; Michael Oberst; Nathan Drenkow; Paul H. Yi

arxiv: 2502.09688 · v1 · pith:TC3OEK4G · submitted 2025-02-13 · cs.CV · cs.AI · cs.LG

Benjamin D. Killeen, Bohua Wan, Aditya V. Kulkarni, Nathan Drenkow, Michael Oberst, Paul H. Yi, Mathias Unberath

classification: cs.CV, cs.AI, cs.LG

keywords: model, models, clinical, data, radiology, degradation, diverse, generative

Artificial intelligence (AI) is poised to transform healthcare by enabling personalized and efficient care through data-driven insights. Although radiology is at the forefront of AI adoption, in practice the potential of AI models is often overshadowed by severe failures to generalize: model performance can degrade by up to 20% when transitioning from controlled test environments to clinical use by radiologists. This mismatch raises concerns that radiologists will be misled by incorrect AI predictions in practice and/or grow to distrust AI, rendering these promising technologies practically ineffectual. Exhaustive clinical trials of AI models on abundant and diverse data are thus critical to anticipate AI model degradation when encountering varied data samples. Achieving this, however, is challenging due to the high cost of collecting diverse data samples and corresponding annotations. To overcome these limitations, we introduce a novel conditional generative AI model designed for virtual clinical trials (VCTs) of radiology AI, capable of realistically synthesizing full-body CT images of patients with specified attributes. By learning the joint distribution of images and anatomical structures, our model enables precise replication of real-world patient populations with unprecedented detail at this scale. We demonstrate meaningful evaluation of radiology AI models through VCTs powered by our synthetic CT study populations, revealing model degradation and facilitating algorithmic auditing for bias-inducing data attributes. Our generative AI approach to VCTs is a promising avenue towards a scalable solution to assess model robustness, mitigate biases, and safeguard patient care by enabling simpler testing and evaluation of AI models in any desired range of diverse patient populations.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Non-intrusive Body Composition Assessment from Full-body mmWave Scans
eess.IV 2026-05 conditional novelty 7.0

Multi-task learning on synthetic mmWave point clouds from clinical images enables regression of VAT and BFP from real mmWave scans with MAE of 1.0 L and 3.2%.
Non-intrusive Body Composition Assessment from Full-body mmWave Scans
eess.IV 2026-05 unverdicted novelty 6.0

A multi-task learning model trained on synthetic mmWave-like point clouds estimates VAT and BFP from real full-body mmWave scans through clothing with mean absolute errors of 1.0 L and 3.2%.
2D Versus 3D Diffusion for In Silico Training of Interventional X-ray AI Models
eess.IV 2026-06 unverdicted novelty 5.0

2D diffusion-generated synthetic X-rays enable training of anatomical landmark detectors that generalize to real images with performance rivaling real-data training.