Clinically Aware Synthetic Image Generation for Concept Coverage in Chest X-ray Models

Ajitha Rajan; Amy Rafferty; Rishi Ramaesh

arxiv: 2603.15525 · v2 · submitted 2026-03-16 · 💻 cs.CV · cs.HC

Clinically Aware Synthetic Image Generation for Concept Coverage in Chest X-ray Models

Amy Rafferty , Rishi Ramaesh , Ajitha Rajan This is my paper

Pith reviewed 2026-05-15 09:50 UTC · model grok-4.3

classification 💻 cs.CV cs.HC

keywords synthetic chest X-raysclinical concept perturbationanatomical fidelitymodel calibrationconcept coverageCARPAchest X-ray classification

0 comments

The pith

Anatomically grounded perturbations to clinical concepts generate synthetic chest X-rays that improve model performance and reliability on real test data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces CARPA, a framework that generates synthetic chest X-rays by applying targeted perturbations to clinical concept vectors while preserving anatomical structure. This method expands the coverage of clinically meaningful concept combinations that are often missing from public training datasets. When models are fine-tuned on these synthetic images, they show consistent gains in precision-recall performance, reduced predictive uncertainty, and better calibration across multiple architectures. Structural analyses and expert radiologist reviews confirm that the generated images maintain high anatomical fidelity and clinical realism.

Core claim

CARPA produces anatomically faithful synthetic images with controlled concept insertions and deletions by perturbing clinical concept vectors while preserving anatomical structure, which expands clinically relevant concept coverage and leads to improved precision-recall, lower uncertainty, and better calibration when models are fine-tuned on the synthetic data and evaluated on held-out MIMIC-CXR benchmarks.

What carries the argument

The CARPA framework, which applies targeted perturbations to clinical concept vectors while preserving anatomical structure to enable controlled concept coverage in synthetic chest X-rays.

If this is right

Fine-tuning on CARPA-generated images improves precision-recall performance compared to prior concept perturbation methods across seven backbone architectures.
Models show reduced predictive uncertainty and improved calibration on held-out real data.
Structural and semantic analyses indicate high anatomical fidelity and strong concept alignment.
Expert radiologist evaluation confirms the realism and clinical agreement of the synthetic images.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Such synthetic data augmentation could support safer clinical deployment of chest X-ray models by addressing gaps in concept coverage.
Similar perturbation approaches might be adapted to other medical imaging domains where anatomical constraints are critical.
Combining CARPA with existing real datasets could reduce reliance on large annotated collections for training reliable models.

Load-bearing premise

Targeted perturbations to clinical concept vectors preserve anatomical structure accurately enough to generate realistic synthetic images that enhance model performance without introducing artifacts or biases.

What would settle it

Demonstrating that fine-tuned models on CARPA images fail to outperform baselines on held-out MIMIC-CXR data or that radiologists consistently identify artifacts in the synthetic images would falsify the central claim.

Figures

Figures reproduced from arXiv: 2603.15525 by Ajitha Rajan, Amy Rafferty, Rishi Ramaesh.

read the original abstract

Deep learning models for chest X-ray diagnosis are constrained by limited coverage of clinically meaningful concept combinations in publicly available training datasets. While synthetic image generation has been explored to increase data diversity, existing methods rarely enforce clinical or anatomical constraints, limiting utility for improving model reliability. We propose CARPA, a clinically aware and anatomically grounded framework for synthetic chest X-ray generation that applies targeted perturbations to clinical concept vectors while preserving anatomical structure. By producing anatomically faithful synthetic images with controlled concept insertions and deletions, CARPA expands clinically relevant concept coverage. We evaluate CARPA across seven backbone architectures by fine-tuning models on synthetic subsets and testing on a held-out MIMIC-CXR benchmark. Compared to prior concept perturbation approaches, fine-tuning on CARPA-generated images consistently improves precision-recall performance, reduces predictive uncertainty, and improves model calibration. Structural and semantic analyses demonstrate high anatomical fidelity, strong concept alignment, and low semantic uncertainty. Evaluation by two expert radiologists further confirms realism and clinical agreement. Together, these results show that anatomically grounded concept perturbations enable more effective use of synthetic data, improving both performance and reliability of chest X-ray classification models and supporting safer clinical deployment.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

CARPA gives a practical way to boost concept coverage in chest X-ray models via targeted synthetic images, with decent empirical backing, though the anatomical preservation step still needs tighter proof.

read the letter

The key takeaway is that this paper introduces CARPA, a framework that perturbs clinical concept vectors to create synthetic chest X-rays while claiming to keep anatomical structure intact. They fine-tune seven different architectures on these images and report gains in precision-recall, calibration, and lower uncertainty on a held-out MIMIC-CXR set, plus positive feedback from two radiologists on realism and clinical fit. That combination of broad testing and expert input is the strongest part of the work.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes CARPA, a clinically aware framework for synthetic chest X-ray generation that perturbs clinical concept vectors while preserving anatomical structure to expand concept coverage in training data. It reports consistent improvements in precision-recall performance, model calibration, and reduced uncertainty when fine-tuning seven architectures on CARPA-generated images and evaluating on held-out MIMIC-CXR data, backed by structural analyses, semantic evaluations, and review by two radiologists.

Significance. If the results are robust, this work could significantly advance the use of synthetic data in medical imaging by providing a method to generate clinically relevant variations without sacrificing anatomical fidelity. The evaluation across multiple architectures and the inclusion of expert radiologist assessment strengthen the potential impact for improving reliability in chest X-ray classification models.

major comments (2)

The central claim relies on the assertion that targeted perturbations preserve anatomical structure, but the provided description does not detail the concrete mechanisms, such as specific loss terms, constraints, or conditioning approaches, that enforce this preservation. This is load-bearing, as without it the improvements could stem from artifacts rather than true concept learning.
While structural and semantic analyses are mentioned, they are post-hoc and do not include quantitative measures of regional fidelity such as segmentation overlap or landmark errors on key anatomical structures, which would be necessary to substantiate the anatomical grounding claim.

minor comments (2)

The abstract could benefit from including specific quantitative improvements (e.g., exact deltas in AUC or calibration error) to better contextualize the gains.
Clarify the exact number of synthetic images generated and the proportion used in fine-tuning relative to real data.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their positive summary and constructive major comments. We address each point below and will revise the manuscript accordingly to strengthen the presentation of our methods and evaluations.

read point-by-point responses

Referee: The central claim relies on the assertion that targeted perturbations preserve anatomical structure, but the provided description does not detail the concrete mechanisms, such as specific loss terms, constraints, or conditioning approaches, that enforce this preservation. This is load-bearing, as without it the improvements could stem from artifacts rather than true concept learning.

Authors: We agree that explicit details on the anatomical preservation mechanisms are essential to support our central claims. Section 3.2 describes the CARPA framework as a conditional generative model that perturbs clinical concept vectors while enforcing anatomical fidelity via a composite loss combining reconstruction, adversarial, and perceptual terms derived from a frozen anatomical feature extractor. To make this load-bearing aspect fully transparent, we will add the precise loss equations, the conditioning architecture (including how concept vectors are injected without altering spatial anatomy), and any regularization constraints in the revised Methods section. revision: yes
Referee: While structural and semantic analyses are mentioned, they are post-hoc and do not include quantitative measures of regional fidelity such as segmentation overlap or landmark errors on key anatomical structures, which would be necessary to substantiate the anatomical grounding claim.

Authors: We acknowledge that region-specific quantitative metrics would provide stronger substantiation for anatomical fidelity. Our current evaluations report global structural similarity (SSIM/PSNR) and semantic concept alignment, supplemented by radiologist review. In the revision we will add Dice overlap scores for lungs, heart, and mediastinum (using an off-the-shelf segmentation model) as well as mean landmark localization errors on standard anatomical keypoints, computed between real and synthetic images, and include these results in a new quantitative table. revision: yes

Circularity Check

0 steps flagged

No circularity in derivation chain

full rationale

The paper proposes the CARPA framework for generating synthetic chest X-rays via targeted perturbations to clinical concept vectors while claiming to preserve anatomical structure. It then evaluates by fine-tuning seven backbone models on the synthetic subsets and measuring precision-recall, calibration, and uncertainty on a held-out MIMIC-CXR benchmark. No equations, self-definitional steps, fitted-input predictions, or self-citation load-bearing arguments appear in the abstract or described method; the reported gains are measured against an independent external test set rather than being constructed from the generation process itself. Structural analyses and radiologist review are post-hoc validation steps, not inputs that the performance claims reduce to by definition.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Based solely on the abstract, the central claim rests on the domain assumption that anatomically faithful synthetic images from controlled concept perturbations will improve model generalization and calibration. No explicit free parameters, invented entities, or detailed axioms are provided.

axioms (1)

domain assumption Synthetic images generated via targeted clinical concept perturbations preserve sufficient anatomical fidelity to serve as effective training data
Invoked in the description of CARPA and its evaluation for improving model performance

pith-pipeline@v0.9.0 · 5507 in / 1271 out tokens · 60361 ms · 2026-05-15T09:50:39.494891+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

33 extracted references · 33 canonical work pages · 2 internal anchors

[1]

In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

Bannur, S., Hyland, S., Liu, Q., Perez-Garcia, F., Ilse, M., Castro, D.C., Boecking, B., Sharma, H., Bouzid, K., Thieme, A., et al.: Learning to exploit temporal struc- ture for biomedical vision-language processing. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 15016–15027 (2023)

work page 2023
[2]

Nature Machine Intelligence1(2019)

Begoli,E.,Bhattacharya,T.,Kusnezov,D.:Theneedforuncertaintyquantification in machine-assisted medical decision making. Nature Machine Intelligence1(2019)

work page 2019
[3]

Chaichuk, M., Gautam, S., Hicks, S., Tutubalina, E.: Prompt to polyp: Medical text-conditioned image synthesis with diffusion models (2025)

work page 2025
[4]

Chexpert plus: Hundreds of thousands of aligned radiology texts, im- ages and patients.arXiv preprint arXiv:2405.19538, 2024

Chambon, P., Delbrouck, J.B., Sounack, T., Huang, S.C., Chen, Z., Varma, M., Truong, S.Q., Chuong, C.T., Langlotz, C.P.: Chexpert plus: Augmenting a large chest x-ray dataset with text radiology reports, patient demographics and addi- tional image formats. arXiv:2405.19538 (2024)

work page arXiv 2024
[5]

Franchi, G., Trong, D.N., Belkhir, N., Xia, G., Pilzer, A.: Towards understanding and quantifying uncertainty for text-to-image generation (2024)

work page 2024
[6]

Neurocomputing321, 321–331 (Dec 2018)

Frid-Adar, M., Diamant, I., Klang, E., Amitai, M., Goldberger, J., Greenspan, H.: Gan-based synthetic medical image augmentation for increased cnn performance in liver lesion classification. Neurocomputing321, 321–331 (Dec 2018)

work page 2018
[7]

Ghesu, F.C., Georgescu, B., Gibson, E., Guendel, S., Kalra, M.K., Singh, R., Digu- marthy, S.R., Grbic, S., Comaniciu, D.: Quantifying and leveraging classification uncertainty for chest radiograph assessment (2019)

work page 2019
[8]

circulation101(23), e215–e220 (2000)

Goldberger, A.L., Amaral, L.A., Glass, L., Hausdorff, J.M., Ivanov, P.C., Mark, R.G., Mietus, J.E., Moody, G.B., Peng, C.K., Stanley, H.E.: Physiobank, phys- iotoolkit, and physionet: components of a new research resource for complex phys- iologic signals. circulation101(23), e215–e220 (2000)

work page 2000
[9]

Advances in Neural Information Processing Systems3(06 2014)

Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Yere, Y.: Generative adversarial networks. Advances in Neural Information Processing Systems3(06 2014)

work page 2014
[10]

In: Advances in Neu- ral Information Processing Systems

Hernandez-Lobato, J.M., Hoffman, M.W., Ghahramani, Z.: Predictive entropy search for efficient global optimization of black-box functions. In: Advances in Neu- ral Information Processing Systems. pp. 918–926. Curran Associates Inc. (2014)

work page 2014
[11]

Advances in Neural Information Pro- cessing Systems pp

Hinton, G., Krizhevsky, A., Sutskever, I., Rachmad, Y.: Imagenet classification with deep convolutional neural networks. Advances in Neural Information Pro- cessing Systems pp. 1097–1105 (01 2012)

work page 2012
[12]

PhysioNet5(19), 1 (2023)

Holste, G., Wang, S., Jaiswal, A., Yang, Y., Lin, M., Peng, Y., Wang, A.: Cxr-lt: Multi-label long-tailed classification on chest x-rays. PhysioNet5(19), 1 (2023)

work page 2023
[13]

In: Proceedings of the AAAI conference on artificial intelligence

Irvin, J., Rajpurkar, P., Ko, M., Yu, Y., Ciurea-Ilcus, S., Chute, C., Marklund, H., Haghgoo, B., Ball, R., Shpanskaya, K., et al.: Chexpert: A large chest radiograph dataset with uncertainty labels and expert comparison. In: Proceedings of the AAAI conference on artificial intelligence. vol. 33, pp. 590–597 (2019)

work page 2019
[14]

Jiang, H., Kim, B., Guan, M.Y., Gupta, M.: To trust or not to trust a classifier (2018)

work page 2018
[15]

Scientific data6(1) (2019)

Johnson, A.E., Pollard, T.J., Berkowitz, S.J., Greenbaum, N.R., Lungren, M.P., Deng, C.y., Mark, R.G., Horng, S.: Mimic-cxr, a de-identified publicly available database of chest radiographs with free-text reports. Scientific data6(1) (2019)

work page 2019
[16]

MIMIC-CXR-JPG, a large publicly available database of labeled chest radiographs

Johnson, A.E., Pollard, T.J., Greenbaum, N.R., Lungren, M.P., Deng, C.y., Peng, Y., Lu, Z., Mark, R.G., Berkowitz, S.J., Horng, S.: Mimic-cxr-jpg, a large publicly available database of labeled chest radiographs. arXiv:1901.07042 (2019) 10 A.Rafferty et al

work page internal anchor Pith review arXiv 1901
[17]

Medical Image Analysis88, 102846 (2023)

Kazerouni, A., Aghdam, E.K., Heidari, M., Azad, R., Fayyaz, M., Hacihaliloglu, I., Merhof, D.: Diffusion models in medical imaging: A comprehensive survey. Medical Image Analysis88, 102846 (2023)

work page 2023
[18]

npj Digital Medicine4(12 2021)

Kompa, B., Snoek, J., Beam, A.: Second opinion needed: communicating uncer- tainty in medical machine learning. npj Digital Medicine4(12 2021)

work page 2021
[19]

Kuhn, L., Gal, Y., Farquhar, S.: Semantic uncertainty: Linguistic invariances for uncertainty estimation in natural language generation (2023)

work page 2023
[20]

Medical Image Analysis42, 60–88 (Dec 2017)

Litjens, G., Kooi, T., Bejnordi, B.E., Setio, A.A.A., Ciompi, F., Ghafoorian, M., van der Laak, J.A., van Ginneken, B., Sánchez, C.I.: A survey on deep learning in medical image analysis. Medical Image Analysis42, 60–88 (Dec 2017)

work page 2017
[21]

In: ML for Healthcare Conference

McDermott, M.B., Hsu, T.M.H., Weng, W.H., Ghassemi, M., Szolovits, P.: Chexpert++: Approximating the chexpert labeler for speed, differentiability, and probabilistic output. In: ML for Healthcare Conference. pp. 913–927. PMLR (2020)

work page 2020
[22]

In: European Conference on Computer Vision

Pérez-García, F., Bond-Taylor, S., Sanchez, P.P., van Breugel, B., Castro, D.C., Sharma, H., Salvatelli, V., Wetscherek, M.T., Richardson, H., Lungren, M.P., et al.: Radedit: stress-testing biomedical vision models via diffusion image editing. In: European Conference on Computer Vision. pp. 358–376. Springer (2024)

work page 2024
[23]

SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis

Podell, D., English, Z., Lacey, K., Blattmann, A., Dockhorn, T., Müller, J., Penna, J., Rombach, R.: Sdxl: Improving latent diffusion models for high-resolution image synthesis. arXiv:2307.01952 (2023)

work page internal anchor Pith review Pith/arXiv arXiv 2023
[24]

Posocco, N., Bonnefoy, A.: Estimating expected calibration errors (2021)

work page 2021
[25]

Rafferty, A., Ramaesh, R., Rajan, A.: Corpa: Adversarial image generation for chest x-rays using concept vector perturbations and generative models (2025)

work page 2025
[26]

In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

Rombach, R., Blattmann, A., Lorenz, D., Esser, P., Ommer, B.: High-resolution image synthesis with latent diffusion models. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 10684–10695 (2022)

work page 2022
[27]

IEEE transactions on medical imaging (09 2015)

Roth, H., Lu, L., Liu, J., Yao, J., Seff, A., Kim, L., Summers, R.: Improving computer-aided detection using convolutional neural networks and random view aggregation. IEEE transactions on medical imaging (09 2015)

work page 2015
[28]

Neurocomput.194(C), 87–94 (Jun 2016)

Shi, J., Zhou, S., Liu, X., Zhang, Q., Lu, M., Wang, T.: Stacked deep polynomial network based representation learning for tumor classification with small ultra- sound image dataset. Neurocomput.194(C), 87–94 (Jun 2016)

work page 2016
[29]

Singhal, K., Azizi, S., Tu, T., Mahdavi, S.S., Wei, J., Chung, H.W., Scales, N., Tanwani, A., Cole-Lewis, H., Pfohl, S., et al., P.P.: Large language models encode clinical knowledge (2022)

work page 2022
[30]

Sundaram, S., Hulkund, N.: Gan-based data augmentation for chest x-ray classifi- cation (2021)

work page 2021
[31]

Tajbakhsh, N., Shin, J., Gurudu, S., Hurst, R.T., Kendall, C., Gotway, M., Liang, J.: Convolutional neural networks for medical image analysis: Fine tuning or full training? IEEE Transactions on Medical Imaging35, 1–1 (03 2016)

work page 2016
[32]

Yang, L., Zhang, Z., Song, Y., Hong, S., Xu, R., Zhao, Y., Zhang, W., Cui, B., Yang,M.H.:Diffusionmodels:Acomprehensivesurveyofmethodsandapplications (2025)

work page 2025
[33]

Nature Communications16(04 2025)

Zambrano Chaves, J.M., Huang, S.C., Xu, Y., Xu, H., Usuyama, N., Zhang, S., Wang, F., Xie, Y., Khademi, M., Yang, Z.e.a.: A clinically accessible small mul- timodal radiology model and evaluation metric for chest x-ray findings. Nature Communications16(04 2025)

work page 2025

[1] [1]

In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

Bannur, S., Hyland, S., Liu, Q., Perez-Garcia, F., Ilse, M., Castro, D.C., Boecking, B., Sharma, H., Bouzid, K., Thieme, A., et al.: Learning to exploit temporal struc- ture for biomedical vision-language processing. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 15016–15027 (2023)

work page 2023

[2] [2]

Nature Machine Intelligence1(2019)

Begoli,E.,Bhattacharya,T.,Kusnezov,D.:Theneedforuncertaintyquantification in machine-assisted medical decision making. Nature Machine Intelligence1(2019)

work page 2019

[3] [3]

Chaichuk, M., Gautam, S., Hicks, S., Tutubalina, E.: Prompt to polyp: Medical text-conditioned image synthesis with diffusion models (2025)

work page 2025

[4] [4]

Chexpert plus: Hundreds of thousands of aligned radiology texts, im- ages and patients.arXiv preprint arXiv:2405.19538, 2024

Chambon, P., Delbrouck, J.B., Sounack, T., Huang, S.C., Chen, Z., Varma, M., Truong, S.Q., Chuong, C.T., Langlotz, C.P.: Chexpert plus: Augmenting a large chest x-ray dataset with text radiology reports, patient demographics and addi- tional image formats. arXiv:2405.19538 (2024)

work page arXiv 2024

[5] [5]

Franchi, G., Trong, D.N., Belkhir, N., Xia, G., Pilzer, A.: Towards understanding and quantifying uncertainty for text-to-image generation (2024)

work page 2024

[6] [6]

Neurocomputing321, 321–331 (Dec 2018)

Frid-Adar, M., Diamant, I., Klang, E., Amitai, M., Goldberger, J., Greenspan, H.: Gan-based synthetic medical image augmentation for increased cnn performance in liver lesion classification. Neurocomputing321, 321–331 (Dec 2018)

work page 2018

[7] [7]

Ghesu, F.C., Georgescu, B., Gibson, E., Guendel, S., Kalra, M.K., Singh, R., Digu- marthy, S.R., Grbic, S., Comaniciu, D.: Quantifying and leveraging classification uncertainty for chest radiograph assessment (2019)

work page 2019

[8] [8]

circulation101(23), e215–e220 (2000)

Goldberger, A.L., Amaral, L.A., Glass, L., Hausdorff, J.M., Ivanov, P.C., Mark, R.G., Mietus, J.E., Moody, G.B., Peng, C.K., Stanley, H.E.: Physiobank, phys- iotoolkit, and physionet: components of a new research resource for complex phys- iologic signals. circulation101(23), e215–e220 (2000)

work page 2000

[9] [9]

Advances in Neural Information Processing Systems3(06 2014)

Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Yere, Y.: Generative adversarial networks. Advances in Neural Information Processing Systems3(06 2014)

work page 2014

[10] [10]

In: Advances in Neu- ral Information Processing Systems

Hernandez-Lobato, J.M., Hoffman, M.W., Ghahramani, Z.: Predictive entropy search for efficient global optimization of black-box functions. In: Advances in Neu- ral Information Processing Systems. pp. 918–926. Curran Associates Inc. (2014)

work page 2014

[11] [11]

Advances in Neural Information Pro- cessing Systems pp

Hinton, G., Krizhevsky, A., Sutskever, I., Rachmad, Y.: Imagenet classification with deep convolutional neural networks. Advances in Neural Information Pro- cessing Systems pp. 1097–1105 (01 2012)

work page 2012

[12] [12]

PhysioNet5(19), 1 (2023)

Holste, G., Wang, S., Jaiswal, A., Yang, Y., Lin, M., Peng, Y., Wang, A.: Cxr-lt: Multi-label long-tailed classification on chest x-rays. PhysioNet5(19), 1 (2023)

work page 2023

[13] [13]

In: Proceedings of the AAAI conference on artificial intelligence

Irvin, J., Rajpurkar, P., Ko, M., Yu, Y., Ciurea-Ilcus, S., Chute, C., Marklund, H., Haghgoo, B., Ball, R., Shpanskaya, K., et al.: Chexpert: A large chest radiograph dataset with uncertainty labels and expert comparison. In: Proceedings of the AAAI conference on artificial intelligence. vol. 33, pp. 590–597 (2019)

work page 2019

[14] [14]

Jiang, H., Kim, B., Guan, M.Y., Gupta, M.: To trust or not to trust a classifier (2018)

work page 2018

[15] [15]

Scientific data6(1) (2019)

Johnson, A.E., Pollard, T.J., Berkowitz, S.J., Greenbaum, N.R., Lungren, M.P., Deng, C.y., Mark, R.G., Horng, S.: Mimic-cxr, a de-identified publicly available database of chest radiographs with free-text reports. Scientific data6(1) (2019)

work page 2019

[16] [16]

MIMIC-CXR-JPG, a large publicly available database of labeled chest radiographs

Johnson, A.E., Pollard, T.J., Greenbaum, N.R., Lungren, M.P., Deng, C.y., Peng, Y., Lu, Z., Mark, R.G., Berkowitz, S.J., Horng, S.: Mimic-cxr-jpg, a large publicly available database of labeled chest radiographs. arXiv:1901.07042 (2019) 10 A.Rafferty et al

work page internal anchor Pith review arXiv 1901

[17] [17]

Medical Image Analysis88, 102846 (2023)

Kazerouni, A., Aghdam, E.K., Heidari, M., Azad, R., Fayyaz, M., Hacihaliloglu, I., Merhof, D.: Diffusion models in medical imaging: A comprehensive survey. Medical Image Analysis88, 102846 (2023)

work page 2023

[18] [18]

npj Digital Medicine4(12 2021)

Kompa, B., Snoek, J., Beam, A.: Second opinion needed: communicating uncer- tainty in medical machine learning. npj Digital Medicine4(12 2021)

work page 2021

[19] [19]

Kuhn, L., Gal, Y., Farquhar, S.: Semantic uncertainty: Linguistic invariances for uncertainty estimation in natural language generation (2023)

work page 2023

[20] [20]

Medical Image Analysis42, 60–88 (Dec 2017)

Litjens, G., Kooi, T., Bejnordi, B.E., Setio, A.A.A., Ciompi, F., Ghafoorian, M., van der Laak, J.A., van Ginneken, B., Sánchez, C.I.: A survey on deep learning in medical image analysis. Medical Image Analysis42, 60–88 (Dec 2017)

work page 2017

[21] [21]

In: ML for Healthcare Conference

McDermott, M.B., Hsu, T.M.H., Weng, W.H., Ghassemi, M., Szolovits, P.: Chexpert++: Approximating the chexpert labeler for speed, differentiability, and probabilistic output. In: ML for Healthcare Conference. pp. 913–927. PMLR (2020)

work page 2020

[22] [22]

In: European Conference on Computer Vision

Pérez-García, F., Bond-Taylor, S., Sanchez, P.P., van Breugel, B., Castro, D.C., Sharma, H., Salvatelli, V., Wetscherek, M.T., Richardson, H., Lungren, M.P., et al.: Radedit: stress-testing biomedical vision models via diffusion image editing. In: European Conference on Computer Vision. pp. 358–376. Springer (2024)

work page 2024

[23] [23]

SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis

Podell, D., English, Z., Lacey, K., Blattmann, A., Dockhorn, T., Müller, J., Penna, J., Rombach, R.: Sdxl: Improving latent diffusion models for high-resolution image synthesis. arXiv:2307.01952 (2023)

work page internal anchor Pith review Pith/arXiv arXiv 2023

[24] [24]

Posocco, N., Bonnefoy, A.: Estimating expected calibration errors (2021)

work page 2021

[25] [25]

Rafferty, A., Ramaesh, R., Rajan, A.: Corpa: Adversarial image generation for chest x-rays using concept vector perturbations and generative models (2025)

work page 2025

[26] [26]

In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

Rombach, R., Blattmann, A., Lorenz, D., Esser, P., Ommer, B.: High-resolution image synthesis with latent diffusion models. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 10684–10695 (2022)

work page 2022

[27] [27]

IEEE transactions on medical imaging (09 2015)

Roth, H., Lu, L., Liu, J., Yao, J., Seff, A., Kim, L., Summers, R.: Improving computer-aided detection using convolutional neural networks and random view aggregation. IEEE transactions on medical imaging (09 2015)

work page 2015

[28] [28]

Neurocomput.194(C), 87–94 (Jun 2016)

Shi, J., Zhou, S., Liu, X., Zhang, Q., Lu, M., Wang, T.: Stacked deep polynomial network based representation learning for tumor classification with small ultra- sound image dataset. Neurocomput.194(C), 87–94 (Jun 2016)

work page 2016

[29] [29]

Singhal, K., Azizi, S., Tu, T., Mahdavi, S.S., Wei, J., Chung, H.W., Scales, N., Tanwani, A., Cole-Lewis, H., Pfohl, S., et al., P.P.: Large language models encode clinical knowledge (2022)

work page 2022

[30] [30]

Sundaram, S., Hulkund, N.: Gan-based data augmentation for chest x-ray classifi- cation (2021)

work page 2021

[31] [31]

Tajbakhsh, N., Shin, J., Gurudu, S., Hurst, R.T., Kendall, C., Gotway, M., Liang, J.: Convolutional neural networks for medical image analysis: Fine tuning or full training? IEEE Transactions on Medical Imaging35, 1–1 (03 2016)

work page 2016

[32] [32]

Yang, L., Zhang, Z., Song, Y., Hong, S., Xu, R., Zhao, Y., Zhang, W., Cui, B., Yang,M.H.:Diffusionmodels:Acomprehensivesurveyofmethodsandapplications (2025)

work page 2025

[33] [33]

Nature Communications16(04 2025)

Zambrano Chaves, J.M., Huang, S.C., Xu, Y., Xu, H., Usuyama, N., Zhang, S., Wang, F., Xie, Y., Khademi, M., Yang, Z.e.a.: A clinically accessible small mul- timodal radiology model and evaluation metric for chest x-ray findings. Nature Communications16(04 2025)

work page 2025