Generalizability vs. Robustness: Adversarial Examples for Medical Imaging
Abstract
In this paper, for the first time, we propose an evaluation method for deep learning models that assesses the performance of a model not only in an unseen test scenario, but also in extreme cases of noise, outliers and ambiguous input data. To this end, we utilize adversarial examples, images that fool machine learning models, while looking imperceptibly different from original data, as a measure to evaluate the robustness of a variety of medical imaging models. Through extensive experiments on skin lesion classification and whole brain segmentation with state-of-the-art networks such as Inception and UNet, we show that models that achieve comparable performance regarding generalizability may have significant variations in their perception of the underlying data manifold, leading to an extensive performance gap in their robustness.
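The abstract names the evaluation idea but not a specific attack. As an illustration only, the sketch below shows how a simple gradient-based attack (FGSM) could be used to measure the gap between clean and adversarial accuracy as a robustness proxy for a classifier; the attack choice, epsilon value, and function names are assumptions for illustration, not the paper's actual protocol.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, images, labels, epsilon=0.01):
    """Generate FGSM adversarial examples for a classifier.

    epsilon is the per-pixel perturbation budget; the attack and its
    parameters here are illustrative, not the paper's exact setup.
    """
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    # Perturb each pixel in the direction that increases the loss.
    adv_images = images + epsilon * images.grad.sign()
    return adv_images.clamp(0, 1).detach()

def robustness_gap(model, images, labels, epsilon=0.01):
    """Accuracy drop between clean and adversarial inputs (robustness proxy)."""
    model.eval()
    with torch.no_grad():
        clean_acc = (model(images).argmax(dim=1) == labels).float().mean()
    adv = fgsm_attack(model, images, labels, epsilon)
    with torch.no_grad():
        adv_acc = (model(adv).argmax(dim=1) == labels).float().mean()
    return (clean_acc - adv_acc).item()
```

Under this framing, two models with similar clean test accuracy can be compared by their robustness gap: the model whose accuracy drops less under the same perturbation budget is considered more robust.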
Forward citations
Cited by 1 Pith paper
ROAST: Risk-aware Outlier-exposure for Adversarial Selective Training of Anomaly Detectors Against Evasion Attacks
ROAST selectively trains anomaly detectors on less vulnerable patient data with targeted outlier exposure, boosting recall by 16.2% in black-box settings and reducing training time by 88.3%.