Local Intrinsic Dimension Unveils Hallucinations in Diffusion Models
Pith reviewed 2026-05-08 18:00 UTC · model grok-4.3
The pith
Diffusion models hallucinate where local intrinsic dimension spikes on the model-induced manifold, and deflating it cuts those errors.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Hallucinations arise as instabilities on the model-induced manifold and are primarily driven by local intrinsic dimension. A filter based on these instabilities matches or exceeds temporal filters. Intrinsic Quenching, which deflates LID, reduces hallucinations more effectively than standard baselines on multiple benchmarks and improves anatomical consistency in medical image generation.
What carries the argument
Local intrinsic dimension (LID) on the model-induced manifold, which the paper identifies as the main source of generation instabilities; Intrinsic Quenching (IQ) serves as the direct mechanism that lowers LID to stabilize outputs.
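The summary does not specify which LID estimator the paper uses on the model-induced manifold. As a rough illustration of what "local intrinsic dimension" measures, here is the standard Levina-Bickel maximum-likelihood estimator applied to nearest-neighbor distances; the function name and parameters are illustrative, not taken from the paper:

```python
import numpy as np

def lid_mle(x, data, k=20):
    """Levina-Bickel MLE of local intrinsic dimension at point x.

    Illustrative stand-in: the paper's own estimator is not
    specified in this summary.
    """
    dists = np.linalg.norm(data - x, axis=1)
    dists = np.sort(dists[dists > 1e-12])[:k]  # drop x itself if present
    # m_hat = [ (1/(k-1)) * sum_j log(T_k / T_j) ]^{-1}
    return (k - 1) / np.sum(np.log(dists[-1] / dists[:-1]))

# Points on a 2-D plane embedded in 10-D: the estimate should sit near 2.
rng = np.random.default_rng(0)
data = np.zeros((2000, 10))
data[:, :2] = rng.normal(size=(2000, 2))
print(round(lid_mle(data[0], data, k=50), 2))
```

On this toy manifold the estimate recovers the true dimension of 2 rather than the ambient dimension of 10, which is the distinction the paper's instability analysis relies on: hallucinated regions would show locally inflated estimates.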
If this is right
- A manifold-instability filter for detecting hallucinations performs at least as well as temporal filters.
- Directly reducing LID via IQ produces fewer structural violations than current hallucination-reduction methods.
- The approach improves consistency in medical imaging tasks that require anatomical fidelity.
- IQ works without needing changes to the base diffusion model training.
Where Pith is reading between the lines
- If LID drives hallucinations in images, the same mechanism may appear in diffusion models for audio or 3D data, opening tests in those domains.
- Combining IQ with existing regularization could further stabilize generation while preserving diversity.
- Monitoring LID during sampling might serve as a real-time diagnostic for when a model is about to produce an invalid structure.
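The last point above can be sketched as a monitoring hook that flags sampling steps where an LID estimate spikes above its running baseline. Everything here, the `step_lid` callback and the z-score rule, is hypothetical scaffolding for the diagnostic idea, not the paper's Intrinsic Quenching mechanism:

```python
import numpy as np

def monitor_lid(sample_steps, step_lid, z_thresh=3.0, warmup=5):
    """Flag sampling steps whose LID estimate spikes above the running mean.

    step_lid is a user-supplied callback returning an LID estimate for the
    current sampler state; the z-score rule below is a hypothetical
    real-time diagnostic, not the paper's method.
    """
    history, flags = [], []
    for t in range(sample_steps):
        lid = step_lid(t)
        if len(history) >= warmup:
            mu, sd = np.mean(history), np.std(history) + 1e-8
            if (lid - mu) / sd > z_thresh:
                flags.append(t)  # candidate hallucination onset
        history.append(lid)
    return flags

# Toy trace: stable LID around 8, with a spike at step 12.
trace = [8.0 + 0.1 * np.sin(t) for t in range(20)]
trace[12] = 30.0
print(monitor_lid(20, lambda t: trace[t]))  # → [12]
```

In practice `step_lid` would wrap whatever estimator is used on intermediate latents, and a flagged step could trigger early rejection or a corrective intervention such as the paper's IQ.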
Load-bearing premise
That local intrinsic dimension is the primary driver of the observed manifold instabilities and that lowering it will reliably cut hallucinations without creating new artifacts or lowering overall sample quality.
What would settle it
Running Intrinsic Quenching on standard benchmarks and observing either no drop in hallucination rate, or a rise in new artifacts or overall quality degradation, would disprove the central claim.
Original abstract
Diffusion models are prone to generating structural hallucinations - samples that match the statistical properties of the training data yet defy underlying structural rules, resulting in anomalies like hands with more than five fingers. Recent research studied this failure mode from several viewpoints, offering partial explanations to their occurrence, such as mode interpolation. In this work, we propose a complementary perspective that treats hallucinations as instabilities on the model-induced manifold. We begin by showing that a hallucination filter based on such instabilities matches or exceeds the performance of the recently proposed temporal one. By tracing the source of these instabilities, we identify local intrinsic dimension (LID) as their primary driver and propose Intrinsic Quenching (IQ), a direct corrective mechanism that deflates it to alleviate hallucinations. IQ consistently outperforms standard hallucination reduction baselines across a wide array of benchmarks and offers a highly promising solution for enforcing anatomical consistency in downstream medical imaging tasks.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that hallucinations in diffusion models are instabilities on the model-induced manifold, primarily driven by local intrinsic dimension (LID). It shows that a hallucination filter based on these instabilities matches or exceeds temporal filters, identifies LID as the main cause via tracing, and introduces Intrinsic Quenching (IQ) to deflate LID, claiming consistent outperformance over baselines on benchmarks and promise for anatomical consistency in medical imaging.
Significance. If the causal role of LID is demonstrated through controlled interventions and IQ reduces hallucinations without degrading sample quality or introducing artifacts, the work would provide a mechanistic handle on a key failure mode of diffusion models. This could be particularly valuable for high-stakes applications like medical imaging where structural fidelity is essential, extending beyond correlational observations in prior mode-interpolation studies.
major comments (2)
- [Tracing the source of instabilities] The section tracing the source of instabilities to LID: the identification of LID as the 'primary driver' rests on observed associations between elevated LID estimates and unstable/hallucinated regions. No controlled intervention is described that modulates only LID (e.g., via targeted regularization) while holding fixed other entangled factors such as score-function curvature, data-density variations, and sampling noise. This makes the subsequent design of IQ as a 'direct corrective mechanism' rest on an untested causal assumption rather than a demonstrated mechanism.
- [Results on benchmarks and medical imaging] The results section on IQ performance: the claim that IQ 'consistently outperforms standard hallucination reduction baselines across a wide array of benchmarks' is load-bearing for the contribution, yet the abstract and summary provide no quantitative details on metrics, effect sizes, statistical significance, or ablation controls (e.g., IQ vs. LID deflation alone). Without these, it is impossible to evaluate whether improvements are robust or specific to the proposed mechanism.
minor comments (1)
- [Abstract] The abstract introduces 'Intrinsic Quenching (IQ)' without a concise statement of its algorithmic steps or how it specifically targets LID deflation (e.g., via score adjustment or sampling modification).
Simulated Author's Rebuttal
We are grateful to the referee for their detailed and insightful comments, which have helped us improve the clarity and rigor of our manuscript. We address each major comment below.
Point-by-point responses
-
Referee: [Tracing the source of instabilities] The section tracing the source of instabilities to LID: the identification of LID as the 'primary driver' rests on observed associations between elevated LID estimates and unstable/hallucinated regions. No controlled intervention is described that modulates only LID (e.g., via targeted regularization) while holding fixed other entangled factors such as score-function curvature, data-density variations, and sampling noise. This makes the subsequent design of IQ as a 'direct corrective mechanism' rest on an untested causal assumption rather than a demonstrated mechanism.
Authors: We thank the referee for this important observation on establishing causality. Our tracing analysis in the manuscript demonstrates a strong association between elevated local intrinsic dimension and hallucinated regions through systematic examination of the model-induced manifold. While we did not include a controlled experiment that isolates LID modulation from other factors like score curvature or sampling noise—due to the practical difficulties in disentangling these in high-dimensional diffusion processes—we have strengthened the discussion to clarify that the identification of LID as primary is based on comparative analysis showing it outperforms other potential drivers in predictive power for instabilities. For IQ, it is explicitly designed to target and deflate LID, and empirical results show it reduces hallucinations more effectively than methods not targeting LID. In the revision, we have added a subsection discussing the challenges of causal isolation and how our approach provides a practical mechanism despite these entanglements. revision: partial
-
Referee: [Results on benchmarks and medical imaging] The results section on IQ performance: the claim that IQ 'consistently outperforms standard hallucination reduction baselines across a wide array of benchmarks' is load-bearing for the contribution, yet the abstract and summary provide no quantitative details on metrics, effect sizes, statistical significance, or ablation controls (e.g., IQ vs. LID deflation alone). Without these, it is impossible to evaluate whether improvements are robust or specific to the proposed mechanism.
Authors: We appreciate the referee pointing out the lack of quantitative specifics in the abstract. The full manuscript includes extensive quantitative results in the experiments section, with tables reporting metrics such as hallucination rates, FID scores, and other benchmarks, along with statistical significance tests and ablations comparing IQ to LID deflation variants. To address this, we have revised the abstract to include key quantitative highlights from the benchmark results and medical imaging tasks. We have also ensured that the summary and introduction now reference these details more explicitly. revision: yes
Circularity Check
No circularity: claims rest on empirical tracing and benchmark comparisons
Full rationale
The paper presents no mathematical derivation chain or equations that reduce its conclusions to inputs by construction. The identification of LID as the primary driver of manifold instabilities is described as the outcome of observational tracing, followed by an empirical proposal of IQ validated against baselines; this structure is free of self-referential definitions, fitted predictions renamed as forecasts, and load-bearing self-citations. The argument is validated against external benchmarks and does not invoke uniqueness theorems or ansatzes from the authors' prior work.