Hallucination-Aware Diffusion Sampling for Inverse Problems via Robust Prior Updates
Pith reviewed 2026-06-28 15:09 UTC · model grok-4.3
The pith
Robust prior updates reduce measurement-conditioned hallucinations in diffusion inverse solvers.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Diffusion inverse solvers can be decomposed into a prior update and a measurement-conditioning step, and hallucinations can enter through the prior-side proposal before the measurement correction is applied. Robust Prior Update (RPU) probes the local stability of the diffusion prior update, re-anchors the resulting displacement at the current iterate, and leaves the measurement update unchanged. Instantiated in DPS on FFHQ box inpainting, Gaussian deblurring, and motion deblurring, RPU improves PSNR and LPIPS. In human studies on FFHQ box inpainting, RPU receives 91.9% of blind non-tie majority preferences and 91.1% of ground-truth-assisted non-tie preferences; the ImageNet Gaussian reader study is tie-heavy but favors RPU among non-tie cases.
What carries the argument
Robust Prior Update (RPU) module that probes local stability of the diffusion prior update and re-anchors the displacement at the current iterate.
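To make the probe-and-re-anchor idea concrete, here is a minimal sketch on a toy linear problem. It is not the paper's Algorithm 1, whose details are not reproduced on this page. The standard-normal prior, the closed-form `denoise`, the random probe direction, and the parameters `rho` and `zeta` are all illustrative assumptions.

```python
import numpy as np

def denoise(x, sigma):
    """Toy denoiser: E[x0 | x] for a standard-normal prior (assumption, not a trained model)."""
    return x / (1.0 + sigma**2)

def prior_displacement(x, sigma, sigma_next):
    """Displacement of one Euler step of the probability-flow ODE, prior term only."""
    return (sigma_next - sigma) * (x - denoise(x, sigma)) / sigma

def robust_prior_update(x, sigma, sigma_next, rho, rng):
    """Hypothetical probe-and-re-anchor step (an assumed reading of RPU)."""
    u = rng.standard_normal(x.shape)
    u /= np.linalg.norm(u) + 1e-12           # unit-norm probe direction
    x_probe = x + rho * u                    # probe a nearby point
    d_probe = prior_displacement(x_probe, sigma, sigma_next)
    return x + d_probe                       # re-anchor the displacement at x

def measurement_update(x_prior, x, y, A, sigma, zeta):
    """DPS-style correction for a linear operator A; untouched by the probe."""
    residual = y - A @ denoise(x, sigma)
    grad = -2.0 * (A.T @ residual) / (1.0 + sigma**2)  # d/dx ||y - A x0_hat(x)||^2
    return x_prior - zeta * grad

rng = np.random.default_rng(0)
A = np.eye(4)[:2]                            # observe the first two coordinates only
y = A @ rng.standard_normal(4)               # noiseless toy measurement
x = 3.0 * rng.standard_normal(4)             # start from high noise
sigmas = np.linspace(3.0, 0.05, 30)
for s, s_next in zip(sigmas[:-1], sigmas[1:]):
    x_prior = robust_prior_update(x, s, s_next, rho=0.1, rng=rng)
    x = measurement_update(x_prior, x, y, A, s, zeta=0.5)
```

With `rho=0` the robust step reduces to the plain prior step, which is a convenient sanity check that only the prior side has been modified.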
If this is right
- RPU improves PSNR and LPIPS over DPS on box inpainting, Gaussian deblurring, and motion deblurring for FFHQ images.
- In human judgments on FFHQ box inpainting, RPU receives 91.9% of blind non-tie majority preferences and 91.1% of ground-truth-assisted non-tie preferences.
- The improvement holds especially when the prior shapes weakly constrained content.
- The measurement-conditioning step remains unchanged while only the prior update is modified.
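As a reading aid for the preference figures above, the sketch below shows one plausible way a non-tie preference share is computed from per-image outcomes. The exact tie definition used in the paper is not given on this page, so the function and the counts are hypothetical placeholders, not the study's data.

```python
def non_tie_preference(wins_a, wins_b, ties):
    """Return (share of decided items won by A, overall tie rate).

    Assumes each item has already been reduced to one outcome:
    A preferred, B preferred, or tie (for example by rater majority).
    """
    decided = wins_a + wins_b
    total = decided + ties
    if decided == 0:
        raise ValueError("no decided items; the non-tie share is undefined")
    return wins_a / decided, ties / total

# Placeholder counts only, NOT values from the paper.
share_a, tie_rate = non_tie_preference(wins_a=9, wins_b=1, ties=10)
```

Reporting the tie rate next to the non-tie share matters because a high share computed over few decided items carries much less weight than the same share over many.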
Where Pith is reading between the lines
- RPU could be instantiated in other Bayes-rule-based diffusion inverse solvers beyond DPS to test whether the faithfulness gain generalizes.
- The separation into prior and measurement steps suggests that similar stability checks might apply to non-diffusion generative models used for inverse problems.
- Evaluating RPU on additional inverse tasks such as super-resolution or compressed sensing would clarify the range of problems where prior robustness matters most.
Load-bearing premise
Hallucinations enter diffusion inverse solvers through the prior-side proposal before the measurement correction is applied.
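For orientation, the split this premise relies on can be written out for DPS. This is a sketch in the notation of the DPS paper (Chung et al., ICLR 2023), not the reviewed paper's own equations. The symbols $s_\theta$, $\bar\alpha_t$, $\zeta_t$, $\mathcal{A}$, and the generic $\mathrm{ReverseStep}$ are assumptions introduced here.

```latex
% Bayes rule at noise level t: the posterior score splits into a prior and a likelihood term.
\nabla_{x_t} \log p_t(x_t \mid y)
  = \nabla_{x_t} \log p_t(x_t) + \nabla_{x_t} \log p_t(y \mid x_t)

% Prior-side proposal: an unconditional reverse step driven by
% s_\theta(x_t, t) \approx \nabla_{x_t} \log p_t(x_t).
x'_{t-1} = \mathrm{ReverseStep}\big(x_t,\; s_\theta(x_t, t)\big)

% Measurement correction through the Tweedie estimate of the clean image.
\hat{x}_0(x_t) = \frac{1}{\sqrt{\bar\alpha_t}} \Big( x_t + (1 - \bar\alpha_t)\, s_\theta(x_t, t) \Big)

x_{t-1} = x'_{t-1} - \zeta_t \, \nabla_{x_t} \big\| y - \mathcal{A}\big(\hat{x}_0(x_t)\big) \big\|_2^2
```

Both terms are evaluated at the same $x_t$, and the measurement gradient passes through $s_\theta$ via $\hat{x}_0$. The "prior first, correction second" reading is therefore an algebraic split of one update rather than two independent stages.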
What would settle it
Applying RPU to DPS on the FFHQ and ImageNet inverse problems and finding no improvement or a decrease in PSNR, LPIPS, or human faithfulness preferences would falsify the claim.
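For the PSNR part of such a check, a minimal sketch of the metric is given below. The function name and the `data_range` parameter are assumptions for illustration; LPIPS needs a pretrained network and is not sketched here.

```python
import numpy as np

def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB for arrays scaled to [0, data_range]."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range**2 / mse))

# Toy arrays only: a constant error of 0.1 on a [0, 1] scale gives MSE = 0.01.
reference = np.zeros((8, 8))
estimate = np.full((8, 8), 0.1)
value_db = psnr(reference, estimate)
```

A falsification test would compare per-image PSNR for DPS and for DPS with RPU on the same measurements and seeds, so the difference reflects the prior update alone.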
Original abstract
Diffusion-based inverse problem solvers can produce realistic reconstructions, but realism alone does not ensure that the recovered details are supported by the measurement. We study this failure as measurement-conditioned hallucination: visually meaningful content that is either implausible or inconsistent with the measured instance. Our analysis separates Bayes-rule-based diffusion inverse solvers into a prior update and a measurement-conditioning step, showing that hallucinated content can enter through the prior-side proposal before the measurement correction is applied. Motivated by this view, we propose Robust Prior Update (RPU), a solver-level module that probes the local stability of the diffusion prior update, re-anchors the resulting displacement at the current iterate, and leaves the measurement update unchanged. We instantiate RPU in DPS and evaluate it on FFHQ and ImageNet inverse problems using automatic metrics and human faithfulness studies. On FFHQ, RPU improves PSNR and LPIPS over DPS across box inpainting, Gaussian deblurring, and motion deblurring. In human judgments, RPU receives 91.9% of blind non-tie majority preferences and 91.1% of ground-truth-assisted non-tie preferences on FFHQ box inpainting, while the ImageNet Gaussian reader study is tie-heavy but favors RPU among non-tie cases. These results support a targeted claim: robustifying the prior update can improve instance faithfulness in diffusion inverse solvers, especially when the prior shapes weakly constrained content.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper analyzes measurement-conditioned hallucinations in Bayes-rule-based diffusion inverse solvers by separating them into an independent prior update (where hallucinations can enter) and a measurement-conditioning step. It proposes Robust Prior Update (RPU), a module that probes local stability of the prior update, re-anchors the displacement, and leaves the measurement update unchanged. RPU is instantiated in DPS and evaluated on FFHQ and ImageNet inverse problems (box inpainting, Gaussian/motion deblurring), reporting PSNR/LPIPS gains and strong human-study preferences (e.g., 91.9% non-tie majority on FFHQ box inpainting).
Significance. If the separation and attribution hold, RPU offers a lightweight, solver-level intervention that improves instance faithfulness without altering the measurement term, supported by both automatic metrics and human faithfulness studies. The targeted claim about prior-side robustness for weakly constrained content is falsifiable via the reported protocols.
major comments (2)
- [§2] Analysis of Bayes-rule solvers: the separation into an independent prior update followed by measurement correction is load-bearing for the motivation, yet the standard DPS formulation combines the unconditional score and measurement gradient into a single update at each timestep. This coupling means inconsistent content is not cleanly proposed first and then corrected, which weakens the attribution that robustifying only the prior proposal selectively suppresses hallucinations.
- [§4.2] Human study protocol: the reported 91.9% and 91.1% non-tie preferences on FFHQ box inpainting are central to the faithfulness claim, but the manuscript does not detail how ties are defined, how ground-truth-assisted judgments are elicited, or what the inter-rater agreement is. Without these, the preference percentages cannot be interpreted as evidence that RPU improves instance faithfulness over DPS.
minor comments (2)
- [Tables 1-2] Clarify whether the reported PSNR/LPIPS deltas are statistically significant across the 5 random seeds or runs mentioned in the experimental setup.
- [Alg. 1] Notation: the definition of the stability probe in the RPU algorithm (Alg. 1) uses an unspecified threshold; make the hyperparameter explicit and report its sensitivity.
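One way to answer the significance question in the first minor comment is a paired test over seeds. The sketch below computes a paired t statistic with the standard library. The function name and all numbers are placeholders, not values from the paper's tables.

```python
import math
import statistics

def paired_t_statistic(metric_a, metric_b):
    """Paired t statistic for per-seed metrics of two methods (same seeds, same order)."""
    if len(metric_a) != len(metric_b) or len(metric_a) < 2:
        raise ValueError("need two equal-length sequences with at least two seeds")
    diffs = [a - b for a, b in zip(metric_a, metric_b)]
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))  # sample std (n - 1)
    if std_err == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    return mean_diff / std_err

# Placeholder per-seed PSNR values only, NOT values from the paper.
psnr_method_a = [2.0, 4.0, 6.0, 8.0, 10.0]
psnr_method_b = [1.0, 2.0, 3.0, 4.0, 5.0]
t_stat = paired_t_statistic(psnr_method_a, psnr_method_b)
```

With 5 seeds there are 4 degrees of freedom, so a two-sided test at the 0.05 level compares |t| against roughly 2.776. With so few seeds, reporting the per-seed differences themselves is also informative.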
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive comments. Below we respond point-by-point to the two major comments. We agree that additional clarification and protocol details are needed and will revise the manuscript accordingly.
- Referee: [§2] Analysis of Bayes-rule solvers: the separation into an independent prior update followed by measurement correction is load-bearing for the motivation, yet the standard DPS formulation combines the unconditional score and measurement gradient into a single update at each timestep. This coupling means inconsistent content is not cleanly proposed first and then corrected, which weakens the attribution that robustifying only the prior proposal selectively suppresses hallucinations.
  Authors: We acknowledge that the DPS update rule is a single combined step. Our Section 2 analysis nevertheless decomposes the update mathematically into the prior-update term (unconditional score) and the measurement-gradient term to isolate where hallucinated content can first appear. RPU is applied only to the prior component and leaves the measurement term untouched. To address the coupling concern, we will add an explicit paragraph to the revised Section 2 explaining how the two terms interact within the combined update and why selectively stabilizing the prior term remains a targeted intervention.
  Revision: partial
- Referee: [§4.2] Human study protocol: the reported 91.9% and 91.1% non-tie preferences on FFHQ box inpainting are central to the faithfulness claim, but the manuscript does not detail how ties are defined, how ground-truth-assisted judgments are elicited, or what the inter-rater agreement is. Without these, the preference percentages cannot be interpreted as evidence that RPU improves instance faithfulness over DPS.
  Authors: The referee is correct that the current manuscript lacks sufficient protocol details. In the revision we will expand Section 4.2 with: (i) the exact definition used to classify a judgment as a tie, (ii) the step-by-step procedure for eliciting ground-truth-assisted judgments, and (iii) inter-rater agreement statistics. These additions will allow readers to properly interpret the reported preference percentages.
  Revision: yes
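The inter-rater agreement statistic promised in the second response could take several forms. As an illustration only, the sketch below computes Cohen's kappa for two raters. The rater labels are placeholders, and a study with more than two raters would more likely report a multi-rater statistic such as Fleiss' kappa; which one the authors will use is not stated.

```python
from collections import Counter

def cohens_kappa(labels_1, labels_2):
    """Cohen's kappa: chance-corrected agreement between two raters on the same items."""
    if len(labels_1) != len(labels_2) or not labels_1:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_1)
    observed = sum(a == b for a, b in zip(labels_1, labels_2)) / n
    counts_1, counts_2 = Counter(labels_1), Counter(labels_2)
    # Agreement expected by chance from each rater's marginal label frequencies.
    expected = sum(counts_1[k] * counts_2[k] for k in counts_1) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)

# Placeholder judgments only, NOT data from the paper's reader study.
rater_1 = ["RPU", "RPU", "DPS", "DPS"]
rater_2 = ["RPU", "DPS", "DPS", "DPS"]
kappa = cohens_kappa(rater_1, rater_2)
```

Here the raters agree on 3 of 4 items and chance agreement is 0.5, so kappa is 0.5. Tie labels can be included as a third category without changing the function.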
Circularity Check
No circularity; additive module with independent analysis
The provided abstract and context present the core contribution as an analysis that separates Bayes-rule-based solvers into prior update and measurement-conditioning steps, followed by an additive RPU module that probes stability in the prior update only. No equations, fitted parameters, or self-citations are exhibited that reduce the claimed separation or the RPU construction to its own inputs by definition. The derivation does not rename known results, smuggle ansatzes via self-citation, or treat a fitted quantity as a prediction. The central claim remains an empirical proposal supported by external evaluations rather than a self-referential re-derivation.