CHEM: Estimating and Understanding Hallucinations in Deep Learning for Image Processing

Gitta Kutyniok; Ines Rosellon-Inclan; Jean-Luc Starck; Jianfei Li

arxiv: 2512.09806 · v2 · pith:E2IARPAMnew · submitted 2025-12-10 · 💻 cs.CV · cs.AI


Pith reviewed 2026-05-21 17:34 UTC · model grok-4.3


keywords: hallucination · deep learning · image reconstruction · wavelet · shearlet · conformal prediction · U-Net · astronomical imaging


The pith

CHEM identifies hallucination-prone regions in deep learning image reconstructions using wavelet and shearlet features plus conformal quantile regression.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a framework to quantify and locate hallucinations, or unrealistic artifacts, that deep learning models produce during image reconstruction. Such artifacts matter in safety-critical uses like astronomy because they can distort analysis of real data. The approach projects predictions into wavelet and shearlet bases to isolate feature-level regions likely to contain hallucinations. Conformalized quantile regression then supplies distribution-free estimates of hallucination severity. The authors further show through approximation theory why U-shaped networks commonly used for these tasks tend to generate hallucinated outputs.

Core claim

The paper establishes that the Conformal Hallucination Estimation Metric (CHEM) localizes hallucination-prone regions at the level of image features by means of wavelet and shearlet representations and assesses hallucination levels via conformalized quantile regression in a distribution-free manner. A theoretical analysis characterizes CHEM's sensitivity to hallucinated artifacts and its connection to mean squared error. Adopting an approximation-theory viewpoint, the work explains why U-shaped networks tend to produce hallucination-prone predictions.
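
As background for the CHEM-to-MSE connection, and not as a restatement of the paper's theorem, the following standard identities show why a coefficient-domain quantity can be compared with pixel-domain MSE at all. The symbols $e$, $\psi_k$, $\varphi_k$, $A$, $B$ and $N$ are notation assumed here for illustration; they are not taken from the paper.

```latex
% e = \hat{x} - x is the reconstruction error on an image with N pixels.
% Orthonormal wavelet basis \{\psi_k\} (Parseval identity):
\mathrm{MSE} \;=\; \frac{1}{N}\,\|e\|_2^2 \;=\; \frac{1}{N}\sum_k |\langle e, \psi_k\rangle|^2 .
% Redundant frame \{\varphi_k\} with bounds 0 < A \le B (the usual setting for shearlet systems):
A\,\|e\|_2^2 \;\le\; \sum_k |\langle e, \varphi_k\rangle|^2 \;\le\; B\,\|e\|_2^2 .
```

In the orthonormal case, restricting the sum to a subset of coefficients (for example, detail scales only) gives a lower bound on $\|e\|_2^2$, which is one way a feature-level score can be related to MSE.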

What carries the argument

The Conformal Hallucination Estimation Metric (CHEM), which combines wavelet and shearlet representations for feature localization with conformalized quantile regression for distribution-free hallucination assessment.
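
The conformal step can be sketched as follows. This is a minimal illustration of split conformalized quantile regression in the sense of Romano et al. (2019), not the paper's implementation: the function names, the toy Gaussian targets, and the fixed band `[-0.5, 0.5]` standing in for a trained quantile regressor are all assumptions made for the example.

```python
import math
import random


def conformal_correction(y_cal, lo_cal, hi_cal, alpha):
    """Return the CQR correction Q from calibration targets and predicted bands."""
    # Conformity score: how far each target falls outside its predicted band
    # (negative when the target lies strictly inside).
    scores = sorted(max(lo - y, y - hi) for y, lo, hi in zip(y_cal, lo_cal, hi_cal))
    n = len(scores)
    # Finite-sample rank ceil((n + 1) * (1 - alpha)). If it exceeds n, only an
    # infinite interval keeps the coverage guarantee.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return scores[k - 1]


def calibrated_interval(lo, hi, q):
    """Widen (or shrink, if q < 0) a predicted band by the correction q."""
    return lo - q, hi + q


# Toy calibration set (hypothetical): noisy targets and a band that is
# deliberately too narrow, as a poorly fitted quantile regressor might produce.
rng = random.Random(0)
y_cal = [rng.gauss(0.0, 1.0) for _ in range(500)]
lo_cal = [-0.5] * 500
hi_cal = [0.5] * 500

q = conformal_correction(y_cal, lo_cal, hi_cal, alpha=0.1)
lo, hi = calibrated_interval(-0.5, 0.5, q)
covered = sum(lo <= y <= hi for y in y_cal) / len(y_cal)
print(f"correction={q:.3f}, calibration coverage={covered:.3f}")
```

In a CHEM-like setting the targets would presumably be wavelet or shearlet coefficients rather than scalars drawn from a Gaussian; the calibration logic is the same per coefficient.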

If this is right

CHEM can highlight specific regions within a model's output image that are most likely to contain hallucinations.
The method supplies a distribution-free score for comparing hallucination tendencies across different reconstruction architectures.
Approximation theory analysis indicates that U-shaped networks inherently favor predictions containing hallucinations in reconstruction settings.
The framework applies to both astronomical image deconvolution on datasets such as CANDELS and natural-image super-resolution on datasets such as DIV2K.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

CHEM could be inserted into training loops to penalize hallucination-prone regions and encourage more reliable architectures.
The same wavelet-shearlet plus conformal pipeline might transfer to other inverse imaging problems such as denoising or inpainting.
Linking CHEM scores to downstream task performance could yield practical uncertainty maps for scientific image analysis pipelines.

Load-bearing premise

Hallucinated artifacts remain distinguishable from true signal features once projected into wavelet and shearlet bases, allowing conformal quantile regression to isolate them without being confounded by model biases or dataset artifacts.

What would settle it

In experiments on images with synthetically inserted known hallucinations, CHEM fails to assign high scores to the modified regions or its scores do not correlate with the size of the introduced artifacts.
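
A synthetic check of this kind can be sketched in a few lines. The example below is a stand-in, not CHEM: it uses a one-level Haar transform and scores each 2x2 block by the detail energy of the residual against a known ground truth, which CHEM does not require at test time. The function name `haar_detail_energy`, the 8x8 ramp image, and the single spurious pixel are assumptions chosen for illustration.

```python
def haar_detail_energy(img):
    """One-level 2D Haar transform; return per-block detail energy.

    img is a list of rows with even height and width. Each 2x2 block
    [[a, b], [c, d]] yields three orthonormal detail coefficients, and the
    returned map holds the sum of their squares for that block.
    """
    h, w = len(img), len(img[0])
    energy = []
    for i in range(0, h, 2):
        row = []
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            horiz = (a + b - c - d) / 2.0  # top row vs. bottom row
            vert = (a - b + c - d) / 2.0   # left column vs. right column
            diag = (a - b - c + d) / 2.0   # diagonal difference
            row.append(horiz ** 2 + vert ** 2 + diag ** 2)
        energy.append(row)
    return energy


# Smooth ground truth and a prediction with one inserted artifact.
truth = [[0.1 * i + 0.05 * j for j in range(8)] for i in range(8)]
pred = [row[:] for row in truth]
pred[5][2] += 1.0  # synthetic "hallucination": one spurious bright pixel

residual = [[p - t for p, t in zip(pr, tr)] for pr, tr in zip(pred, truth)]
energy = haar_detail_energy(residual)

peak = max(
    ((i, j) for i in range(len(energy)) for j in range(len(energy[0]))),
    key=lambda ij: energy[ij[0]][ij[1]],
)
print("block with largest detail energy:", peak)  # (2, 1), the block containing pixel (5, 2)
```

A real test of the claim would replace the residual-energy score with CHEM itself and vary the size and amplitude of the inserted artifact to check that the score tracks both.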

Figures

Figures reproduced from arXiv: 2512.09806 by Gitta Kutyniok, Ines Rosellon-Inclan, Jean-Luc Starck, Jianfei Li.

**Figure 2.** U-shaped network architectures. The foundational com…

**Figure 3.** Quantifying hallucinations of a U-Net trained with…

**Figure 4.** Quantifying hallucinations of U-shaped networks trained with different loss functions using db8. The predicted images are…

**Figure 5.** MSE/CHEM-FWHM curves under different dictionaries. This figure illustrates the effect of the chosen representation.

**Figure 6.** Image deconvolution task on the CANDELS dataset.

**Figure 7.** MSE/CHEM-FWHM curves. We analyze the changes in…

**Figure 8.** Evolution of the db8-based CHEM and training loss over…

**Figure 9.** Point spread functions (PSFs) with varying FWHM val…

**Figure 10.** Example: Tikhonet with a U-shaped denoising module. For SUNet, blue filled rectangles indicate Swin-Transformer blocks,…

**Figure 11.** Reproduction of the analysis in Figure 4, now including coarse-scale coefficients. Incorporating all coefficients produces broader,…

Original abstract

Deep learning-based methods have recently achieved significant success in image reconstruction problems. However, challenges have emerged, as these methods may generate unrealistic artifacts or hallucinations, which can interfere with analysis in safety-critical scenarios. This paper introduces a framework for quantifying and characterizing hallucinated artifacts in image reconstruction models. The proposed method, termed the Conformal Hallucination Estimation Metric (CHEM), enables the identification of hallucination-prone regions in model predictions. It leverages wavelet and shearlet representations to localize such regions at the level of image features, and uses conformalized quantile regression to assess hallucination levels in a distribution-free manner. A theoretical analysis is provided, characterizing the sensitivity of CHEM to hallucinated artifacts and its relationship to the mean squared error. Building on these insights and adopting a viewpoint grounded in approximation theory, we investigate why U-shaped networks, widely used architectures for image reconstruction, tend to produce hallucination-prone predictions. We assess the effectiveness of the proposed approach on astronomical image deconvolution using the CANDELS dataset with architectures such as U-Net, SwinUNet, and Learnlets, and on natural image super-resolution using the DIV2K dataset with models such as DRUNet, Unfolded DRS, RAM, and DPS.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

CHEM combines wavelets with conformal quantile regression to flag hallucination regions in image reconstruction, but the separation between artifacts and real features in coefficient space is not directly validated.


The paper's core offering is CHEM, a metric that localizes hallucination-prone areas in reconstructed images using wavelet and shearlet bases, then scores them with conformalized quantile regression in a distribution-free way. It also sketches a theoretical link to mean squared error and an approximation-theory account of why U-Nets produce hallucinations. The authors test the approach on astronomical deconvolution with CANDELS and natural-image super-resolution with DIV2K across several models, including U-Net variants and DRUNet.

Referee Report

3 major / 2 minor

Summary. The paper introduces the Conformal Hallucination Estimation Metric (CHEM) for quantifying and localizing hallucinations in deep learning models for image reconstruction. CHEM combines wavelet and shearlet representations to identify hallucination-prone regions at the feature level with conformalized quantile regression to provide distribution-free hallucination level assessment. It includes a theoretical analysis characterizing CHEM's sensitivity to hallucinations and its relation to mean squared error, plus an approximation-theory explanation for why U-shaped networks tend to produce hallucination-prone outputs. The approach is evaluated on astronomical image deconvolution using the CANDELS dataset with U-Net, SwinUNet, and Learnlets, and on natural image super-resolution using DIV2K with DRUNet, Unfolded DRS, RAM, and DPS.

Significance. If the central claims hold, CHEM offers a principled, distribution-free tool for detecting and understanding hallucinations in image processing models, which is valuable for safety-critical domains such as astronomy. The combination of standard multiscale bases with conformal quantile regression provides a concrete way to localize issues without strong parametric assumptions, and the theoretical links to MSE and approximation theory supply useful insight into architectural tendencies. The dual-domain evaluation (astronomical and natural images) strengthens the empirical grounding.

major comments (3)

[Theoretical analysis] The abstract states a theoretical analysis relating CHEM to mean squared error and an approximation-theory explanation for U-Net hallucinations, but without the full derivations or explicit error bounds it is impossible to verify whether the central claims hold.
[Method and sensitivity analysis] The central claim requires that hallucinated artifacts produce reliably separable signatures in wavelet and shearlet coefficients so that conformal quantile regression can isolate them; the sensitivity analysis links CHEM to MSE but does not derive or empirically test a separation margin against ground-truth artifacts.
[Experiments] Post-hoc dataset choices on CANDELS appear in the experimental setup; this undermines the cross-dataset robustness claim for hallucination identification.

minor comments (2)

[§3] Clarify the precise definition of the conformal quantile regression thresholds and how they are computed from the calibration set.
[Discussion] Add a short discussion of computational overhead for the wavelet/shearlet transforms and conformal step relative to baseline inference.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback on our manuscript. We address each major comment below, providing clarifications and indicating planned revisions where appropriate to strengthen the presentation of our results.


Referee: [Theoretical analysis] The abstract states a theoretical analysis relating CHEM to mean squared error and an approximation-theory explanation for U-Net hallucinations, but without the full derivations or explicit error bounds it is impossible to verify whether the central claims hold.

Authors: We thank the referee for highlighting this point. The relation between CHEM and MSE is derived in Section 4 using the properties of conformal quantile regression applied to the multiscale coefficients, and the approximation-theory argument for U-shaped networks is developed from the perspective of how such architectures approximate high-frequency components. To make verification straightforward, we will insert the key derivation steps and explicit error bounds into the main text of the revised manuscript (with full proofs remaining in the appendix).

Revision: yes

Referee: [Method and sensitivity analysis] The central claim requires that hallucinated artifacts produce reliably separable signatures in wavelet and shearlet coefficients so that conformal quantile regression can isolate them; the sensitivity analysis links CHEM to MSE but does not derive or empirically test a separation margin against ground-truth artifacts.

Authors: The referee correctly notes that separability in the coefficient domain is central to the method. The existing sensitivity analysis shows how CHEM increases under perturbations that mimic hallucinations and connects this increase to MSE. We will strengthen the revision by adding both a theoretical derivation of a separation margin in the wavelet/shearlet domain and an empirical evaluation on synthetic ground-truth artifacts with controlled hallucination locations.

Revision: yes

Referee: [Experiments] Post-hoc dataset choices on CANDELS appear in the experimental setup; this undermines the cross-dataset robustness claim for hallucination identification.

Authors: We respectfully disagree with the characterization of post-hoc selection. The CANDELS dataset was selected a priori as a standard benchmark for astronomical deconvolution tasks, with the full experimental protocol (including model architectures, training procedures, and evaluation metrics) fixed before any results were obtained. The dual evaluation on CANDELS and DIV2K was designed from the outset to demonstrate applicability across domains. To improve clarity, we will expand the experimental section with an explicit statement of the pre-specified dataset rationale and protocol.

Revision: partial

Circularity Check

0 steps flagged

No significant circularity in CHEM derivation chain


The paper defines CHEM by combining standard wavelet and shearlet bases for feature localization with conformalized quantile regression for distribution-free hallucination assessment. The theoretical sensitivity analysis relates CHEM to MSE as a characterization rather than deriving the metric itself from fitted parameters or self-referential inputs. Empirical evaluations on CANDELS and DIV2K datasets with multiple architectures provide external validation. No load-bearing steps reduce predictions to inputs by construction, and no self-citation chains or ansatzes are invoked to force uniqueness. The derivation remains self-contained against the stated benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the premise that hallucinations manifest as detectable deviations in wavelet/shearlet coefficients and that conformal quantile regression can be applied directly to these coefficients without additional modeling assumptions. No free parameters are explicitly named in the abstract. No new entities are postulated.

axioms (2)

domain assumption: Hallucinated artifacts are sufficiently localized and distinguishable in wavelet and shearlet representations.
Invoked when the method uses these bases to localize hallucination-prone regions.
standard math: Conformalized quantile regression yields valid marginal coverage for hallucination scores without distributional assumptions beyond exchangeability of the calibration and test data.
Standard property of conformal prediction used to claim distribution-free assessment.
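
For reference, the coverage property behind the second axiom can be written out as below. This follows the standard statement of conformalized quantile regression (Romano et al., 2019); the notation ($\hat q_{\mathrm{lo}}$, $\hat q_{\mathrm{hi}}$, $E_i$, $Q_{1-\alpha}$, $C$) is assumed here and may differ from the paper's.

```latex
% (X_i, Y_i), i = 1, \dots, n: calibration pairs; (X_{n+1}, Y_{n+1}): test pair.
% \hat q_{\mathrm{lo}}, \hat q_{\mathrm{hi}}: fitted lower and upper quantile regressors.
\begin{align*}
E_i &= \max\bigl\{\hat q_{\mathrm{lo}}(X_i) - Y_i,\; Y_i - \hat q_{\mathrm{hi}}(X_i)\bigr\}, \\
Q_{1-\alpha} &= \text{the } \lceil (n+1)(1-\alpha) \rceil\text{-th smallest of } E_1, \dots, E_n, \\
C(x) &= \bigl[\hat q_{\mathrm{lo}}(x) - Q_{1-\alpha},\; \hat q_{\mathrm{hi}}(x) + Q_{1-\alpha}\bigr], \\
\mathbb{P}\bigl\{Y_{n+1} \in C(X_{n+1})\bigr\} &\ge 1 - \alpha .
\end{align*}
```

The guarantee is marginal (averaged over calibration and test draws) and holds when the $n+1$ pairs are exchangeable; it does not by itself imply coverage conditional on a particular image or region.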


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

On Hallucinations in Inverse Problems: Fundamental Limits and Provable Assessment Methods
stat.ML · 2026-05 · unverdicted · novelty 7.0

Hallucinations in inverse problem reconstructions are fundamental to ill-posedness, with necessary and sufficient conditions plus computable bounds depending only on the forward model.


work page 2024
[61]

A unified framework for U-Net design and analysis.Advances in Neural Information Processing Systems, 36:27745–27782, 2023

Christopher Williams, Fabian Falck, George Deligiannidis, Chris C Holmes, Arnaud Doucet, and Saifuddin Syed. A unified framework for U-Net design and analysis.Advances in Neural Information Processing Systems, 36:27745–27782, 2023

work page 2023
[62]

On the optimal approximation of Sobolev and Besov functions using deep ReLU neural networks.Applied and Computational Harmonic Analysis, page 101797, 2025

Yunfei Yang. On the optimal approximation of Sobolev and Besov functions using deep ReLU neural networks.Applied and Computational Harmonic Analysis, page 101797, 2025

work page 2025
[63]

Siren’s song in the ai ocean: A survey on hallucination in large language models.Computational Linguistics, pages 1–46, 2025

Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang, Enbo Zhao, Yu Zhang, Yulong Chen, Longyue Wang, Anh Tuan Luu, Wei Bi, Freda Shi, and Shuming Shi. Siren’s song in the ai ocean: A survey on hallucination in large language models.Computational Linguistics, pages 1–46, 2025

work page 2025
[64]

Road extraction by deep residual U-Net.IEEE Geoscience and Remote Sensing Letters, 15(5):749–753, 2018

Zhengxin Zhang, Qingjie Liu, and Yunhong Wang. Road extraction by deep residual U-Net.IEEE Geoscience and Remote Sensing Letters, 15(5):749–753, 2018

work page 2018
[65]

Unet++: A nested U-Net architecture for medical image segmentation

Zongwei Zhou, Md Mahfuzur Rahman Siddiquee, Nima Tajbakhsh, and Jianming Liang. Unet++: A nested U-Net architecture for medical image segmentation. InInterna- tional Workshop on Deep Learning in Medical Image Anal- ysis, pages 3–11, 2018. Appendix The appendix provides additional explanations regarding the experimental setup and the theoretical results. I...

work page 2018
[66]

During the first three epochs, the learning rate grad- ually increases from zero to the initial value of2×10 −4 (warm-up phase)

The learning rate is adjusted using a two-stage scheduling strategy that combines a linear warm-up and cosine anneal- ing. During the first three epochs, the learning rate grad- ually increases from zero to the initial value of2×10 −4 (warm-up phase). Subsequently, a cosine annealing sched- uler progressively reduces the learning rate to a minimum value o...

work page 2022
[67]

However,Krep- resent the overall number of parameters in U-shaped net- works

If this tensor has more thanKchannels, the convolu- tion layer to produce this tensor has at leastKconvolu- tion kernels which have2Kparameters. However,Krep- resent the overall number of parameters in U-shaped net- works. This presents a contradiction, suggesting that the output dimension of each layer does not exceedmax{t, K}. Consequently, the fully co...

work page

[1] Utsav Akhaury, Jean-Luc Starck, Pascale Jablonka, Frédéric Courbin, and Kevin Michalewicz. Deep learning-based galaxy image deconvolution. Frontiers in Astronomy and Space Sciences, 9, 2022. Publisher: Frontiers.

[2] U. Akhaury, P. Jablonka, J.-L. Starck, and F. Courbin. Ground-based image deconvolution with Swin Transformer UNet. Astronomy & Astrophysics, 688:A6, 2024.

[3] Anastasios N Angelopoulos, Amit Pal Kohli, Stephen Bates, Michael Jordan, Jitendra Malik, Thayer Alshaabi, Srigokul Upadhyayula, and Yaniv Romano. Image-to-image regression with distribution-free uncertainty quantification and applications in imaging. In International Conference on Machine Learning, pages 717–730. PMLR, 2022.

[4] Harrison H Barrett and Kyle J Myers. Foundations of Image Science. John Wiley & Sons, 2013.

[5] Peter L Bartlett, Nick Harvey, Christopher Liaw, and Abbas Mehrabian. Nearly-tight VC-dimension and pseudodimension bounds for piecewise linear neural networks. Journal of Machine Learning Research, 20(63):1–17, 2019.

[6] Omer Belhasin, Yaniv Romano, Daniel Freedman, Ehud Rivlin, and Michael Elad. Principal Uncertainty Quantification with Spatial Correlation for Image Restoration Problems, 2024. arXiv:2305.10124 [cs].

[7] Chinmay Belthangady and Loic A. Royer. Applications, promises, and pitfalls of deep learning for fluorescence image reconstruction. Nature Methods, 16(12):1215–1225, 2019.

[8] Sayantan Bhadra, Varun A. Kelkar, Frank J. Brooks, and Mark A. Anastasio. On hallucinations in tomographic image reconstruction. IEEE Transactions on Medical Imaging, 40(11):3249–3260, 2021.

[9] Hu Cao, Yueyue Wang, Joy Chen, Dongsheng Jiang, Xiaopeng Zhang, Qi Tian, and Manning Wang. Swin-Unet: Unet-like pure transformer for medical image segmentation. In Computer Vision – ECCV 2022 Workshops, pages 205–218, 2023.

[10] Jieneng Chen, Yongyi Lu, Qihang Yu, Xiangde Luo, Ehsan Adeli, Yan Wang, Le Lu, Alan L. Yuille, and Yuyin Zhou. TransUNet: Transformers make strong encoders for medical image segmentation, 2021. arXiv:2102.04306 [cs].

[11] Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, and Hartwig Adam. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), pages 801–818, 2018.

[12] Regev Cohen, Idan Kligvasser, Ehud Rivlin, and Daniel Freedman. Looks too good to be true: An information-theoretic analysis of hallucinations in generative restoration models. Advances in Neural Information Processing Systems, 37:22596–22623, 2024.

[13] Jean Baptiste Cordonnier, Andreas Loukas, and Martin Jaggi. On the relationship between self-attention and convolutional layers. In 8th International Conference on Learning Representations, ICLR 2020, 2020.

[14] Ingrid Daubechies. Ten Lectures on Wavelets. Society for Industrial and Applied Mathematics, 1992.

[15] Philip J Davis. Interpolation and Approximation. Courier Corporation, 1975.

[16] Chi-Mao Fan, Tsung-Jung Liu, and Kuan-Hsien Liu. SUNet: Swin transformer UNet for image denoising. In 2022 IEEE International Symposium on Circuits and Systems (ISCAS), pages 2333–2337. IEEE, 2022.

[17] Nina M. Gottschling, Vegard Antun, Anders C. Hansen, and Ben Adcock. The troublesome kernel: On hallucinations, no free lunches, and the accuracy-stability tradeoff in inverse problems. SIAM Review, 67(1):73–104, 2025. Publisher: Society for Industrial and Applied Mathematics.

[18] Norman A. Grogin, Dale D. Kocevski, S. M. Faber, Henry C. Ferguson, Anton M. Koekemoer, Adam G. Riess, Viviana Acquaviva, David M. Alexander, Omar Almaini, Matthew L. N. Ashby, Marco Barden, Eric F. Bell, Frédéric Bournaud, Thomas M. Brown, Karina I. Caputi, Stefano Casertano, Paolo Cassata, Marco Castellano, Peter Challis, Ranga-Ram Chary, Edmond... 2011

[19] Ingo Gühring, Gitta Kutyniok, and Philipp Petersen. Error bounds for approximations with deep ReLU neural networks in $W^{s,p}$ norms. Analysis and Applications, 18(05):803–859, 2020.

[20] Kanghui Guo, Gitta Kutyniok, and Demetrio Labate. Sparse multidimensional representations using anisotropic dilation and shear operators. Wavelets and Splines, 14:189–201, 2006.

[21] Iryna Gurevych, Michael Kohler, and Gözde Gül Şahin. On the rate of convergence of a classifier based on a transformer encoder. IEEE Transactions on Information Theory, 68(12):8139–8155, 2022.

[22] Yoseob Han and Jong Chul Ye. Framing U-Net via deep convolutional framelets: Application to sparse-view CT. IEEE Transactions on Medical Imaging, 37(6):1418–1429, 2018.

[23] Yunpeng Huang, Jingwei Xu, Junyu Lai, Zixu Jiang, Taolue Chen, Zenan Li, Yuan Yao, Xiaoxing Ma, Lijuan Yang, Hao Chen, Shupeng Li, and Penghao Zhao. Advancing transformer architecture in long-context large language models: A comprehensive survey, 2024. arXiv:2311.12351 [cs].

[24] Fabian Isensee, Paul F. Jaeger, Simon A. A. Kohl, Jens Petersen, and Klaus H. Maier-Hein. nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nature Methods, 18(2):203–211, 2021.

[25] Andrew Jesson, Nicolas Beltran-Velez, Quentin Chu, Sweta Karlekar, Jannik Kossen, Yarin Gal, John P Cunningham, and David Blei. Estimating the hallucination rate of generative AI. Advances in Neural Information Processing Systems, 37:31154–31201, 2024.

[26] Ziwei Ji, Nayeon Lee, Rita Frieske, Tiezheng Yu, Dan Su, Yan Xu, Etsuko Ishii, Ye Jin Bang, Andrea Madotto, and Pascale Fung. Survey of hallucination in natural language generation. ACM Computing Surveys, 55(12):1–38, 2023.

[27] Anton M. Koekemoer, S. M. Faber, Henry C. Ferguson, Norman A. Grogin, Dale D. Kocevski, David C. Koo, Kamson Lai, Jennifer M. Lotz, Ray A. Lucas, Elizabeth J. McGrath, Sara Ogaz, Abhijith Rajan, Adam G. Riess, Steve A. Rodney, Louis Strolger, Stefano Casertano, Marco Castellano, Tomas Dahlen, Mark Dickinson, Timothy Dolch, Adriano Fontana, Mauro Giavali... CANDELS: The Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey - The Hubble Space Telescope Observations, Imaging Data Products and Mosaics. 2011

[28] Stefan Kolek, Robert Windesheim, Hector Andrade-Loarca, Gitta Kutyniok, and Ron Levie. Explaining image classifiers with multiscale directional image representation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 18600–18609, 2023.

[29] Rajeev Ranjan Kumar, S Vishnu Shankar, Ronit Jaiswal, Mrinmoy Ray, Neeraj Budhlakoti, and KN Singh. Advances in deep learning for medical image analysis: A comprehensive investigation. Journal of Statistical Theory and Practice, 19(1):9, 2025.

[30] Gilad Kutiel, Regev Cohen, Michael Elad, Daniel Freedman, and Ehud Rivlin. Conformal prediction masks: Visualizing uncertainty in medical imaging. In Trustworthy Machine Learning for Healthcare, pages 163–176, 2023.

[31] Gitta Kutyniok and Demetrio Labate. Shearlets: Multiscale analysis for multivariate data. Springer Science & Business Media, 2012.

[32] Gitta Kutyniok, Wang-Q Lim, and Rafael Reisenhofer. Shearlab 3D: Faithful digital shearlet transforms based on compactly supported shearlets. ACM Transactions on Mathematical Software (TOMS), 42(1):1–42, 2016.

[33] Hubert Leterme, Jalal Fadili, and J-L Starck. Distribution-free uncertainty quantification for inverse problems: Application to weak lensing mass mapping. Astronomy & Astrophysics, 694:A267, 2025.

[34] Jianfei Li, Han Feng, and Xiaosheng Zhuang. Convolutional neural networks for spherical signal processing via area-regular spherical Haar tight framelets. IEEE Transactions on Neural Networks and Learning Systems, 35(4):4400–4410, 2022.

[35] Jianfei Li, Han Feng, and Ding-Xuan Zhou. Approximation analysis of CNNs from a feature extraction view. Analysis and Applications, 22(03):635–654, 2024.

[36] Hanchao Liu, Wenyuan Xue, Yifei Chen, Dapeng Chen, Xiutian Zhao, Ke Wang, Liping Hou, Rongjun Li, and Wei Peng. A survey on hallucination in large vision-language models. arXiv:2402.00253 [cs].

[38] Pengju Liu, Hongzhi Zhang, Kai Zhang, Liang Lin, and Wangmeng Zuo. Multi-level wavelet-CNN for image restoration. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 773–782, 2018.

[39] Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. Swin transformer: Hierarchical vision transformer using shifted windows. In 2021 IEEE/CVF International Conference on Computer Vision (ICCV), pages 9992–10002, 2021.

[40] Jun Ma, Feifei Li, and Bo Wang. U-Mamba: Enhancing long-range dependency for biomedical image segmentation. arXiv:2401.04722 [eess].

[42] Stéphane Mallat. A Wavelet Tour of Signal Processing. Elsevier, 1999.

[43] Ambuj Mehrish, Navonil Majumder, Rishabh Bharadwaj, Rada Mihalcea, and Soujanya Poria. A review of deep learning techniques for speech processing. Information Fusion, 99:101869, 2023.

[44] Song Mei. U-Nets as belief propagation: Efficient classification, denoising, and diffusion in generative hierarchical models, 2024. arXiv:2404.18444 [cs].

[45] Hrushikesh Narhar Mhaskar and Nahmwoo Hahm. Neural networks for functional approximation and system identification. Neural Computation, 9(1):143–159, 1997.

[46] Matthew J. Muckley, Bruno Riemenschneider, Alireza Radmanesh, Sunwoo Kim, Geunu Jeong, Jingyu Ko, Yohan Jun, Hyungseob Shin, Dosik Hwang, Mahmoud Mostapha, Simon Arberet, Dominik Nickel, Zaccharie Ramzi, Philippe Ciuciu, Jean-Luc Starck, Jonas Teuwen, Dimitrios Karkalousos, Chaoping Zhang, Anuroop Sriram, Zhengnan Huang, Nafissa Yakubova, Yvonne W. Lui, a... 2020

[47] Ozan Oktay, Jo Schlemper, Loic Le Folgoc, Matthew Lee, Mattias Heinrich, Kazunari Misawa, Kensaku Mori, Steven McDonagh, Nils Y. Hammerla, Bernhard Kainz, Ben Glocker, and Daniel Rueckert. Attention U-Net: Learning where to look for the pancreas, 2018. arXiv:1804.03999 [cs].

[48] Xuebin Qin, Zichen Zhang, Chenyang Huang, Masood Dehghan, Osmar R Zaiane, and Martin Jagersand. U2-Net: Going deeper with nested U-structure for salient object detection. Pattern Recognition, 106:107404, 2020.

[49] Zaccharie Ramzi, Kevin Michalewicz, Jean-Luc Starck, Thomas Moreau, and Philippe Ciuciu. Wavelets in the deep learning era. Journal of Mathematical Imaging and Vision, 65(1):240–251, 2023.

[50] Yaniv Romano, Evan Patterson, and Emmanuel Candes. Conformalized quantile regression. Advances in Neural Information Processing Systems, 32, 2019.

[51] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 234–241. Springer, 2015.

[52] Pranab Sahoo, Prabhash Meharia, Akash Ghosh, Sriparna Saha, Vinija Jain, and Aman Chadha. A comprehensive survey of hallucination in large language, image, video and audio foundation models, 2024. arXiv:2405.09589 [cs].

[53] Swami Sankaranarayanan, Anastasios N. Angelopoulos, Stephen Bates, Yaniv Romano, and Phillip Isola. Semantic uncertainty intervals for disentangled latent spaces, 2022. arXiv:2207.10074 [cs].

[54] Martin H Schultz. $L^\infty$-multivariate approximation theory. SIAM Journal on Numerical Analysis, 6(2):161–183, 1969.

[55] Zuowei Shen, Haizhao Yang, and Shijun Zhang. Optimal approximation rate of ReLU networks in terms of width and depth. Journal de Mathématiques Pures et Appliquées, 157:101–135, 2022.

[56] J.-L. Starck and F. Murtagh. Astronomical Image and Data Analysis. Springer, 2006.

[57] Florent Sureau, Alexis Lechat, and J-L Starck. Deep learning for a space-variant deconvolution in galaxy surveys. Astronomy & Astrophysics, 641:A67, 2020.

[58] Xue-Cheng TAI, Hao LIU, Raymond H CHAN, and Lingfeng LI. A mathematical explanation of UNet. Mathematical Foundations of Computing, 8(5):874–889, 2025.

[59] Jacopo Teneggi, J Webster Stayman, and Jeremias Sulam. Conformal risk control for semantic uncertainty quantification in computed tomography. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 45–55, 2025.

[60] Matthew Tivnan, Siyeop Yoon, Zhennong Chen, Xiang Li, Dufan Wu, and Quanzheng Li. Hallucination index: An image quality metric for generative reconstruction models. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 449–458. Springer, 2024.

[61] Christopher Williams, Fabian Falck, George Deligiannidis, Chris C Holmes, Arnaud Doucet, and Saifuddin Syed. A unified framework for U-Net design and analysis. Advances in Neural Information Processing Systems, 36:27745–27782, 2023.

[62] Yunfei Yang. On the optimal approximation of Sobolev and Besov functions using deep ReLU neural networks. Applied and Computational Harmonic Analysis, page 101797, 2025.

[63] Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang, Enbo Zhao, Yu Zhang, Yulong Chen, Longyue Wang, Anh Tuan Luu, Wei Bi, Freda Shi, and Shuming Shi. Siren's song in the AI ocean: A survey on hallucination in large language models. Computational Linguistics, pages 1–46, 2025.

[64] Zhengxin Zhang, Qingjie Liu, and Yunhong Wang. Road extraction by deep residual U-Net. IEEE Geoscience and Remote Sensing Letters, 15(5):749–753, 2018.

[65] Zongwei Zhou, Md Mahfuzur Rahman Siddiquee, Nima Tajbakhsh, and Jianming Liang. Unet++: A nested U-Net architecture for medical image segmentation. In International Workshop on Deep Learning in Medical Image Analysis, pages 3–11, 2018.

Appendix

The appendix provides additional explanations regarding the experimental setup and the theoretical results. I...

The learning rate is adjusted using a two-stage scheduling strategy that combines a linear warm-up and cosine annealing. During the first three epochs, the learning rate gradually increases from zero to the initial value of $2 \times 10^{-4}$ (warm-up phase). Subsequently, a cosine annealing scheduler progressively reduces the learning rate to a minimum value o...
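The two-stage schedule described above (linear warm-up over three epochs to $2 \times 10^{-4}$, then cosine annealing) can be sketched as follows. This is a minimal illustration, not the authors' training code: the function name `learning_rate`, the per-epoch granularity, and the parameters `total_epochs` and `min_lr` are assumptions, since the source text is truncated before it states the minimum value or the training length.

```python
import math


def learning_rate(epoch, total_epochs, base_lr=2e-4, warmup_epochs=3, min_lr=1e-6):
    """Learning rate at the start of `epoch` (0-indexed).

    base_lr and warmup_epochs follow the text; total_epochs and min_lr are
    hypothetical placeholders, because the source does not give them here.
    """
    if epoch < warmup_epochs:
        # Warm-up phase: linear ramp from zero towards base_lr.
        return base_lr * epoch / warmup_epochs
    # Cosine annealing phase: progress runs from 0 (end of warm-up) to 1.
    span = max(1, total_epochs - warmup_epochs)
    progress = min((epoch - warmup_epochs) / span, 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Example: tabulate the schedule for a hypothetical 50-epoch run.
schedule = [learning_rate(e, total_epochs=50) for e in range(51)]
```

The rate reaches `base_lr` exactly when warm-up ends and decays monotonically to `min_lr` at `total_epochs`. Updating per optimizer step rather than per epoch would only change the unit of `epoch`.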

If this tensor has more than $K$ channels, the convolution layer that produces it has at least $K$ convolution kernels, which have $2K$ parameters. However, $K$ represents the overall number of parameters in U-shaped networks. This is a contradiction, so the output dimension of each layer does not exceed $\max\{t, K\}$. Consequently, the fully co...