pith. sign in

arxiv: 2511.18152 · v3 · submitted 2025-11-22 · 💻 cs.CV · cs.AI

UnfoldLDM: Degradation-Aware Unfolding with Iterative Latent Diffusion Priors for Blind Image Restoration

Pith reviewed 2026-05-17 05:45 UTC · model grok-4.3

classification 💻 cs.CV cs.AI
keywords blind image restorationdeep unfolding networkslatent diffusion modelsdegradation estimationhigh-frequency recoveryover-smoothing correctionplug-and-play framework
0
0 comments X

The pith

UnfoldLDM integrates deep unfolding networks with latent diffusion priors to restore images when the degradation type is unknown in advance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to fix two core limitations in deep unfolding networks for blind image restoration: their reliance on a known degradation model and their bias toward over-smoothed outputs that lose fine textures. It replaces the standard gradient descent step with a multi-granularity degradation-aware module that estimates both the overall degradation and its component forms. The proximal step then uses a degradation-resistant latent diffusion model to supply invariant priors, followed by a transformer that explicitly restores high-frequency details. A sympathetic reader would care because most real-world image degradations arrive without labels, so a method that works without assuming a specific model could make restoration pipelines far more practical. If the approach holds, it also turns existing unfolding methods into a plug-in framework without redesign.

Core claim

UnfoldLDM integrates deep unfolding networks with latent diffusion models for blind image restoration. In each stage the multi-granularity degradation-aware module acts as the gradient descent step by treating the task as unknown degradation estimation and recovering both the holistic degradation matrix and its decomposed forms. The degradation-resistant LDM then extracts compact degradation-invariant priors from that output. Guided by these priors, the over-smoothing correction transformer recovers high-frequency components and enhances texture details, producing results that are both degradation-free and visually rich.

What carries the argument

The multi-granularity degradation-aware module that estimates unknown holistic and decomposed degradations, paired with the degradation-resistant latent diffusion model that supplies invariant priors and the over-smoothing correction transformer that restores high-frequency content.

If this is right

  • Achieves leading performance across multiple blind image restoration benchmarks.
  • Improves accuracy on downstream tasks that use the restored images.
  • Serves as a plug-and-play addition to existing deep unfolding networks without retraining their core structure.
  • Removes both degradation-specific dependency and over-smoothing bias in a single iterative framework.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same degradation-estimation module could be swapped into unfolding networks for other blind inverse problems such as joint deblurring and denoising.
  • Fewer unfolding stages might suffice once the diffusion prior is strong enough, which could be tested by ablating stage count on fixed compute budgets.
  • Real-world video restoration pipelines might adopt the approach if the per-frame cost remains acceptable.

Load-bearing premise

The multi-granularity degradation-aware module can reliably estimate both overall and component forms of unknown degradations in a blind setting without any predefined degradation model.

What would settle it

Run the method on a held-out test set containing entirely novel degradation combinations never encountered during training and check whether the estimated degradation matrix deviates sharply from ground-truth synthetic degradations while restoration metrics fall below plain diffusion baselines.

Figures

Figures reproduced from arXiv: 2511.18152 by Bowen Yang, Chengyu Fang, Chunming He, Fengyang Xiao, Rihan Zhang, Sina Farsiu, Yulun Zhang, Yunlong Lin, Zheng Chen.

Figure 1
Figure 1. Figure 1: Comparison between existing proximal gradient-based DUN-based methods ( [PITH_FULL_IMAGE:figures/full_fig_p001_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Framework of our proposed UnfoldLDM. decoupled matrices, W ∈ R c×h×h and M ∈ R c×w×w: D = MT ⊗ W, (3) where ⊗ denotes the Kronecker product. This factorization is more efficient than directly learn￾ing D, whose dimension grows quadratically with image resolution. Also, the decomposed ones enhance structure awareness: W captures spatial transformations, while M models spectral or directional distortions. In… view at source ↗
Figure 3
Figure 3. Figure 3: Details of MGDA, DR-LDM, and OCFormer at the [PITH_FULL_IMAGE:figures/full_fig_p004_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Visualization of image denoising. Our method restores a sharper letter “s” in “His” (left) and a more accurate “i” in “ion” (right). [PITH_FULL_IMAGE:figures/full_fig_p005_4.png] view at source ↗
Figure 6
Figure 6. Figure 6: Visualization on UIE, BIE, LLIE, and Deraining tasks. Only SOTA methods are selected for space limitation. [PITH_FULL_IMAGE:figures/full_fig_p007_6.png] view at source ↗
read the original abstract

Deep unfolding networks (DUNs) combine the interpretability of model-based methods with the learning ability of deep networks, yet remain limited for blind image restoration (BIR). Existing DUNs suffer from: (1) \textbf{Degradation-specific dependency}, as their optimization frameworks are tied to a known degradation model, making them unsuitable for BIR tasks; and (2) \textbf{Over-smoothing bias}, resulting from the direct feeding of gradient descent outputs, dominated by low-frequency content, into the proximal term, suppressing fine textures. To overcome these issues, we propose UnfoldLDM to integrate DUNs with latent diffusion model (LDM) for BIR. In each stage, UnfoldLDM employs a multi-granularity degradation-aware (MGDA) module as the gradient descent step. MGDA models BIR as an unknown degradation estimation problem and estimates both the holistic degradation matrix and its decomposed forms, enabling robust degradation removal. For the proximal step, we design a degradation-resistant LDM (DR-LDM) to extract compact degradation-invariant priors from the MGDA output. Guided by this prior, an over-smoothing correction transformer (OCFormer) explicitly recovers high-frequency components and enhances texture details. This unique combination ensures the final result is degradation-free and visually rich. Experiments show that our UnfoldLDM achieves a leading place on various BIR tasks and benefits downstream tasks. Moreover, our design is compatible with existing DUN-based methods, serving as a plug-and-play framework. Code will be released.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The paper proposes UnfoldLDM, a deep unfolding network (DUN) integrated with latent diffusion models (LDMs) for blind image restoration (BIR). It identifies two limitations in prior DUNs—degradation-specific dependency and over-smoothing bias—and addresses them via a multi-granularity degradation-aware (MGDA) module that estimates both holistic and decomposed unknown degradations as the gradient-descent step, a degradation-resistant LDM (DR-LDM) that extracts compact degradation-invariant priors, and an over-smoothing correction transformer (OCFormer) that recovers high-frequency textures in the proximal step. The authors claim leading performance across various BIR tasks, benefits to downstream applications, and plug-and-play compatibility with existing DUN-based methods.

Significance. If the empirical claims hold, the work offers a meaningful step toward making unfolding networks viable for fully blind restoration without hand-crafted degradation models, while using diffusion priors to counteract the low-frequency bias typical of proximal operators. The explicit compatibility design is a practical strength that could allow incremental adoption.

major comments (2)
  1. [§3.2] §3.2 (MGDA module description): the central claim that MGDA can accurately recover both the holistic degradation matrix and its decomposed components from data alone in arbitrary blind settings is load-bearing for the entire unfolding iteration, yet the manuscript provides no quantitative ablation of estimation error (e.g., matrix reconstruction error or downstream PSNR sensitivity) on out-of-distribution degradations; without this, systematic bias in the MGDA output would propagate directly into DR-LDM and OCFormer, undermining the claimed advantages over prior DUNs.
  2. [Table 2] Table 2 (main BIR results): the reported leading performance is presented without error bars, multiple random seeds, or statistical significance tests against the strongest baselines; given that BIR metrics are sensitive to degradation distribution shifts, this weakens the reliability of the cross-method ranking.
minor comments (3)
  1. [Abstract] The abstract states that the design is 'compatible with existing DUN-based methods' but does not include a concrete plug-and-play experiment (e.g., replacing only the proximal operator in a baseline DUN); adding this would strengthen the compatibility claim.
  2. [§3] Notation for the decomposed degradation forms inside MGDA could be introduced with an explicit equation early in §3 rather than relying on prose description.
  3. [Figure 3] Figure 3 (qualitative results) would benefit from zoomed insets on high-frequency regions to better illustrate the claimed texture recovery by OCFormer.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below and outline the revisions we will make to improve the manuscript.

read point-by-point responses
  1. Referee: [§3.2] §3.2 (MGDA module description): the central claim that MGDA can accurately recover both the holistic degradation matrix and its decomposed components from data alone in arbitrary blind settings is load-bearing for the entire unfolding iteration, yet the manuscript provides no quantitative ablation of estimation error (e.g., matrix reconstruction error or downstream PSNR sensitivity) on out-of-distribution degradations; without this, systematic bias in the MGDA output would propagate directly into DR-LDM and OCFormer, undermining the claimed advantages over prior DUNs.

    Authors: We agree that a dedicated quantitative analysis of MGDA's degradation estimation accuracy, especially under out-of-distribution conditions, would strengthen the claims. In the revised manuscript we will add an ablation study (new Table or Appendix) that reports matrix reconstruction error (e.g., Frobenius norm between estimated and ground-truth degradation matrices) and measures the sensitivity of final PSNR to controlled perturbations in MGDA outputs, evaluated on both in-distribution and deliberately shifted degradation distributions. This will directly address potential propagation of estimation bias. revision: yes

  2. Referee: [Table 2] Table 2 (main BIR results): the reported leading performance is presented without error bars, multiple random seeds, or statistical significance tests against the strongest baselines; given that BIR metrics are sensitive to degradation distribution shifts, this weakens the reliability of the cross-method ranking.

    Authors: We concur that variability reporting and statistical testing are important for robust claims in blind restoration. We have rerun all experiments in Table 2 using five independent random seeds and will update the table to show mean ± standard deviation. We will also add paired t-test p-values comparing UnfoldLDM against the strongest baselines to establish statistical significance of the reported gains. revision: yes

Circularity Check

0 steps flagged

No significant circularity in the derivation chain

full rationale

The paper introduces UnfoldLDM as a novel integration of deep unfolding networks with latent diffusion models for blind image restoration. The MGDA module is presented as a new component that models degradation estimation directly, the DR-LDM extracts priors from its output, and OCFormer corrects textures; these are independent architectural choices rather than self-definitions or fitted inputs renamed as predictions. No load-bearing self-citations, uniqueness theorems from prior author work, or ansatzes smuggled via citation are invoked to force the central result. The method is explicitly described as compatible with existing DUNs and validated through experiments on multiple tasks, keeping the derivation self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 3 invented entities

The central claim rests on the effectiveness of three newly introduced algorithmic components whose performance is validated only within the proposed framework itself.

axioms (2)
  • domain assumption Existing DUNs are limited by degradation-specific dependency and over-smoothing bias for blind tasks
    Stated as the motivation for the new design in the abstract.
  • domain assumption Latent diffusion models can extract degradation-invariant priors from partially restored images
    Core assumption underlying the DR-LDM design.
invented entities (3)
  • MGDA module no independent evidence
    purpose: Models blind restoration as unknown degradation estimation and computes holistic plus decomposed degradation forms
    New module introduced for the gradient descent step
  • DR-LDM no independent evidence
    purpose: Extracts compact degradation-invariant priors from MGDA output
    New component for the proximal step
  • OCFormer no independent evidence
    purpose: Recovers high-frequency components and enhances texture details guided by the prior
    New transformer for over-smoothing correction

pith-pipeline@v0.9.0 · 5607 in / 1368 out tokens · 46394 ms · 2026-05-17T05:45:09.480862+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear
    ?
    unclear

    Relation between the paper passage and the cited Recognition theorem.

    MGDA models BIR as an unknown degradation estimation problem and estimates both the holistic degradation matrix and its decomposed forms... DR-LDM to extract compact degradation-invariant priors... OCFormer explicitly recovers high-frequency components

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. RIDE: Retinex-Informed Decoupling for Exposing Concealed Objects

    cs.CV 2026-05 unverdicted novelty 6.0

    RIDE applies Retinex-based homogeneous decomposition to improve foreground-background discriminability in concealed object segmentation tasks across multiple domains.

  2. Embedding-perturbed Exploration Preference Optimization for Flow Models

    cs.CV 2026-05 unverdicted novelty 5.0

    E²PO uses embedding-level perturbations to maintain intra-group variance and discriminative signal in RL-based preference optimization for generative flow models.

  3. ZeroIDIR: Zero-Reference Illumination Degradation Image Restoration with Perturbed Consistency Diffusion Models

    cs.CV 2026-05 unverdicted novelty 5.0

    ZeroIDIR restores illumination-degraded images via adaptive gamma correction followed by perturbed consistency diffusion, trained solely on degraded images without references.

Reference graph

Works this paper leans on

78 extracted references · 78 canonical work pages · cited by 3 Pith papers · 2 internal anchors

  1. [1]

    A high-quality denoising dataset for smartphone cameras

    Abdelrahman Abdelhamed, Stephen Lin, and Michael S Brown. A high-quality denoising dataset for smartphone cameras. In CVPR, pages 1692–1700, 2018. 6

  2. [2]

    An algorithm for total variation mini- mization and applications

    Antonin Chambolle. An algorithm for total variation mini- mization and applications. Journal of Mathematical imaging and vision, 20(1):89–97, 2004. 2

  3. [3]

    A polarization-aided trans- former for image deblurring via motion vector decomposi- tion

    Duosheng Chen, Shihao Zhou, Jinshan Pan, Jinglei Shi, Lishen Qu, and Jufeng Yang. A polarization-aided trans- former for image deblurring via motion vector decomposi- tion. In CVPR, pages 28061–28070, 2025. 6

  4. [4]

    Binarized diffusion model for image super-resolution

    Zheng Chen, Haotong Qin, Yong Guo, Xiongfei Su, Xin Yuan, Linghe Kong, and Yulun Zhang. Binarized diffusion model for image super-resolution. In NeurIPS, 2024. 1

  5. [5]

    Pugan: Phys- ical model-guided underwater image enhancement using gan with dual-discriminators

    Runmin Cong, Wenyu Yang, Wei Zhang, Chongyi Li, Chun- Le Guo, Qingming Huang, and Sam Kwong. Pugan: Phys- ical model-guided underwater image enhancement using gan with dual-discriminators. IEEE Transactions on Image Processing, 2023. 6

  6. [6]

    Pcgan: A noise robust conditional generative ad- versarial network for one shot learning

    Lizhen Deng, Chunming He, Guoxia Xu, Hu Zhu, and Hao Wang. Pcgan: A noise robust conditional generative ad- versarial network for one shot learning. IEEE Transactions on Intelligent Transportation Systems, 23(12):25249–25258,

  7. [7]

    Deepsn-net: Deep semi-smooth newton driven net- work for blind image restoration

    Xin Deng, Chenxiao Zhang, Lai Jiang, Jingyuan Xia, and Mai Xu. Deepsn-net: Deep semi-smooth newton driven net- work for blind image restoration. IEEE Trans. Pattern Anal. Mach. Intell., 2025. 1, 6, 8

  8. [8]

    Real-world image dehazing with coherence-based label gen- erator and cooperative unfolding network

    Chengyu Fang, Chunming He, Fengyang Xiao, Yulun Zhang, Longxiang Tang, Yuelin Zhang, Kai Li, and Xiu Li. Real-world image dehazing with coherence-based label gen- erator and cooperative unfolding network. NeurIPS, 2024. 5

  9. [9]

    Self-supervised non-uniform kernel estimation with flow-based motion prior for blind im- age deblurring

    Zhenxuan Fang, Fangfang Wu, Weisheng Dong, Xin Li, Jin- jian Wu, and Guangming Shi. Self-supervised non-uniform kernel estimation with flow-based motion prior for blind im- age deblurring. In CVPR, pages 18105–18114, 2023. 6

  10. [10]

    Removing rain from single images via a deep detail network

    Xueyang Fu, Jiabin Huang, Delu Zeng, Yue Huang, Xinghao Ding, and John Paisley. Removing rain from single images via a deep detail network. InCVPR, pages 3855–3863, 2017. 6

  11. [11]

    Rave: Residual vector embedding for clip-guided backlit im- age enhancement

    Tatiana Gaintseva, Martin Benning, and Gregory Slabaugh. Rave: Residual vector embedding for clip-guided backlit im- age enhancement. In ECCV, pages 412–428. Springer, 2024. 6

  12. [12]

    Efficient multi-scale network with learnable discrete wavelet transform for blind motion deblurring

    Xin Gao, Tianheng Qiu, Xinyu Zhang, Hanlin Bai, Kang Liu, Xuan Huang, Hu Wei, Guoying Zhang, and Huaping Liu. Efficient multi-scale network with learnable discrete wavelet transform for blind motion deblurring. In CVPR, pages 2733–2742, 2024. 6

  13. [13]

    Less confi- dence, less forgetting: Learning with a humbler teacher in exemplar-free class-incremental learning

    Zijian Gao, Kele Xu, Huiping Zhuang, Li Liu, Xinjun Mao, Bo Ding, Dawei Feng, and Huaimin Wang. Less confi- dence, less forgetting: Learning with a humbler teacher in exemplar-free class-incremental learning. Neural Networks, 179:106513, 2024. 2

  14. [14]

    Stabilizing zero-shot prediction: A novel antidote to forgetting in continual vision-language tasks

    Zijian Gao, Xingxing Zhang, Kele Xu, Xinjun Mao, and Huaimin Wang. Stabilizing zero-shot prediction: A novel antidote to forgetting in continual vision-language tasks. In Advances in Neural Information Processing Systems, pages 128462–128488. Curran Associates, Inc., 2024. 2

  15. [15]

    Maintaining fairness in logit-based knowledge distillation for class-incremental learning

    Zijian Gao, Shanhao Han, Xingxing Zhang, Kele Xu, Du- lan Zhou, Xinjun Mao, Yong Dou, and Huaimin Wang. Maintaining fairness in logit-based knowledge distillation for class-incremental learning. Proceedings of the AAAI Conference on Artificial Intelligence, 39(16):16763–16771,

  16. [16]

    Knowledge memorization and rumination for pre-trained model-based class-incremental learning

    Zijian Gao, Wangwang Jia, Xingxing Zhang, Dulan Zhou, Kele Xu, Feng Dawei, Yong Dou, Xinjun Mao, and Huaimin Wang. Knowledge memorization and rumination for pre-trained model-based class-incremental learning. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 20523–20533, 2025. 2

  17. [17]

    Underwater ranker: Learn which is better and how to be better

    Chunle Guo, Ruiqi Wu, Xin Jin, Linghao Han, Weidong Zhang, Zhi Chai, and Chongyi Li. Underwater ranker: Learn which is better and how to be better. In AAAI, pages 702– 709, 2023. 6

  18. [18]

    Mambair: A simple baseline for im- age restoration with state-space model

    Hang Guo, Jinmin Li, Tao Dai, Zhihao Ouyang, Xudong Ren, and Shu-Tao Xia. Mambair: A simple baseline for im- age restoration with state-space model. In ECCV, 2024. 3, 6, 8

  19. [19]

    Reti-diff: Illumination degradation image restoration with retinex-based latent diffusion model

    Chunming He, Chengyu Fang, Yulun Zhang, Kai Li, Longx- iang Tang, Chenyu You, Fengyang Xiao, Zhenhua Guo, and Xiu Li. Reti-diff: Illumination degradation image restora- tion with retinex-based latent diffusion model.arXiv preprint arXiv:2311.11638, 2023. 1, 4, 6, 8

  20. [20]

    Hqg- net: Unpaired medical image enhancement with high-quality guidance

    Chunming He, Kai Li, Guoxia Xu, Jiangpeng Yan, Longx- iang Tang, Yulun Zhang, Yaowei Wang, and Xiu Li. Hqg- net: Unpaired medical image enhancement with high-quality guidance. IEEE Transactions on Neural Networks and Learning Systems, 2023. 1, 2, 4

  21. [21]

    Degradation-resistant unfolding network for heterogeneous image fusion

    Chunming He, Kai Li, Guoxia Xu, Yulun Zhang, Runze Hu, Zhenhua Guo, and Xiu Li. Degradation-resistant unfolding network for heterogeneous image fusion. In ICCV, pages 12611–12621, 2023. 2

  22. [22]

    Camouflaged object detection with feature decomposition and edge reconstruc- tion

    Chunming He, Kai Li, Yachao Zhang, Longxiang Tang, Yu- lun Zhang, Zhenhua Guo, and Xiu Li. Camouflaged object detection with feature decomposition and edge reconstruc- tion. In CVPR, pages 22046–22055, 2023. 1

  23. [23]

    Unfoldir: Rethinking deep unfolding network in illumination degradation image restoration.arXiv preprint arXiv:2505.06683, 2025

    Chunming He, Rihan Zhang, Fengyang Xiao, Chengyu Fang, Longxiang Tang, Yulun Zhang, and Sina Farsiu. Unfoldir: Rethinking deep unfolding network in illu- mination degradation image restoration. arXiv preprint arXiv:2505.06683, 2025. 2

  24. [24]

    Run: Reversible unfolding network for concealed object segmentation

    Chunming He, Rihan Zhang, Fengyang Xiao, Chenyu Fang, Longxiang Tang, Yulun Zhang, Linghe Kong, Deng-Ping Fan, Kai Li, and Sina Farsiu. Run: Reversible unfolding network for concealed object segmentation. arXiv preprint arXiv:2501.18783, 2025. 2, 8

  25. [25]

    Global structure-aware diffusion pro- cess for low-light image enhancement

    HOU Jinhui, Zhiyu Zhu, Junhui Hou, LIU Hui, Huanqiang Zeng, and Hui Yuan. Global structure-aware diffusion pro- cess for low-light image enhancement. In NeurIPS, 2023. 6

  26. [26]

    Ivf-net: An in- frared and visible data fusion deep network for traffic ob- ject enhancement in intelligent transportation systems

    Mingye Ju, Chunming He, and Juping Liu. Ivf-net: An in- frared and visible data fusion deep network for traffic ob- ject enhancement in intelligent transportation systems. IEEE Trans. Intell. Transp. Syst., 2022. 8

  27. [27]

    All-inclusive image enhance- ment for degraded images exhibiting low-frequency corrup- tion

    Mingye Ju, Chunming He, Can Ding, Wenqi Ren, Lin Zhang, and Kai-Kuang Ma. All-inclusive image enhance- ment for degraded images exhibiting low-frequency corrup- tion. Trans. Circuits Syst. Video Technol., 2024. 1

  28. [28]

    Auto-Encoding Variational Bayes

    Diederik P Kingma and Max Welling. Auto-encoding varia- tional bayes. arXiv preprint arXiv:1312.6114, 2013. 5

  29. [29]

    Efficient frequency domain-based trans- formers for high-quality image deblurring

    Lingshun Kong, Jiangxin Dong, Jianjun Ge, Mingqiang Li, and Jinshan Pan. Efficient frequency domain-based trans- formers for high-quality image deblurring. In CVPR, pages 5886–5895, 2023. 6

  30. [30]

    An underwater image enhancement benchmark dataset and beyond

    Chongyi Li, Chunle Guo, Wenqi Ren, Runmin Cong, Jun- hui Hou, Sam Kwong, and Dacheng Tao. An underwater image enhancement benchmark dataset and beyond. IEEE Transactions on Image Processing, 29:4376–4389, 2019. 6

  31. [31]

    Fcdfusion: A fast, low color de- viation method for fusing visible and infrared image pairs

    Hesong Li and Ying Fu. Fcdfusion: A fast, low color de- viation method for fusing visible and infrared image pairs. Computational Visual Media, 11(1):195–211, 2025. 1

  32. [32]

    Noise calibration and spatial-frequency interactive network for stem image enhancement

    Hesong Li, Ziqi Wu, Ruiwen Shao, Tao Zhang, and Ying Fu. Noise calibration and spatial-frequency interactive network for stem image enhancement. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Conference, pages 21287–21296, 2025. 1

  33. [33]

    Iterative prompt learning for unsupervised backlit image enhancement

    Zhexin Liang, Chongyi Li, Shangchen Zhou, Ruicheng Feng, and Chen Change Loy. Iterative prompt learning for unsupervised backlit image enhancement. In ICCV, pages 8094–8103, 2023. 6

  34. [34]

    Motion-adaptive separable collaborative filters for blind motion deblurring

    Chengxu Liu, Xuan Wang, Xiangyu Xu, Ruhao Tian, Shuai Li, Xueming Qian, and Ming-Hsuan Yang. Motion-adaptive separable collaborative filters for blind motion deblurring. In CVPR, pages 25595–25605, 2024. 6

  35. [35]

    Dinat- ir: Exploring dilated neighborhood attention for high-quality image restoration

    Hanzhou Liu, Binghan Li, Chengkai Liu, and Mi Lu. Dinat- ir: Exploring dilated neighborhood attention for high-quality image restoration. arXiv preprint arXiv:2507.17892, 2025. 6

  36. [36]

    Stochastic gradient descent with warm restarts

    I Loshchilov. Stochastic gradient descent with warm restarts. In ICLR, pages 1–16, 2016. 6

  37. [37]

    Tf-icon: Diffusion-based training-free cross-domain image compo- sition

    Shilin Lu, Yanzhu Liu, and Adams Wai-Kin Kong. Tf-icon: Diffusion-based training-free cross-domain image compo- sition. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 2294–2305, 2023. 1

  38. [38]

    Mace: Mass concept erasure in diffu- sion models

    Shilin Lu, Zilan Wang, Leyang Li, Yanzhu Liu, and Adams Wai-Kin Kong. Mace: Mass concept erasure in diffu- sion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6430– 6440, 2024. 1

  39. [39]

    Robust watermarking using generative priors against image editing: From benchmarking to advances.arXiv preprint arXiv:2410.18775, 2024

    Shilin Lu, Zihan Zhou, Jiayou Lu, Yuanzhi Zhu, and Adams Wai-Kin Kong. Robust watermarking using generative pri- ors against image editing: From benchmarking to advances. arXiv preprint arXiv:2410.18775, 2024. 1

  40. [40]

    Image restoration with mean-reverting stochastic differential equations.arXiv preprint arXiv:2301.11699, 2023

    Ziwei Luo, Fredrik K Gustafsson, Zheng Zhao, Jens Sj¨olund, and Thomas B Sch ¨on. Image restoration with mean- reverting stochastic differential equations. arXiv preprint arXiv:2301.11699, 2023. 6

  41. [41]

    Backlitnet: A dataset and network for backlit image enhancement

    Xiaoqian Lv, Shengping Zhang, Qinglin Liu, Haozhe Xie, Bineng Zhong, and Huiyu Zhou. Backlitnet: A dataset and network for backlit image enhancement. Computer Vision and Image Understanding, 218:103403, 2022. 6

  42. [42]

    Toward fast, flexible, and robust low-light image enhancement

    Long Ma, Tengyu Ma, Risheng Liu, Xin Fan, and Zhongx- uan Luo. Toward fast, flexible, and robust low-light image enhancement. In CVPR, pages 5637–5646, 2022. 8

  43. [43]

    Ttt-mim: test-time training with masked image modeling for denoising distribution shifts

    Youssef Mansour, Xuyang Zhong, Serdar Caglar, and Rein- hard Heckel. Ttt-mim: test-time training with masked image modeling for denoising distribution shifts. In ECCV, pages 341–357. Springer, 2024. 6

  44. [44]

    Deep generalized unfolding networks for image restoration

    Chong Mou, Qian Wang, and Jian Zhang. Deep generalized unfolding networks for image restoration. In CVPR, pages 17399–17410, 2022. 1, 2, 3, 6, 8

  45. [45]

    Deep multi-scale convolutional neural network for dynamic scene deblurring

    Seungjun Nah, Tae Hyun Kim, and Kyoung Mu Lee. Deep multi-scale convolutional neural network for dynamic scene deblurring. In CVPR, pages 3883–3891, 2017. 6

  46. [46]

    Random sub-samples generation for self- supervised real image denoising

    Yizhong Pan, Xiao Liu, Xiangyu Liao, Yuanzhouhan Cao, and Chao Ren. Random sub-samples generation for self- supervised real image denoising. In ICCV, pages 12150– 12159, 2023. 6

  47. [47]

    Deep compressed multi- channel adaptive optics scanning light ophthalmoscope

    Jongwan Park, Kristen Hagan, Theodore B DuBose, Ramiro S Maldonado, Ryan P McNabb, Alfredo Dubra, Joseph A Izatt, and Sina Farsiu. Deep compressed multi- channel adaptive optics scanning light ophthalmoscope. Sci. Adv., 11(19):eadr5912, 2025. 1

  48. [48]

    U-shape transformer for underwater image enhancement

    Lintao Peng, Chunli Zhu, and Liheng Bian. U-shape transformer for underwater image enhancement. IEEE Transactions on Image Processing, 2023. 6

  49. [49]

    Benchmarking denoising algo- rithms with real photographs

    Tobias Plotz and Stefan Roth. Benchmarking denoising algo- rithms with real photographs. In CVPR, pages 1586–1595,

  50. [50]

    Adaptive dynamic filtering network for image denoising

    Hao Shen, Zhong-Qiu Zhao, and Wandi Zhang. Adaptive dynamic filtering network for image denoising. In AAAI, pages 2227–2235, 2023. 6

  51. [51]

    Human-aware mo- tion deblurring

    Ziyi Shen, Wenguan Wang, Xiankai Lu, Jianbing Shen, Haibin Ling, Tingfa Xu, and Ling Shao. Human-aware mo- tion deblurring. In ICCV, pages 5572–5581, 2019. 6

  52. [52]

    Vmam- bair: Visual state space model for image restoration

    Yuan Shi, Bin Xia, Xiaoyu Jin, Xing Wang, Tianyu Zhao, Xin Xia, Xuefeng Xiao, and Wenming Yang. Vmam- bair: Visual state space model for image restoration. IEEE Transactions on Circuits and Systems for Video Technology,

  53. [53]

    Deep admm-net for compressive sensing mri

    Jian Sun, Huibin Li, Zongben Xu, et al. Deep admm-net for compressive sensing mri. NeurIPS, 29, 2016. 1

  54. [54]

    Deep Retinex Decomposition for Low-Light Enhancement

    Chen Wei, Wenjing Wang, Wenhan Yang, and Jiaying Liu. Deep retinex decomposition for low-light enhancement. arXiv preprint arXiv:1808.04560, 2018. 6

  55. [55]

    Uretinex-net: Retinex-based deep unfolding network for low-light image enhancement

    Wenhui Wu, Jian Weng, Pingping Zhang, Xu Wang, Wen- han Yang, and Jianmin Jiang. Uretinex-net: Retinex-based deep unfolding network for low-light image enhancement. In CVPR, pages 5901–5910, 2022. 7, 8

  56. [56]

    Diffir: Efficient diffusion model for image restoration

    Bin Xia, Yulun Zhang, Shiyin Wang, Yitong Wang, Xing- long Wu, Yapeng Tian, Wenming Yang, and Luc Van Gool. Diffir: Efficient diffusion model for image restoration. In ICCV, 2023. 1, 2, 6

  57. [57]

    Snr-aware low-light image enhancement

    Xiaogang Xu, Ruixing Wang, Chi-Wing Fu, and Jiaya Jia. Snr-aware low-light image enhancement. In CVPR, pages 17714–17724, 2022. 8

  58. [58]

    Prism: Progressive rain removal with integrated state- space modeling

    Pengze Xue, Shanwen Wang, Fei Zhou, Yan Cui, and Xin Sun. Prism: Progressive rain removal with integrated state- space modeling. arXiv preprint arXiv:2509.26413, 2025. 6

  59. [59]

    You only need one color space: An efficient network for low-light image enhancement

    Qingsen Yan, Yixu Feng, Cheng Zhang, Pei Wang, Peng Wu, Wei Dong, Jinqiu Sun, and Yanning Zhang. You only need one color space: An efficient network for low-light image enhancement. CVPR, 2025. 6, 8

  60. [60]

    Dnlut: Ultra-efficient color image denoising via channel-aware lookup tables

    Sidi Yang, Binxiao Huang, Yulun Zhang, Dahai Yu, Yujiu Yang, and Ngai Wong. Dnlut: Ultra-efficient color image denoising via channel-aware lookup tables. In CVPR, pages 7582–7591, 2025. 6

  61. [61]

    Deep joint rain detection and removal from a single image

    Wenhan Yang, Robby T Tan, Jiashi Feng, Jiaying Liu, Zong- ming Guo, and Shuicheng Yan. Deep joint rain detection and removal from a single image. In CVPR, pages 1357–1366,

  62. [62]

    Sparse gradient regularized deep retinex network for robust low-light image enhancement

    Wenhan Yang, Wenjing Wang, Haofeng Huang, Shiqi Wang, and Jiaying Liu. Sparse gradient regularized deep retinex network for robust low-light image enhancement. IEEE Transactions on Image Processing, 30:2072–2086, 2021. 6

  63. [63]

    Diff-retinex: Rethinking low-light image enhancement with a generative diffusion model

    Xunpeng Yi, Han Xu, Hao Zhang, Linfeng Tang, and Jiayi Ma. Diff-retinex: Rethinking low-light image enhancement with a generative diffusion model. In ICCV, pages 12302– 12311, 2023. 2, 6

  64. [64]

    Diff-retinex++: Retinex-driven reinforced diffusion model for low-light image enhancement

    Xunpeng Yi, Han Xu, Hao Zhang, Linfeng Tang, and Ji- ayi Ma. Diff-retinex++: Retinex-driven reinforced diffusion model for low-light image enhancement. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025. 2

  65. [65]

    Deep variational net- work toward blind image restoration

    Zongsheng Yue, Hongwei Yong, Qian Zhao, Lei Zhang, Deyu Meng, and Kwan-Yee K Wong. Deep variational net- work toward blind image restoration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(11):7011– 7026, 2024. 6

  66. [66]

    Density-aware single image de-raining using a multi-stream dense network

    He Zhang and Vishal M Patel. Density-aware single image de-raining using a multi-stream dense network. In CVPR, pages 695–704, 2018. 6

  67. [67]

    Im- age de-raining using a conditional generative adversarial net- work

    He Zhang, Vishwanath Sindagi, and Vishal M Patel. Im- age de-raining using a conditional generative adversarial net- work. IEEE transactions on circuits and systems for video technology, 30(11):3943–3956, 2019. 6

  68. [68]

    Deep un- folding network for image super-resolution

    Kai Zhang, Luc Van Gool, and Radu Timofte. Deep un- folding network for image super-resolution. In CVPR, pages 3217–3226, 2020. 2

  69. [69]

    Empowering low- light image enhancer through customized learnable priors

    Naishan Zheng, Man Zhou, Yanmeng Dong, Xiangyu Rui, Jie Huang, Chongyi Li, and Feng Zhao. Empowering low- light image enhancer through customized learnable priors. In ICCV, pages 12559–12569, 2023. 6, 8

  70. [70]

    Leveraging local and global cues for visual tracking via parallel interaction net- work

    Yaozong Zheng, Bineng Zhong, Qihua Liang, Zhenjun Tang, Rongrong Ji, and Xianxian Li. Leveraging local and global cues for visual tracking via parallel interaction net- work. IEEE Transactions on Circuits and Systems for Video Technology, 33(4):1671–1683, 2022. 1

  71. [71]

    Toward unified token learning for vision-language tracking

    Yaozong Zheng, Bineng Zhong, Qihua Liang, Guorong Li, Rongrong Ji, and Xianxian Li. Toward unified token learning for vision-language tracking. IEEE Transactions on Circuits and Systems for Video Technology, 34(4):2125–2135, 2023

  72. [72]

    Odtrack: Online dense temporal token learning for visual tracking

    Yaozong Zheng, Bineng Zhong, Qihua Liang, Zhiyi Mo, Shengping Zhang, and Xianxian Li. Odtrack: Online dense temporal token learning for visual tracking. In Proceedings of the AAAI conference on artificial intelligence, pages 7588–7596, 2024

  73. [73]

    Decoupled spatio-temporal consistency learning for self-supervised tracking

    Yaozong Zheng, Bineng Zhong, Qihua Liang, Ning Li, and Shuxiang Song. Decoupled spatio-temporal consistency learning for self-supervised tracking. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 10635– 10643, 2025. 1

  74. [74]

    To- wards universal modal tracking with online dense temporal token learning

    Yaozong Zheng, Bineng Zhong, Qihua Liang, Shengping Zhang, Guorong Li, Xianxian Li, and Rongrong Ji. To- wards universal modal tracking with online dense temporal token learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025. 1

  75. [75]

    Underwater camera: Improv- ing visual perception via adaptive dark pixel prior and color correction

    Jingchun Zhou, Qian Liu, Qiuping Jiang, Wenqi Ren, Kin- Man Lam, and Weishi Zhang. Underwater camera: Improv- ing visual perception via adaptive dark pixel prior and color correction. International Journal of Computer Vision, pages 1–19, 2023. 6

  76. [76]

    Adapt or perish: Adaptive sparse transformer with attentive feature refinement for image restoration

    Shihao Zhou, Duosheng Chen, Jinshan Pan, Jinglei Shi, and Jufeng Yang. Adapt or perish: Adaptive sparse transformer with attentive feature refinement for image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 2952–2963,

  77. [77]

    Seeing the unseen: A fre- quency prompt guided transformer for image restoration

    Shihao Zhou, Jinshan Pan, Jinglei Shi, Duosheng Chen, Lishen Qu, and Jufeng Yang. Seeing the unseen: A fre- quency prompt guided transformer for image restoration. In ECCV, pages 246–264. Springer, 2024. 6

  78. [78]

    Trim-sod: A multi- modal, multi-task, and multi-scale spacecraft optical dataset

    Tianyu Zhu, Hesong Li, and Ying Fu. Trim-sod: A multi- modal, multi-task, and multi-scale spacecraft optical dataset. Space: Science & Technology, 5:0299, 2025. 1