pith. machine review for the scientific record.

arxiv: 2605.00719 · v1 · submitted 2026-05-01 · 💻 cs.CV


Unpaired Image Deraining Using Reward-Guided Self-Reinforcement Strategy

Yinghao Chen , Yeying Jin , Xiang Chen , Yanyan Wei , Ziyang Yan , Yaowen Fu


Pith reviewed 2026-05-09 19:22 UTC · model grok-4.3

classification 💻 cs.CV
keywords image deraining · unsupervised learning · self-reinforcement · image quality assessment · rain removal · computer vision · pseudo-paired data

The pith

Unpaired image deraining improves when occasional high-quality outputs are selected by image quality assessment and recycled as rewards to guide further training.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses the difficulty of training deraining networks without paired clean and rainy images, where the lack of strong constraints leads to poor convergence across diverse rain patterns. It proposes that high-quality derained results sometimes emerge during training and can be identified by image quality assessment to create dynamic rewards. These rewards feed a self-reinforcement stage that adds a reinforced loss term, narrowing the optimization space and pulling outputs closer to clean images. The resulting method reaches leading performance on synthetic paired data, real paired data, and real unpaired data, outperforming prior unsupervised approaches in both visual and metric evaluations. The same reinforcement idea also transfers to other unsupervised deraining models and integrates with supervised networks.

Core claim

We introduce RGSUD consisting of reward recycling and self-reinforcement training. An IQA-based dynamic reward recycling mechanism selects the best derained outputs that appear during training and continuously assembles a set of high-quality images. These collected rewards are inserted into the model's optimization through a self-reinforced loss, which improves the quality of synthesized pseudo-paired data and stabilizes training so that derained outputs align more closely with clean images.

What carries the argument

IQA-based dynamic reward recycling mechanism that selects optimal derained outputs during training and reuses them as rewards inside a self-reinforcement training loop.
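A rough sketch of this loop, purely illustrative: `RewardBank`, the top-k selection rule, and the L1 form of the reinforced loss are stand-ins, since the paper's exact formulation is not given in the text above.

```python
import numpy as np

class RewardBank:
    """Collects the highest-IQA derained outputs seen so far during training."""

    def __init__(self, capacity=64):
        self.capacity = capacity
        self.items = []  # (score, image) pairs, best first

    def offer(self, images, scores):
        # Add candidates, then keep only the top-`capacity` by IQA score.
        for img, s in zip(images, scores):
            self.items.append((float(s), np.array(img, copy=True)))
        self.items.sort(key=lambda p: p[0], reverse=True)
        del self.items[self.capacity:]

    def sample(self, n, rng):
        idx = rng.choice(len(self.items), size=min(n, len(self.items)), replace=False)
        return np.stack([self.items[i][1] for i in idx])

def self_reinforced_loss(outputs, rewards):
    # One plausible form: pull current outputs toward banked rewards via L1.
    k = min(len(outputs), len(rewards))
    return float(np.mean(np.abs(outputs[:k] - rewards[:k])))
```

In a training step one would score the derained batch with the IQA model, `offer` it to the bank, and add a weighted `self_reinforced_loss` term to the total objective.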

If this is right

  • The optimization space is constrained, leading to more stable convergence even when rain degradations are diverse and complex.
  • Superior deraining quality is obtained on paired synthetic, paired real, and unpaired real datasets, beating existing unsupervised methods in both subjective and objective IQA scores.
  • The self-reinforcement strategy can be plugged into other unsupervised deraining algorithms to raise their performance.
  • The overall framework generalizes across multiple existing supervised deraining networks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same pattern of harvesting occasional good intermediate results via an automatic quality scorer could stabilize training loops in related unsupervised restoration tasks such as dehazing or denoising.
  • By depending on self-generated rewards rather than external paired data, the approach points toward lower data-collection costs for building practical rain-removal systems in new environments.
  • Extending the method to video sequences would test whether the reward recycling remains effective when temporal consistency across frames must also be maintained.

Load-bearing premise

High-quality derained images appear occasionally during training and standard image quality assessment can reliably pick them out to produce rewards that genuinely move the model toward clean-image outputs.

What would settle it

A training run in which the IQA selector consistently chooses low-quality or average outputs, producing no gain or a drop in final deraining metrics on unpaired real-image test sets compared with a plain unsupervised baseline.

Figures

Figures reproduced from arXiv: 2605.00719 by Xiang Chen, Yanyan Wei, Yaowen Fu, Yeying Jin, Yinghao Chen, Ziyang Yan.

Figure 1: Observation, Motivation, Methodology, and Performance of Our Work. (a) PSNR statistics of deraining results during unsupervised training on the Rain100L [82] and RealRain1K-L [39] datasets, showing the observed performance during training. (b) Image Quality Assessment (IQA) scores from the Vision Language Model (VLM)-based DACLIP-IQA [58] on the Rain100L dataset. VLM-based IQA provides perceptual scores as rew…
Figure 2: The framework of the proposed RGSUD. (a) Overview of the network architecture, illustrating the Reward Recycling and Self…
Figure 3: Qualitative deraining performance comparisons on the Rain200L, DID-Data, and RealRain1K-L datasets. Our RGSUD achieves…
Figure 4: Compared with derained results on the SIRR and Real3000 datasets, RGSUD recovers clearer images. In outdoor…
Figure 5: The application performance of derained images in…
Figure 6: Visualization results…
original abstract

Unsupervised deraining has attracted attention for its ability to learn the real-world distribution of rain without paired supervision. However, the lack of strong constraints makes it difficult for the network to converge, especially with the complex diversity of rain degradation. A key motivation is that high-quality deraining results occasionally emerge during training, which can be leveraged to guide the optimization process. To overcome these challenges, we introduce RGSUD (Reward-Guided Self-Reinforcement Unsupervised Image Deraining), comprising two key stages: reward recycling and self-reinforcement (SR) training. For the former stage, we propose an Image Quality Assessment (IQA)-based dynamic reward recycling mechanism that selects optimal derained outputs during training and continuously collects high-quality deraining images. In the latter stage, we incorporate these rewards into the model's optimization process, constraining the optimization space and improving alignment between derained outputs and clean images. By leveraging IQA-based self-reinforced loss and dynamically updated rewards, we enhance the quality of synthesized pseudo-paired data and stabilize the optimization. Extensive experiments demonstrate that our method achieves SOTA performance across multiple datasets, including paired synthetic, paired real, and unpaired real images, outperforming existing unsupervised deraining approaches in both subjective and objective IQA metrics. Additionally, we show that the self-reinforcement strategy is adaptable to other unsupervised deraining methods and our deraining framework demonstrates strong generalization across existing supervised deraining networks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes RGSUD, an unsupervised deraining framework with two stages: (1) an IQA-based dynamic reward recycling mechanism that selects and collects high-quality derained outputs as they occasionally emerge during training, and (2) a self-reinforcement (SR) training stage that incorporates these selected rewards into the loss to constrain optimization, generate pseudo-paired data, and improve alignment with clean images. It claims SOTA performance over existing unsupervised methods on paired synthetic, paired real, and unpaired real datasets in both subjective and objective IQA metrics, plus adaptability of the SR strategy to other unsupervised deraining methods and generalization to supervised networks.

Significance. If the core mechanism holds, the work could offer a practical way to stabilize and improve unsupervised image restoration by recycling occasional high-quality outputs via proxy IQA signals rather than requiring paired supervision or external clean data. The adaptability claim, if experimentally supported, would be a useful contribution for the broader unsupervised restoration community.

major comments (3)
  1. [reward recycling stage (as described in the abstract and method overview)] The load-bearing assumption that the IQA-based dynamic reward recycling mechanism reliably selects derained outputs aligned with the true clean-image distribution (rather than merely high-scoring but incorrect results such as over-smoothed or artifact-introduced images) is not validated. In the unpaired setting this selection directly feeds the self-reinforcement loss and pseudo-pair generation; without a reported correlation study against ground-truth error or an independence check between the IQA metric and actual degradation, the optimization constraint may reinforce proxy scores instead of true quality.
  2. [self-reinforcement (SR) training stage] The self-reinforcement training stage incorporates the same IQA signals used for reward selection into the loss function. This creates a potential circularity: the optimization is guided by the very metric that selected the training signals, with no reported mechanism (e.g., held-out validation or alternative metric) to ensure the reinforcement improves actual deraining fidelity rather than IQA score alone.
  3. [abstract and experimental claims] The abstract asserts SOTA results across multiple datasets and superiority in both subjective and objective IQA metrics, yet the provided text supplies no experimental protocol, baselines, ablation studies, or quantitative tables. Without these details the central performance claim cannot be evaluated and the soundness of the method remains unverifiable.
minor comments (2)
  1. [method] The notation and exact formulation of the IQA-based reward (e.g., which no-reference or full-reference metrics are used, how the dynamic threshold is computed, and the precise form of the self-reinforced loss) should be stated explicitly with equations in the method section for reproducibility.
  2. [reward recycling stage] Clarify whether the IQA metrics employed are reference-based (which would require clean images not available in unpaired training) or no-reference, and how this choice affects the claimed alignment with clean images.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the thoughtful and detailed comments, which highlight important aspects of validation and clarity in our work. We address each major comment point by point below, with plans for revisions where appropriate to strengthen the manuscript.

point-by-point responses
  1. Referee: [reward recycling stage (as described in the abstract and method overview)] The load-bearing assumption that the IQA-based dynamic reward recycling mechanism reliably selects derained outputs aligned with the true clean-image distribution (rather than merely high-scoring but incorrect results such as over-smoothed or artifact-introduced images) is not validated. In the unpaired setting this selection directly feeds the self-reinforcement loss and pseudo-pair generation; without a reported correlation study against ground-truth error or an independence check between the IQA metric and actual degradation, the optimization constraint may reinforce proxy scores instead of true quality.

    Authors: We agree that direct validation of the IQA selection's alignment with true quality is valuable. While the primary evaluation uses unpaired real-world data without ground truth, the manuscript already includes experiments on paired synthetic datasets. In the revision, we will add a dedicated analysis subsection that computes Pearson/Spearman correlations between the IQA scores of dynamically selected outputs and their ground-truth errors (PSNR/SSIM) on synthetic data. This will demonstrate that the recycling mechanism preferentially selects lower-error results rather than merely high-scoring artifacts, providing evidence that the proxy does not reinforce incorrect distributions. revision: yes
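    The promised correlation analysis could look like the following sketch. `psnr` and `spearman` are standard definitions; pairing each selected output's IQA score with its ground-truth PSNR on synthetic data is the hypothetical part.

    ```python
    import numpy as np

    def psnr(x, y, peak=1.0):
        """Peak signal-to-noise ratio in dB between images valued in [0, peak]."""
        mse = float(np.mean((x - y) ** 2))
        return 10.0 * np.log10(peak ** 2 / mse)

    def spearman(a, b):
        """Rank correlation (assumes no ties, adequate for continuous scores)."""
        ra = np.argsort(np.argsort(a)).astype(float)
        rb = np.argsort(np.argsort(b)).astype(float)
        ra -= ra.mean()
        rb -= rb.mean()
        return float((ra * rb).sum() / np.sqrt((ra ** 2).sum() * (rb ** 2).sum()))
    ```

    A strongly positive `spearman(iqa_scores, psnr_values)` over the selected outputs would support the selection premise; a weak or negative one would flag proxy reinforcement.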

  2. Referee: [self-reinforcement (SR) training stage] The self-reinforcement training stage incorporates the same IQA signals used for reward selection into the loss function. This creates a potential circularity: the optimization is guided by the very metric that selected the training signals, with no reported mechanism (e.g., held-out validation or alternative metric) to ensure the reinforcement improves actual deraining fidelity rather than IQA score alone.

    Authors: The concern about circularity is valid and merits clarification. The reward recycling is dynamic and occurs only for intermittently high-quality outputs during training, while the SR loss applies these as constraints to stabilize convergence toward cleaner outputs. To address potential proxy overfitting, the revision will include an ablation study that evaluates final model performance using a held-out alternative IQA metric (distinct from the one used for selection and loss) as well as standard fidelity metrics on both synthetic and real test sets. This will show that SR improves actual deraining quality beyond the selection metric alone. revision: partial
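    A minimal sketch of such a held-out check, with toy stand-in metrics rather than the actual IQA models used for selection.

    ```python
    import numpy as np

    def mean_score(images, metric):
        return float(np.mean([metric(im) for im in images]))

    def sr_gains(baseline_outs, sr_outs, metrics):
        """Per-metric improvement of SR training over the plain baseline."""
        return {name: mean_score(sr_outs, m) - mean_score(baseline_outs, m)
                for name, m in metrics.items()}
    ```

    If the gain under the held-out metric is also positive, the improvement is not merely an artifact of optimizing the selection proxy.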

  3. Referee: [abstract and experimental claims] The abstract asserts SOTA results across multiple datasets and superiority in both subjective and objective IQA metrics, yet the provided text supplies no experimental protocol, baselines, ablation studies, or quantitative tables. Without these details the central performance claim cannot be evaluated and the soundness of the method remains unverifiable.

    Authors: The full manuscript contains a complete Experiments section (Section 4) that specifies the evaluation protocol (including training details, datasets for paired synthetic, paired real, and unpaired real scenarios), all baselines compared, ablation studies on the reward recycling and SR components, quantitative tables with PSNR/SSIM and multiple IQA metrics, and qualitative visual comparisons. We will revise the section to make the protocol and tables more prominently cross-referenced from the abstract and method overview, and expand any protocol descriptions that may have been insufficiently detailed in the reviewed version. revision: partial

Circularity Check

0 steps flagged

No significant circularity in the method's logic or claims.

full rationale

The paper presents an algorithmic framework (RGSUD) with two stages: IQA-based reward recycling to select derained outputs during training, followed by self-reinforcement training that incorporates those selections into the loss for pseudo-paired data generation. This is a design choice using an external IQA proxy as reward signal rather than any first-principles derivation, mathematical reduction, or self-referential definition. No equations are shown reducing a 'prediction' to a fitted input by construction, no uniqueness theorems are imported via self-citation, and no ansatz is smuggled in. Performance claims rest on experimental results across datasets, not on tautological equivalence to inputs. The load-bearing assumption (IQA selects outputs aligned with clean images) is an empirical hypothesis open to falsification, not a circularity per the enumerated patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the unverified emergence of usable high-quality samples during unsupervised training and the assumption that IQA can serve as an independent reward signal.

axioms (1)
  • domain assumption High-quality deraining results occasionally emerge during training and can be selected by IQA
    Explicitly stated as key motivation in the abstract.

pith-pipeline@v0.9.0 · 5571 in / 1089 out tokens · 25266 ms · 2026-05-09T19:22:59.948687+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

99 extracted references · 16 canonical work pages · 1 internal anchor

  1. [1]

    Simple baselines for image restoration

    Liangyu Chen, Xiaojie Chu, Xiangyu Zhang, and Jian Sun. Simple baselines for image restoration. In European Conference on Computer Vision, pages 17–33. Springer, 2022.

  2. [2]

    Rethinking Atrous Convolution for Semantic Image Segmentation

    Liang-Chieh Chen, George Papandreou, Florian Schroff, and Hartwig Adam. Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587.

  3. [3]

    Genhaze: Pioneering controllable one-step realistic haze generation for real-world dehazing

    Sixiang Chen, Tian Ye, Yunlong Lin, Yeying Jin, Yijun Yang, Haoyu Chen, Jianyu Lai, Song Fei, Zhaohu Xing, Fugee Tsung, et al. Genhaze: Pioneering controllable one-step realistic haze generation for real-world dehazing. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 9194–9205, 2025.

  4. [4]

    Dual-rain: Video rain removal using assertive and gentle teachers

    Tingting Chen, Beibei Lin, Yeying Jin, Wending Yan, Wei Ye, Yuan Yuan, and Robby T Tan. Dual-rain: Video rain removal using assertive and gentle teachers. In European Conference on Computer Vision, pages 127–143. Springer.

  5. [5]

    Unpaired deep image deraining using dual contrastive learning

    Xiang Chen, Jinshan Pan, Kui Jiang, Yufeng Li, Yufeng Huang, Caihua Kong, Longgang Dai, and Zhentao Fan. Unpaired deep image deraining using dual contrastive learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2017–2026, 2022.

  6. [6]

    Learning a sparse transformer network for effective image deraining

    Xiang Chen, Hao Li, Mingqiang Li, and Jinshan Pan. Learning a sparse transformer network for effective image deraining. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5896–5905.

  7. [7]

    Bidirectional multi-scale implicit neural representations for image deraining

    Xiang Chen, Jinshan Pan, and Jiangxin Dong. Bidirectional multi-scale implicit neural representations for image deraining. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 25627–25636.

  8. [8]

    Enhancing rainy image via invertible networks

    Yinghao Chen and Yaowen Fu. Enhancing rainy image via invertible networks. In 2025 10th International Conference on Signal and Image Processing (ICSIP), pages 201–205.

  9. [9]

    Single-photon 3d imaging with a multi-stage network

    Ying-Hao Chen, Jian Li, Shi-Peng Xie, and Qin Wang. Single-photon 3d imaging with a multi-stage network. Optics Express, 30(16):29173–29188, 2022.

  10. [10]

    Transfer clip for generalizable image denoising

    Jun Cheng, Dong Liang, and Shan Tan. Transfer clip for generalizable image denoising. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 25974–25984, 2024.

  11. [11]

    From global statistic to local statistic: Micro-doppler period estimation based on short-time similarity statistic

    Yiwei Dai, Wenpeng Zhang, and Yongxiang Liu. From global statistic to local statistic: Micro-doppler period estimation based on short-time similarity statistic. IEEE Transactions on Signal Processing, 72:1269–1285, 2024.

  12. [12]

    Irregular micromotion period measurement: Reentry boosters as a case study

    Yiwei Dai, Wenpeng Zhang, Yongxiang Liu, Jie Yu, Xiang Li, and Xiangke Liao. Irregular micromotion period measurement: Reentry boosters as a case study. IEEE Transactions on Instrumentation and Measurement, 74:1–19, 2025.

  13. [13]

    Channel consistency prior and self-reconstruction strategy based unsupervised image deraining

    Guanglu Dong, Tianheng Zheng, Yuanzhouhan Cao, Linbo Qing, and Chao Ren. Channel consistency prior and self-reconstruction strategy based unsupervised image deraining. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 7469–7479, 2025.

  14. [14]

    Learning domain-aware task prompt representations for multi-domain all-in-one image restoration

    Guanglu Dong, Chunlei Li, Chao Ren, Jingliang Hu, Yilei Shi, Xiao Xiang Zhu, and Lichao Mou. Learning domain-aware task prompt representations for multi-domain all-in-one image restoration, 2026.

  15. [15]

    Mose: Monocular semantic reconstruction using nerf-lifted noisy priors

    Zhenhua Du, Binbin Xu, Haoyu Zhang, Kai Huo, and Shuaifeng Zhi. Mose: Monocular semantic reconstruction using nerf-lifted noisy priors. IEEE Robotics and Automation Letters, 9(11):10343–10350, 2024.

  16. [16]

    Photon: Speedup volume understanding with efficient multimodal large language models

    Chengyu Fang, Heng Guo, Zheng Jiang, Chunming He, Xiu Li, and Minfeng Xu. Photon: Speedup volume understanding with efficient multimodal large language models. In The Fourteenth International Conference on Learning Representations.

  17. [17]

    Real-world image dehazing with coherence-based pseudo labeling and cooperative unfolding network

    Chengyu Fang, Chunming He, Fengyang Xiao, Yulun Zhang, Longxiang Tang, Yuelin Zhang, Kai Li, and Xiu Li. Real-world image dehazing with coherence-based pseudo labeling and cooperative unfolding network. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024.

  18. [18]

    Integrating extra modality helps segmentor find camouflaged objects well

    Chengyu Fang, Chunming He, Longxiang Tang, Yuelin Zhang, Chenyang Zhu, Yuqi Shen, Chubin Chen, Guoxia Xu, and Xiu Li. Integrating extra modality helps segmentor find camouflaged objects well. arXiv preprint arXiv:2502.14471, 2025.

  19. [19]

    Removing rain from single images via a deep detail network

    Xueyang Fu, Jiabin Huang, Delu Zeng, Yue Huang, Xinghao Ding, and John Paisley. Removing rain from single images via a deep detail network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3855–3863, 2017.

  20. [20]

    Onerestore: A universal restoration framework for composite degradation

    Yu Guo, Yuan Gao, Yuxu Lu, Huilin Zhu, Ryan Wen Liu, and Shengfeng He. Onerestore: A universal restoration framework for composite degradation. In European Conference on Computer Vision, pages 255–272, 2024.

  21. [21]

    Reti-diff: Illumination degradation image restoration with retinex-based latent diffusion model

    Chunming He, Chengyu Fang, Yulun Zhang, Tian Ye, Kai Li, Longxiang Tang, Zhenhua Guo, Xiu Li, and Sina Farsiu. Reti-diff: Illumination degradation image restoration with retinex-based latent diffusion model. arXiv preprint arXiv:2311.11638, 2023.

  22. [22]

    Diffusion models in low-level vision: A survey

    Chunming He, Yuqi Shen, Chengyu Fang, Fengyang Xiao, Longxiang Tang, Yulun Zhang, Wangmeng Zuo, Zhenhua Guo, and Xiu Li. Diffusion models in low-level vision: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025.

  23. [23]

    Auto-regressively generating multi-view consistent images

    JiaKui Hu, Yuxiao Yang, Jialun Liu, Jinbo Wu, Chen Zhao, and Yanye Lu. Auto-regressively generating multi-view consistent images. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 2556–2566.

  24. [24]

    Omni-view: Unlocking how generation facilitates understanding in unified 3d model based on multiview images

    JiaKui Hu, Shanshan Zhao, Qing-Guo Chen, Xuerui Qiu, Jialun Liu, Zhao Xu, Weihua Luo, Kaifu Zhang, and Yanye Lu. Omni-view: Unlocking how generation facilitates understanding in unified 3d model based on multiview images, 2025.

  25. [25]

    Geometry-as-context: Modulating explicit 3d in scene-consistent video generation to geometry context

    JiaKui Hu, Jialun Liu, Liying Yang, Xinliang Zhang, Kaiwen Li, Shuang Zeng, Yuanwei Li, Haibin Huang, Chi Zhang, and Yanye Lu. Geometry-as-context: Modulating explicit 3d in scene-consistent video generation to geometry context. arXiv preprint arXiv:2602.21929, 2026.

  26. [26]

    Medical sam3: A foundation model for universal prompt-driven medical image segmentation

    Chongcong Jiang, Tianxingjian Ding, Chuhan Song, Jiachen Tu, Ziyang Yan, Yihua Shao, Zhenyi Wang, Yuzhang Shang, Tianyu Han, and Yu Tian. Medical sam3: A foundation model for universal prompt-driven medical image segmentation. arXiv preprint arXiv:2601.10880, 2026.

  27. [27]

    Multi-scale progressive fusion network for single image deraining

    Kui Jiang, Zhongyuan Wang, Peng Yi, Chen Chen, Baojin Huang, Yimin Luo, Jiayi Ma, and Junjun Jiang. Multi-scale progressive fusion network for single image deraining. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8346–8355, 2020.

  28. [28]

    Autodir: Automatic all-in-one image restoration with latent diffusion

    Yitong Jiang, Zhaoyang Zhang, Tianfan Xue, and Jinwei Gu. Autodir: Automatic all-in-one image restoration with latent diffusion. In European Conference on Computer Vision, pages 340–359. Springer, 2024.

  29. [29]

    Dc-shadownet: Single-image hard and soft shadow removal using unsupervised domain-classifier guided network

    Yeying Jin, Aashish Sharma, and Robby T Tan. Dc-shadownet: Single-image hard and soft shadow removal using unsupervised domain-classifier guided network. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 5027–5036, 2021.

  30. [30]

    Structure representation network and uncertainty feedback learning for dense non-uniform fog removal

    Yeying Jin, Wending Yan, Wenhan Yang, and Robby T Tan. Structure representation network and uncertainty feedback learning for dense non-uniform fog removal. In Proceedings of the Asian Conference on Computer Vision, pages 2041–2058, 2022.

  31. [31]

    Unsupervised night image enhancement: When layer decomposition meets light-effects suppression

    Yeying Jin, Wenhan Yang, and Robby T Tan. Unsupervised night image enhancement: When layer decomposition meets light-effects suppression. In European Conference on Computer Vision, pages 404–421. Springer, 2022.

  32. [32]

    Enhancing visibility in nighttime haze images using guided apsf and gradient adaptive convolution

    Yeying Jin, Beibei Lin, Wending Yan, Yuan Yuan, Wei Ye, and Robby T Tan. Enhancing visibility in nighttime haze images using guided apsf and gradient adaptive convolution. In Proceedings of the 31st ACM International Conference on Multimedia, pages 2446–2457, 2023.

  33. [33]

    Raindrop clarity: A dual-focused dataset for day and night raindrop removal

    Yeying Jin, Xin Li, Jiadong Wang, Yan Zhang, and Malu Zhang. Raindrop clarity: A dual-focused dataset for day and night raindrop removal. In European Conference on Computer Vision, pages 1–17. Springer, 2025.

  34. [34]

    Ultralytics YOLOv8

    Glenn Jocher. Ultralytics YOLOv8. https://github.com/ultralytics, 2023.

  35. [35]

    Musiq: Multi-scale image quality transformer

    Junjie Ke, Qifei Wang, Yilin Wang, Peyman Milanfar, and Feng Yang. Musiq: Multi-scale image quality transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 5148–5157, 2021.

  36. [36]

    Luminance-aware statistical quantization: Unsupervised hierarchical learning for illumination enhancement

    Derong Kong, Zhixiong Yang, Shengxi Li, Shuaifeng Zhi, Li Liu, Zhen Liu, and Jingyuan Xia. Luminance-aware statistical quantization: Unsupervised hierarchical learning for illumination enhancement, 2025.

  37. [37]

    When schrodinger bridge meets real-world image dehazing with unpaired training

    Yunwei Lan, Zhigao Cui, Xin Luo, Chang Liu, Nian Wang, Menglin Zhang, Yanzhao Su, and Dong Liu. When schrodinger bridge meets real-world image dehazing with unpaired training. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 8756–8765.

  38. [38]

    Instruct2see: Learning to remove any obstructions across distributions

    Junhang Li, Yu Guo, Chuhua Xian, and Shengfeng He. Instruct2see: Learning to remove any obstructions across distributions. In International Conference on Machine Learning, 2025.

  39. [39]

    Toward real-world single image deraining: A new benchmark and beyond

    Wei Li, Qiming Zhang, Jing Zhang, Zhen Huang, Xinmei Tian, and Dacheng Tao. Toward real-world single image deraining: A new benchmark and beyond. arXiv preprint arXiv:2206.05514, 2022.

  40. [40]

    Saratr-x: Towards building a foundation model for sar target recognition

    Weijie Li, Wei Yang, Yuenan Hou, Li Liu, Yongxiang Liu, and Xiang Li. Saratr-x: Towards building a foundation model for sar target recognition. IEEE Transactions on Image Processing, 2025.

  41. [41]

    Ntire 2025 challenge on day and night raindrop removal for dual-focused images: Methods and results

    Xin Li, Yeying Jin, Xin Jin, Zongwei Wu, Bingchen Li, Yufei Wang, Wenhan Yang, Yu Li, Zhibo Chen, Bihan Wen, et al. Ntire 2025 challenge on day and night raindrop removal for dual-focused images: Methods and results. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 1172–1183, 2025.

  42. [42]

    Gridless super-resolution sparse aperture isar imaging via orthonormal atomic norm minimization techniques

    Yingjun Li, Yongxiang Liu, Wenpeng Zhang, Yaowen Fu, Shuanghui Zhang, and Wei Yang. Gridless super-resolution sparse aperture isar imaging via orthonormal atomic norm minimization techniques. IEEE Transactions on Antennas and Propagation, pages 1–1, 2026.

  43. [43]

    Gm-moe: Low-light enhancement with gated-mechanism mixture-of-experts

    Minwen Liao, Haobo Dong, Xinyi Wang, Kurban Ubul, Yihua Shao, and Ziyang Yan. Gm-moe: Low-light enhancement with gated-mechanism mixture-of-experts. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 8766–8776, 2025.

  44. [44]

    Nightrain: Nighttime video deraining via adaptive-rain-removal and adaptive-correction

    Beibei Lin, Yeying Jin, Wending Yan, Wei Ye, Yuan Yuan, Shunli Zhang, and Robby T Tan. Nightrain: Nighttime video deraining via adaptive-rain-removal and adaptive-correction. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 3378–3385, 2024.

  45. [45]

    Unsupervised image denoising in real-world scenarios via self-collaboration parallel generative adversarial branches

    Xin Lin, Chao Ren, Xiao Liu, Jie Huang, and Yinjie Lei. Unsupervised image denoising in real-world scenarios via self-collaboration parallel generative adversarial branches. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 12642–12652, 2023.

  46. [46]

    Xin Lin, Jingtong Yue, Kelvin CK Chan, Lu Qi, Chao Ren, Jinshan Pan, and Ming-Hsuan Yang. Multi-task image restoration guided by robust dino features. arXiv preprint arXiv:2312.01677, 2023.

  47. [47]

    Xin Lin, Jingtong Yue, Sixian Ding, Chao Ren, Lu Qi, and Ming-Hsuan Yang. Dual degradation representation for joint deraining and low-light enhancement in the dark. IEEE Transactions on Circuits and Systems for Video Technology, 35(3):2461–2473, 2024.

  48. [48]

    Xin Lin, Yuyan Zhou, Jingtong Yue, Chao Ren, Kelvin CK Chan, Lu Qi, and Ming-Hsuan Yang. Re-boosting self-collaboration parallel prompt gan for unsupervised image restoration. arXiv preprint arXiv:2408.09241, 2024.

  49. [49]

    Xin Lin, Yuyan Zhou, Jingtong Yue, Chao Ren, Kelvin CK Chan, Lu Qi, and Ming-Hsuan Yang. Re-boosting self-collaboration parallel prompt gan for unsupervised image restoration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025.

  50. [50]

    Yunlong Lin, Zixu Lin, Haoyu Chen, Panwang Pan, Chenxin Li, Sixiang Chen, Kairun Wen, Yeying Jin, Wenbo Li, and Xinghao Ding. Jarvisir: Elevating autonomous driving perception with intelligent image restoration. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 22369–22380, 2025.

  51. [51]

    Afei Liu, Shuanghui Zhang, Chi Zhang, Shuaifeng Zhi, and Xiang Li. Ranerf: Neural 3-d reconstruction of space targets from isar image sequences. IEEE Transactions on Geoscience and Remote Sensing, 61:1–15, 2023.

  52. [52]

    Chengxu Liu, Lu Qi, Jinshan Pan, Xueming Qian, and Ming-Hsuan Yang. Frequency domain-based diffusion model for unpaired image dehazing. arXiv preprint arXiv:2507.01275, 2025.

  53. [53]

    Chengxu Liu, Lu Qi, Jinshan Pan, Xueming Qian, and Ming-Hsuan Yang. Learning deblurring texture prior from unpaired data with diffusion model. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 14195–14204, 2025.

  54. [54]

    Kai Liu, Ziqing Zhang, Wenbo Li, Renjing Pei, Fenglong Song, Xiaohong Liu, Linghe Kong, and Yulun Zhang. Dog-iqa: Standard-guided zero-shot mllm for mix-grained image quality assessment. arXiv preprint arXiv:2410.02505, 2024.

  55. [55]

    Li Liu, Wanli Ouyang, Xiaogang Wang, Paul Fieguth, Jie Chen, Xinwang Liu, and Matti Pietikäinen. Deep learning for generic object detection: A survey. International Journal of Computer Vision, 128(2):261–318, 2020.

  56. [56]

    Li Liu, Shuzhou Sun, Shuaifeng Zhi, Fan Shi, Zhen Liu, Janne Heikkilä, and Yongxiang Liu. A causal adjustment module for debiasing scene graph generation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 47(5):4024–4043, 2025.

  57. [57]

    Yang Liu, Ziyu Yue, Jinshan Pan, and Zhixun Su. Unpaired learning for deep image deraining with rain direction regularizer. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 4753–4761, 2021.

  58. [58]

    Ziwei Luo, Fredrik K Gustafsson, Zheng Zhao, Jens Sjölund, and Thomas B Schön. Controlling vision-language models for multi-task image restoration. In International Conference on Learning Representations, 2024.

  59. [59]

    Fabio Remondino, Ali Karami, Ziyang Yan, Gabriele Mazzacca, Simone Rigon, and Rongjun Qin. A critical analysis of nerf-based 3d reconstruction. Remote Sensing, 15(14):3585, 2023.

  60. [60]

    Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 234–241. Springer, 2015.

  61. [61]

    Yihua Shao, Haojin He, Sijie Li, Siyu Chen, Xinwei Long, Fanhu Zeng, Yuxuan Fan, Muyang Zhang, Ziyang Yan, Ao Ma, et al. Eventvad: Training-free event-aware video anomaly detection. In Proceedings of the 33rd ACM International Conference on Multimedia, pages 2586–2595, 2025.

  62. [62]

    Yihua Shao, Xiaofeng Lin, Xinwei Long, Siyu Chen, Minxi Yan, Yang Liu, Ziyang Yan, Ao Ma, Hao Tang, and Jingcai Guo. Icm-fusion: In-context meta-optimized lora fusion for multi-task adaptation. arXiv preprint arXiv:2508.04153, 2025.

  63. [63]

    Yihua Shao, Yeling Xu, Xinwei Long, Siyu Chen, Ziyang Yan, Haoting Liu, Yan Wang, Hao Tang, and Yang Yang. Accidentblip: Agent of accident warning based on ma-former. In 2025 IEEE Intelligent Vehicles Symposium (IV), pages 2156–2161. IEEE, 2025.

  64. [64]

    Jingjia Shi, Shuaifeng Zhi, and Kai Xu. Planerectr++: Unified query learning for joint 3d planar reconstruction and pose estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025.

  65. [65]

    Zhuo Su, Li Liu, Matthias Müller, Jiehua Zhang, Diana Wofk, Ming-Ming Cheng, and Matti Pietikäinen. Rapid salient object detection with difference convolutional neural networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025.

  66. [66]

    Hossein Talebi and Peyman Milanfar. Nima: Neural image assessment. IEEE Transactions on Image Processing, 27(8):3998–4011, 2018.

  67. [67]

    Jianyi Wang, Kelvin CK Chan, and Chen Change Loy. Exploring clip for assessing the look and feel of images. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 2555–2563, 2023.

  68. [68]

    Nan Wang, Yuantao Chen, Lixing Xiao, Weiqing Xiao, Bohan Li, Zhaoxi Chen, Chongjie Ye, Shaocong Xu, Saining Zhang, Ziyang Yan, et al. Unifying appearance codes and bilateral grids for driving scene gaussian splatting. arXiv preprint arXiv:2506.05280, 2025.

  69. [69]

    Tianyu Wang, Xin Yang, Ke Xu, Shaozhe Chen, Qiang Zhang, and Rynson WH Lau. Spatial attentive single-image deraining with a high quality real rain dataset. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12270–12279, 2019.

  70. [70]

    Zhou Wang, Alan C Bovik, Hamid R Sheikh, and Eero P Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4):600–612, 2004.

  71. [71]

    Wei Wei, Deyu Meng, Qian Zhao, Zongben Xu, and Ying Wu. Semi-supervised transfer learning for image rain removal. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3877–3886, 2019.

  72. [72]

    Yanyan Wei, Zhao Zhang, Yang Wang, Mingliang Xu, Yi Yang, Shuicheng Yan, and Meng Wang. Deraincyclegan: Rain attentive cyclegan for single image deraining and rain-making. IEEE Transactions on Image Processing, 30:4788–4801, 2021.

  73. [73]

    Haoning Wu, Zicheng Zhang, Weixia Zhang, Chaofeng Chen, Liang Liao, Chunyi Li, Yixuan Gao, Annan Wang, Erli Zhang, Wenxiu Sun, et al. Q-align: Teaching lmms for visual scoring via discrete text-defined levels. arXiv preprint arXiv:2312.17090, 2023.

  74. [74]

    Hongtao Wu, Yijun Yang, Huihui Xu, Weiming Wang, Jinni Zhou, and Lei Zhu. Rainmamba: Enhanced locality learning with state space models for video deraining. In Proceedings of the 32nd ACM International Conference on Multimedia, pages 7881–7890, 2024.

  75. [75]

    Xiaogang Xu, Ruihang Chu, Jian Wang, Kun Zhou, Wenjie Shu, Harry Yang, Ser-Nam Lim, Hao Chen, and Liang Lin. Enhancing diffusion-based restoration models via difficulty-adaptive reinforcement learning with iqa reward, 2025.

  76. [76]

    Weilong Yan, Ming Li, Haipeng Li, Shuwei Shao, and Robby T Tan. Synthetic-to-real self-supervised robust depth estimation via learning with motion and structure priors. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 21880–21890, 2025.

  77. [77]

    Weilong Yan, Haipeng Li, Hao Xu, Nianjin Ye, Yihao Ai, Shuaicheng Liu, and Jingyu Hu. Las-comp: Zero-shot 3d completion with latent-spatial consistency. arXiv preprint arXiv:2602.18735, 2026.

  78. [78]

    Ziyang Yan, Gabriele Mazzacca, Simone Rigon, Elisa Mariarosaria Farella, Pawel Trybala, Fabio Remondino, et al. Nerfbk: a holistic dataset for benchmarking nerf-based 3d reconstruction. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 48(1):219–226, 2023.

  79. [79]

    Ziyang Yan, Lei Li, Yihua Shao, Siyu Chen, Zongkai Wu, Jenq-Neng Hwang, Hao Zhao, and Fabio Remondino. 3dsceneeditor: Controllable 3d scene editing with gaussian splatting. arXiv preprint arXiv:2412.01583, 2024.

  80. [80]

    Ziyang Yan, Wenzhen Dong, Yihua Shao, Yuhang Lu, Haiyang Liu, Jingwen Liu, Haozhe Wang, Zhe Wang, Yan Wang, Fabio Remondino, et al. Renderworld: World model with self-supervised 3d label. In 2025 IEEE International Conference on Robotics and Automation (ICRA), pages 6063–6070. IEEE, 2025.
