pith. machine review for the scientific record.

arxiv: 2604.02742 · v1 · submitted 2026-04-03 · 📡 eess.IV · cs.CV


Task-Guided Prompting for Unified Remote Sensing Image Restoration


Pith reviewed 2026-05-13 18:29 UTC · model grok-4.3

classification 📡 eess.IV cs.CV
keywords remote sensing image restoration · unified multi-task framework · task-guided prompting · image restoration · cloud removal · SAR despeckling · multi-modal restoration

The pith

A single network with task-specific prompts restores remote sensing images across five degradation types using one set of shared weights.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that learnable task embeddings can generate degradation-aware cues to modulate a decoder hierarchically, allowing one architecture to manage denoising, cloud removal, shadow removal, deblurring, and SAR despeckling across RGB, multispectral, SAR, and thermal infrared data. A sympathetic reader would care because real-world remote sensing observations routinely mix these degradations and sensor types, yet prior methods required separate specialized models for each case. By building a unified benchmark and showing gains on both joint training scenarios and unseen composite degradations, the work demonstrates that task-guided modulation can replace the need for multiple independent networks.

Core claim

TGPNet unifies five restoration tasks inside one architecture by inserting learnable task-specific embeddings that produce degradation-aware cues; these cues then hierarchically modulate features throughout the decoder while all weights remain shared, enabling precise adaptation to each pattern without separate models or retraining.

What carries the argument

Task-Guided Prompting (TGP), which creates learnable task-specific embeddings that generate degradation-aware cues for hierarchical modulation of decoder features.
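The modulation idea can be sketched concretely. The following is a hypothetical illustration in the spirit of FiLM-style conditioning (reference [46]), not the authors' exact TGP implementation: all names, dimensions, and the single-stage linear projection are assumptions made for clarity.

```python
import numpy as np

# Hypothetical sketch of task-guided feature modulation (FiLM-style), not
# the paper's exact TGP block. One learnable embedding per task is projected
# to per-channel scale/shift cues that modulate decoder features; all
# convolutional weights would stay shared across tasks.

rng = np.random.default_rng(0)
TASKS = ["denoise", "decloud", "deshadow", "deblur", "despeckle"]
EMB_DIM, CHANNELS = 32, 64

# One learnable embedding per task (random stand-ins here).
task_embeddings = {t: rng.standard_normal(EMB_DIM) for t in TASKS}
# Projection from task embedding to (gamma, beta) cues for one decoder stage.
W = rng.standard_normal((2 * CHANNELS, EMB_DIM)) * 0.01

def modulate(features: np.ndarray, task: str) -> np.ndarray:
    """Scale-and-shift decoder features with the task's degradation-aware cue."""
    cue = W @ task_embeddings[task]              # shape (2C,)
    gamma, beta = cue[:CHANNELS], cue[CHANNELS:]
    # Broadcast the per-channel cue over the spatial dimensions.
    return (1.0 + gamma)[:, None, None] * features + beta[:, None, None]

decoder_features = rng.standard_normal((CHANNELS, 16, 16))
out = modulate(decoder_features, "decloud")
print(out.shape)  # (64, 16, 16)
```

In the paper's hierarchical variant, a projection of this kind would be applied at every decoder stage, so each task embedding steers all resolution levels while the backbone weights remain shared.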

If this is right

  • The same weights handle unseen composite degradations without additional training.
  • Performance exceeds that of dedicated single-task models on individual problems such as cloud removal.
  • One architecture covers restoration needs for RGB, multispectral, SAR, and thermal infrared modalities.
  • Operational pipelines can replace multiple specialized models with a single adaptive system.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The prompting approach may transfer to multi-task restoration problems outside remote sensing, such as medical or astronomical imaging.
  • Memory and deployment costs drop when a single model replaces an ensemble of task-specific networks.
  • Extending the benchmark to include additional sensor types or degradations would test whether the hierarchical modulation scales further.

Load-bearing premise

Task-specific embeddings can precisely tailor feature modulation in a shared-weight network for distinct degradations across modalities without causing interference or accuracy loss.

What would settle it

Jointly training the unified model on all five tasks and measuring whether its cloud removal performance falls below that of a model trained only on cloud removal.
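The decisive comparison reduces to scoring both models on the same cloud-removal test split with the same metric. A minimal PSNR scorer is sketched below; the "restored" arrays are random stand-ins, not outputs of either model, and the variable names are assumptions.

```python
import numpy as np

# Minimal PSNR harness for the decisive test: does the jointly trained
# unified model match a cloud-removal specialist on one shared test split?
# Images are assumed scaled to [0, 1]; the restorations below are stand-ins.

def psnr(reference: np.ndarray, restored: np.ndarray, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((reference - restored) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(1)
clean = rng.random((3, 64, 64))
unified_out = np.clip(clean + 0.01 * rng.standard_normal(clean.shape), 0, 1)
specialist_out = np.clip(clean + 0.02 * rng.standard_normal(clean.shape), 0, 1)

print(round(psnr(clean, unified_out), 1), round(psnr(clean, specialist_out), 1))
```

Averaging this score over the full cloud-removal test set for both models, under identical training protocols, is what the premise needs to survive.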

Figures

Figures reproduced from arXiv: 2604.02742 by Jinjun Wang, Wenli Huang, Xiaomeng Xin, Yang Wu, Ye Deng, Zhihong Liu.

Figure 1: Conceptual comparison of different RSIR paradigms.
Figure 2: Architecture of the proposed Task-Guided Prompting Network (TGPNet) for unified remote sensing image restoration.
Figure 3: Visual comparison of restored images for four degradation types on our URSIR benchmark. Key local details (in green boxes) …
Figure 4: Visual comparison of restored images for multispectral declouding on SEN12MS-CR and thermal deblurring on HIT-UAV …
Figure 5: Visual evaluation of TGPNet on unseen real-world imagery from the WHU-Shadow dataset [57], demonstrating …
Figure 6: Visual comparison of restoration results for out-of-distribution composite degradations: direct vs. sequential processing.
Figure 7: Visual comparison of restored images on composite degradation tasks under Gaussian noise (…)
Figure 8: Visual comparison of restored images for single-degradation declouding on RICE2. Key local details (in green boxes) …
Figure 9: Visualization of ablation study results comparing the …
Figure 11: t-SNE visualization of decoder stage 2 features before …
Original abstract

Remote sensing image restoration (RSIR) is essential for recovering high-fidelity imagery from degraded observations, enabling accurate downstream analysis. However, most existing methods focus on single degradation types within homogeneous data, restricting their practicality in real-world scenarios where multiple degradations often co-occur across diverse spectral bands or sensor modalities, creating a significant operational bottleneck. To address this fundamental gap, we propose TGPNet, a unified framework capable of handling denoising, cloud removal, shadow removal, deblurring, and SAR despeckling within a single, unified architecture. The core of our framework is a novel Task-Guided Prompting (TGP) strategy. TGP leverages learnable, task-specific embeddings to generate degradation-aware cues, which then hierarchically modulate features throughout the decoder. This task-adaptive mechanism allows the network to precisely tailor its restoration process for distinct degradation patterns while maintaining a single set of shared weights. To validate our framework, we construct a unified RSIR benchmark covering RGB, multispectral, SAR, and thermal infrared modalities for the five aforementioned restoration tasks. Experimental results demonstrate that TGPNet achieves state-of-the-art performance on both unified multi-task scenarios and unseen composite degradations, surpassing even specialized models in individual domains such as cloud removal. By successfully unifying heterogeneous degradation removal within a single adaptive framework, this work presents a significant advancement for multi-task RSIR, offering a practical and scalable solution for operational pipelines. The code and benchmark will be released at https://github.com/huangwenwenlili/TGPNet.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes TGPNet, a unified framework for remote sensing image restoration (RSIR) that handles five tasks—denoising, cloud removal, shadow removal, deblurring, and SAR despeckling—across RGB, multispectral, SAR, and thermal infrared modalities using a single shared-weight architecture. The core contribution is Task-Guided Prompting (TGP), which employs learnable task-specific embeddings to produce degradation-aware cues that hierarchically modulate features in the decoder. The authors introduce a new multi-modal RSIR benchmark and claim that TGPNet achieves state-of-the-art results on both unified multi-task settings and unseen composite degradations, outperforming even specialized single-task models in domains such as cloud removal.

Significance. If the superiority claims are substantiated with proper controls, the work would advance multi-task RSIR by demonstrating that a single adaptive network can address heterogeneous degradations and sensor modalities without task-specific retraining, offering a scalable alternative to maintaining separate models. The construction and planned release of the unified benchmark is a concrete contribution that would facilitate future research on composite degradations. The hierarchical prompting mechanism provides a reusable design pattern for task-conditioned feature modulation.

major comments (2)
  1. [Experiments] Experiments section: The headline claims that TGPNet surpasses specialized single-task models (e.g., on cloud removal) rest on comparisons whose validity depends on whether those baselines were retrained from scratch on the exact data splits, schedule, and optimization protocol of the new unified benchmark. The manuscript does not report this information, leaving open the possibility that observed gains arise from training-regime differences or implicit capacity expansion via the task embeddings rather than from the TGP modulation itself.
  2. [Method] Method section (TGP description): The central assumption that a single shared backbone modulated by task-specific embeddings can precisely tailor restoration for distinct degradation patterns across modalities without interference or negative transfer is load-bearing for the unified-framework claim, yet the paper provides no ablation or capacity-matched comparison isolating the effect of the hierarchical modulation from the benefits of joint training.
minor comments (2)
  1. [Abstract] Abstract: The statement of SOTA performance would be strengthened by naming the primary quantitative metrics (PSNR/SSIM) and briefly indicating the magnitude of improvement over the strongest baseline.
  2. [Results] Figure captions and tables: Ensure all reported results include standard deviations or confidence intervals when multiple runs are performed, and clearly label whether results are on the unified multi-task test set or on composite-degradation hold-out sets.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the major comments point by point below and will revise the manuscript to incorporate the requested clarifications and additional analyses.

Point-by-point responses
  1. Referee: [Experiments] Experiments section: The headline claims that TGPNet surpasses specialized single-task models (e.g., on cloud removal) rest on comparisons whose validity depends on whether those baselines were retrained from scratch on the exact data splits, schedule, and optimization protocol of the new unified benchmark. The manuscript does not report this information, leaving open the possibility that observed gains arise from training-regime differences or implicit capacity expansion via the task embeddings rather than from the TGP modulation itself.

    Authors: We agree that the manuscript should explicitly document the baseline training details to substantiate the comparisons. All single-task baselines were retrained from scratch on the identical data splits, using the same optimization schedule, learning rate policy, and batch size as TGPNet. We will revise the Experiments section to include a dedicated subsection detailing these protocols for every compared method, along with confirmation that no additional capacity or task-specific architectural changes were introduced beyond the original baseline designs. This will demonstrate that the reported gains arise from the TGP mechanism rather than training differences. revision: yes

  2. Referee: [Method] Method section (TGP description): The central assumption that a single shared backbone modulated by task-specific embeddings can precisely tailor restoration for distinct degradation patterns across modalities without interference or negative transfer is load-bearing for the unified-framework claim, yet the paper provides no ablation or capacity-matched comparison isolating the effect of the hierarchical modulation from the benefits of joint training.

    Authors: We acknowledge that an explicit isolation of the hierarchical modulation's contribution would strengthen the unified-framework claim. We will add two new experiments in the revised manuscript: (1) a capacity-matched ablation in which a single-task baseline is augmented with an equivalent number of parameters to the task embeddings and trained jointly, and (2) a component ablation that disables the hierarchical prompting while retaining joint training. These results will quantify the specific benefit of the TGP modulation versus joint-training effects and confirm the absence of negative transfer across modalities. revision: yes
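The rebuttal's experiment (1) hinges on careful parameter bookkeeping: the augmented baseline must add exactly as many parameters as the prompting path introduces. A sketch of that accounting, with illustrative (not the paper's) dimensions:

```python
# Sketch of the capacity-matching bookkeeping behind ablation (1): the
# augmented baseline should add exactly the parameters the prompting path
# contributes. Counts below are illustrative, not taken from the paper.

def prompt_param_count(n_tasks: int, emb_dim: int, stage_channels: list[int]) -> int:
    """Parameters added by task embeddings plus per-stage (gamma, beta) projections."""
    embeddings = n_tasks * emb_dim
    # One linear map per decoder stage: emb_dim -> 2 * channels (scale and shift).
    projections = sum(2 * c * emb_dim for c in stage_channels)
    return embeddings + projections

extra = prompt_param_count(n_tasks=5, emb_dim=32, stage_channels=[256, 128, 64])
print(extra)  # 5*32 + 2*32*(256+128+64) = 28832
```

Only if the baseline is widened by exactly this budget does a performance gap isolate the modulation mechanism rather than raw capacity.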

Circularity Check

0 steps flagged

No significant circularity in derivation chain

Full rationale

The paper introduces TGPNet, a new neural architecture using learnable task-specific embeddings for hierarchical feature modulation in unified RSIR. All claims rest on experimental validation against a constructed multi-modal benchmark rather than any closed-form derivations, predictions, or self-referential definitions. No equations are presented that reduce performance metrics to fitted inputs by construction, and no load-bearing self-citations or uniqueness theorems are invoked. The framework is self-contained with independent empirical support.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The approach rests on standard assumptions of deep convolutional networks being adaptable via conditioning signals, plus the new invented prompting mechanism; no free parameters beyond the learnable embeddings are specified.

free parameters (1)
  • task-specific embeddings
    Learnable embeddings per task that are trained to produce degradation-aware cues modulating decoder features.
axioms (1)
  • domain assumption: Hierarchical feature modulation by task embeddings can adapt a shared network to multiple distinct degradation types without cross-task interference.
    Invoked in the design of the TGP strategy to enable unified processing.
invented entities (1)
  • Task-Guided Prompting (TGP) (no independent evidence)
    purpose: Generate degradation-aware cues from learnable task-specific embeddings to modulate features throughout the decoder.
    Core novel component introduced to unify the five restoration tasks.

pith-pipeline@v0.9.0 · 5587 in / 1332 out tokens · 31115 ms · 2026-05-13T18:29:55.649143+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

61 extracted references · 61 canonical work pages · 2 internal anchors

  1. [1]

    Landslide detection, monitoring and prediction with remote-sensing techniques,

    N. Casagli, E. Intrieri, V . Tofani, G. Gigli, and F. Raspini, “Landslide detection, monitoring and prediction with remote-sensing techniques,” Nature Reviews Earth & Environment, vol. 4, no. 1, pp. 51–64, 2023

  2. [2]

Satellite remote sensing for water resources management: Potential for supporting sustainable development in data-poor regions,

    J. Sheffield, E. F. Wood, M. Pan, H. Beck, G. Coccia, A. Serrat-Capdevila, and K. Verbist, “Satellite remote sensing for water resources management: Potential for supporting sustainable development in data-poor regions,” Water Resources Research, vol. 54, no. 12, pp. 9724–9758, 2018

  3. [3]

    Statistical machine learning methods and remote sensing for sustainable development goals: A review,

    J. Holloway and K. Mengersen, “Statistical machine learning methods and remote sensing for sustainable development goals: A review,” Remote Sensing, vol. 10, no. 9, p. 1365, 2018

  4. [4]

    Multiscale and direction target detecting in remote sensing images via modified yolo-v4,

    Z. Zakria, J. Deng, R. Kumar, M. S. Khokhar, J. Cai, and J. Kumar, “Multiscale and direction target detecting in remote sensing images via modified yolo-v4,”IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 15, pp. 1039–1048, 2022

  5. [5]

    Remote sensing image segmentation advances: A meta-analysis,

    I. Kotaridis and M. Lazaridou, “Remote sensing image segmentation advances: A meta-analysis,”ISPRS Journal of Photogrammetry and Remote Sensing, vol. 173, pp. 309–322, 2021

  6. [6]

    Rsid-cr: Remote sensing image denoising based on contrastive learning,

    Z. Wang, X. He, B. Xiao, L. Chen, and X. Bi, “Rsid-cr: Remote sensing image denoising based on contrastive learning,”IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2024

  7. [7]

Cr-former: Single image cloud removal with focused taylor attention,

    Y. Wu, Y. Deng, S. Zhou, Y. Liu, W. Huang, and J. Wang, “Cr-former: Single image cloud removal with focused taylor attention,” IEEE Transactions on Geoscience and Remote Sensing, 2024

  8. [8]

    Cascaded memory network for optical remote sensing imagery cloud removal,

    J. Liu, B. Pan, and Z. Shi, “Cascaded memory network for optical remote sensing imagery cloud removal,”IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1–11, 2024

  9. [9]

    Shadowformer: Global context helps shadow removal,

    L. Guo, S. Huang, D. Liu, H. Cheng, and B. Wen, “Shadowformer: Global context helps shadow removal,” inProceedings of the AAAI conference on artificial intelligence, vol. 37, no. 1, 2023, pp. 710–718

  10. [10]

    Homoformer: Homogenized transformer for image shadow removal,

J. Xiao, X. Fu, Y. Zhu, D. Li, J. Huang, K. Zhu, and Z.-J. Zha, “Homoformer: Homogenized transformer for image shadow removal,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2024, pp. 25617–25626

  11. [11]

    Sar image despeckling using continuous attention module,

    J. Ko and S. Lee, “Sar image despeckling using continuous attention module,”IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 15, pp. 3–19, 2021

  12. [12]

Contrastive learning for real sar image despeckling,

    Y. Fang, R. Liu, Y. Peng, J. Guan, D. Li, and X. Tian, “Contrastive learning for real sar image despeckling,” ISPRS Journal of Photogrammetry and Remote Sensing, vol. 218, pp. 376–391, 2024

  13. [13]

    All-in-one image restoration for unknown corruption,

B. Li, X. Liu, P. Hu, Z. Wu, J. Lv, and X. Peng, “All-in-one image restoration for unknown corruption,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2022, pp. 17452–17462

  14. [14]

    Promptir: Prompting for all-in-one blind image restoration,

V. Potlapalli, S. Zamir, S. Khan, and F. Khan, “Promptir: Prompting for all-in-one blind image restoration,” arXiv preprint arXiv:2306.13090

  15. [15]

    Adair: Adaptive all-in-one image restoration via frequency mining and modulation,

Y. Cui, S. W. Zamir, S. Khan, A. Knoll, M. Shah, and F. S. Khan, “Adair: Adaptive all-in-one image restoration via frequency mining and modulation,” in The Thirteenth International Conference on Learning Representations

  16. [16]

    Image restoration for remote sensing: Overview and toolbox,

    B. Rasti, Y . Chang, E. Dalsasso, L. Denis, and P. Ghamisi, “Image restoration for remote sensing: Overview and toolbox,”IEEE Geoscience and Remote Sensing Magazine, vol. 10, no. 2, pp. 201–230, 2021

  17. [17]

    Coupling model-and data-driven methods for remote sensing image restoration and fusion: Improving physical interpretability,

    H. Shen, M. Jiang, J. Li, C. Zhou, Q. Yuan, and L. Zhang, “Coupling model-and data-driven methods for remote sensing image restoration and fusion: Improving physical interpretability,”IEEE Geoscience and Remote Sensing Magazine, vol. 10, no. 2, pp. 231–249, 2022

  18. [18]

    Deep memory connected neural network for optical remote sensing image restoration,

    W. Xu, G. Xu, Y . Wang, X. Sun, D. Lin, and Y . Wu, “Deep memory connected neural network for optical remote sensing image restoration,” Remote Sensing, vol. 10, no. 12, p. 1893, 2018

  19. [19]

    Hybrid convolutional and attention network for hyperspectral image denoising,

    S. Hu, F. Gao, X. Zhou, J. Dong, and Q. Du, “Hybrid convolutional and attention network for hyperspectral image denoising,”IEEE Geoscience and Remote Sensing Letters, vol. 21, pp. 1–5, 2024

  20. [20]

    Mb-taylorformer v2: improved multi-branch linear transformer expanded by taylor formula for image restoration,

    Z. Jin, Y . Qiu, K. Zhang, H. Li, and W. Luo, “Mb-taylorformer v2: improved multi-branch linear transformer expanded by taylor formula for image restoration,”IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025

  21. [21]

    Deep dense multi-scale network for snow removal using semantic and depth priors,

    K. Zhang, R. Li, Y . Yu, W. Luo, and C. Li, “Deep dense multi-scale network for snow removal using semantic and depth priors,”IEEE Transactions on Image Processing, vol. 30, pp. 7419–7431, 2021

  22. [22]

    Enhanced spatio- temporal interaction learning for video deraining: faster and better,

    K. Zhang, D. Li, W. Luo, W. Ren, and W. Liu, “Enhanced spatio- temporal interaction learning for video deraining: faster and better,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 1, pp. 1287–1293, 2022

  23. [23]

    Adversarial spatio-temporal learning for video deblurring,

    K. Zhang, W. Luo, Y . Zhong, L. Ma, W. Liu, and H. Li, “Adversarial spatio-temporal learning for video deblurring,”IEEE Transactions on Image Processing, vol. 28, no. 1, pp. 291–301, 2018

  24. [24]

    Lldiffusion: Learning degradation representations in diffusion models for low-light image enhancement,

    T. Wang, K. Zhang, Y . Zhang, W. Luo, B. Stenger, T. Lu, T.-K. Kim, and W. Liu, “Lldiffusion: Learning degradation representations in diffusion models for low-light image enhancement,”Pattern Recognition, vol. 166, p. 111628, 2025

  25. [25]

    Despecknet: Generalizing deep learning-based sar image despeckling,

    A. G. Mullissa, D. Marcos, D. Tuia, M. Herold, and J. Reiche, “Despecknet: Generalizing deep learning-based sar image despeckling,” IEEE Transactions on Geoscience and Remote Sensing, vol. 60, pp. 1– 15, 2020

  26. [26]

    Hir-diff: Unsupervised hyperspectral image restoration via improved diffusion models,

    L. Pang, X. Rui, L. Cui, H. Wang, D. Meng, and X. Cao, “Hir-diff: Unsupervised hyperspectral image restoration via improved diffusion models,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 3005–3014

  27. [27]

    A progressive image restoration network for high-order degradation imaging in remote sensing,

    Y . Feng, Y . Yang, X. Fan, Z. Zhang, L. Bu, and J. Zhang, “A progressive image restoration network for high-order degradation imaging in remote sensing,”arXiv preprint arXiv:2412.07195, 2024

  28. [28]

    Prompthsi: Universal hyperspectral image restoration framework for composite degradation,

C.-M. Lee, C.-H. Cheng, Y.-F. Lin, Y.-C. Cheng, W.-T. Liao, C.-C. Hsu, F.-E. Yang, and Y.-C. F. Wang, “Prompthsi: Universal hyperspectral image restoration framework for composite degradation,” arXiv e-prints, 2024

  29. [29]

    A survey on all-in- one image restoration: Taxonomy, evaluation and future trends,

    J. Jiang, Z. Zuo, G. Wu, K. Jiang, and X. Liu, “A survey on all-in- one image restoration: Taxonomy, evaluation and future trends,”arXiv preprint arXiv:2410.15067, 2024

  30. [30]

    Pre-trained image processing transformer,

H. Chen, Y. Wang, T. Guo, C. Xu, Y. Deng, Z. Liu, S. Ma, C. Xu, C. Xu, and W. Gao, “Pre-trained image processing transformer,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2021, pp. 12299–12310

  31. [31]

    Lora-ir: taming low-rank experts for efficient all-in-one image restoration,

    Y . Ai, H. Huang, and R. He, “Lora-ir: taming low-rank experts for efficient all-in-one image restoration,”arXiv preprint arXiv:2410.15385, 2024

  32. [32]

    Complexity experts are task-discriminative learners for any image restoration,

    E. Zamfir, Z. Wu, N. Mehta, Y . Tan, D. P. Paudel, Y . Zhang, and R. Timofte, “Complexity experts are task-discriminative learners for any image restoration,” inProceedings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 12 753–12 763

  33. [33]

    Onerestore: A universal restoration framework for composite degradation,

    Y . Guo, Y . Gao, Y . Lu, H. Zhu, R. W. Liu, and S. He, “Onerestore: A universal restoration framework for composite degradation,” inEuropean conference on computer vision. Springer, 2024, pp. 255–272

  34. [34]

    Allrestorer: All-in-one transformer for image restoration under composite degradations,

    J. Mao, Y . Yang, X. Yin, L. Shao, and H. Tang, “Allrestorer: All-in-one transformer for image restoration under composite degradations,”arXiv preprint arXiv:2411.10708, 2024

  35. [35]

    Restoring vision in adverse weather conditions with patch-based denoising diffusion models,

O. Ozdenizci and R. Legenstein, “Restoring vision in adverse weather conditions with patch-based denoising diffusion models,” IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 45, no. 08, pp. 10346–10357, 2023

  36. [36]

    Multimodal prompt perceiver: Empower adaptiveness generalizability and fidelity for all-in- one image restoration,

    Y . Ai, H. Huang, X. Zhou, J. Wang, and R. He, “Multimodal prompt perceiver: Empower adaptiveness generalizability and fidelity for all-in- one image restoration,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 25 432–25 444

  37. [37]

    Autodir: Automatic all-in-one image restoration with latent diffusion,

    Y . Jiang, Z. Zhang, T. Xue, and J. Gu, “Autodir: Automatic all-in-one image restoration with latent diffusion,” inEuropean Conference on Computer Vision. Springer, 2024, pp. 340–359

  38. [38]

    Unirestore: Unified perceptual and task-oriented image restoration model using diffusion prior,

I. Chen, W.-T. Chen, Y.-W. Liu, Y.-C. Chiang, S.-Y. Kuo, M.-H. Yang et al., “Unirestore: Unified perceptual and task-oriented image restoration model using diffusion prior,” in Proceedings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 17969–17979

  39. [39]

    Unicorn: Latent diffusion-based unified controllable image restoration network across multiple degradations,

    D. Mandal, S. Chattopadhyay, G. Tong, and P. Chakravarthula, “Unicorn: Latent diffusion-based unified controllable image restoration network across multiple degradations,”arXiv preprint arXiv:2503.15868, 2025

  40. [40]

    Q-Agent: Quality-Driven Chain-of-Thought Image Restoration Agent through Robust Multimodal Large Language Model

    Y . Zhou, J. Cao, Z. Zhang, F. Wen, Y . Jiang, J. Jia, X. Liu, X. Min, and G. Zhai, “Q-agent: Quality-driven chain-of-thought image restoration agent through robust multimodal large language model,”arXiv preprint arXiv:2504.07148, 2025

  41. [41]

    Vision-language gradient descent-driven all-in-one deep unfolding networks,

    H. Zeng, X. Wang, Y . Chen, J. Su, and J. Liu, “Vision-language gradient descent-driven all-in-one deep unfolding networks,” inProceedings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 7524–7533

  42. [42]

    Instructir: High-quality image restoration following human instructions,

    M. V . Conde, G. Geigle, and R. Timofte, “Instructir: High-quality image restoration following human instructions,” inEuropean Conference on Computer Vision. Springer, 2024, pp. 1–21

  43. [43]

    Spire: Semantic prompt-driven image restoration,

C. Qi, Z. Tu, K. Ye, M. Delbracio, P. Milanfar, Q. Chen, and H. Talebi, “Spire: Semantic prompt-driven image restoration,” in European Conference on Computer Vision. Springer, 2024, pp. 446–464

  44. [44]

    Multi-axis prompt and multi-dimension fusion network for all-in-one weather-degraded image restoration,

    Y . Wen, T. Gao, J. Zhang, Z. Li, and T. Chen, “Multi-axis prompt and multi-dimension fusion network for all-in-one weather-degraded image restoration,” inProceedings of the AAAI Conference on Artificial Intelligence, vol. 39, no. 8, 2025, pp. 8323–8331

  45. [45]

    Restormer: Efficient transformer for high-resolution image restoration,

    S. W. Zamir, A. Arora, S. Khan, M. Hayat, F. S. Khan, and M.-H. Yang, “Restormer: Efficient transformer for high-resolution image restoration,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2022, pp. 5728–5739

  46. [46]

    Film: Visual reasoning with a general conditioning layer,

    E. Perez, F. Strub, H. De Vries, V . Dumoulin, and A. Courville, “Film: Visual reasoning with a general conditioning layer,” inProceedings of the AAAI conference on artificial intelligence, vol. 32, no. 1, 2018

  47. [47]

    Loss functions for image restoration with neural networks,

    H. Zhao, O. Gallo, I. Frosio, and J. Kautz, “Loss functions for image restoration with neural networks,”IEEE Transactions on computational imaging, vol. 3, no. 1, pp. 47–57, 2016

  48. [48]

    Bag-of-visual-words and spatial extensions for land-use classification,

Y. Yang and S. Newsam, “Bag-of-visual-words and spatial extensions for land-use classification,” in Proceedings of the 18th SIGSPATIAL international conference on advances in geographic information systems, 2010, pp. 270–279

  49. [49]

    A remote sensing image dataset for cloud removal,

    D. Lin, G. Xu, X. Wang, Y . Wang, X. Sun, and K. Fu, “A remote sensing image dataset for cloud removal,”arXiv preprint arXiv:1901.00600, 2019

  50. [50]

    Cloud removal in sentinel-2 imagery using a deep residual neural network and sar-optical data fusion,

    A. Meraner, P. Ebel, X. X. Zhu, and M. Schmitt, “Cloud removal in sentinel-2 imagery using a deep residual neural network and sar-optical data fusion,”ISPRS Journal of Photogrammetry and Remote Sensing, vol. 166, pp. 333–346, 2020

  51. [51]

Deshadownet: A multi-context embedding deep network for shadow removal,

    L. Qu, J. Tian, S. He, Y. Tang, and R. W. Lau, “Deshadownet: A multi-context embedding deep network for shadow removal,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 4067–4075

  52. [52]

    Robust sar image despeckling by deep learning from near-real datasets,

    J. Guan, R. Liu, X. Tian, X. Tang, and S. Li, “Robust sar image despeckling by deep learning from near-real datasets,”IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 17, pp. 2963–2979, 2023

  53. [53]

    Hit-uav: A high-altitude infrared thermal dataset for unmanned aerial vehicle-based object detection,

    J. Suo, T. Wang, X. Zhang, H. Chen, W. Zhou, and W. Shi, “Hit-uav: A high-altitude infrared thermal dataset for unmanned aerial vehicle-based object detection,”Scientific Data, vol. 10, no. 1, p. 227, 2023

  54. [54]

    Decoupled Weight Decay Regularization

    I. Loshchilov and F. Hutter, “Decoupled weight decay regularization,” arXiv preprint arXiv:1711.05101, 2017

  55. [55]

    Attentive contextual attention for cloud removal,

    W. Huang, Y . Deng, Y . Wu, and J. Wang, “Attentive contextual attention for cloud removal,”IEEE Transactions on Geoscience and Remote Sensing, 2024

  56. [56]

    Harmony in diversity: Improving all-in-one image restoration via multi-task collaboration,

    G. Wu, J. Jiang, K. Jiang, and X. Liu, “Harmony in diversity: Improving all-in-one image restoration via multi-task collaboration,” inProceedings of the 32nd ACM international conference on multimedia, 2024, pp. 6015–6023

  57. [57]

    Deeply supervised convolutional neural network for shadow detection based on a novel aerial shadow imagery dataset,

    S. Luo, H. Li, and H. Shen, “Deeply supervised convolutional neural network for shadow detection based on a novel aerial shadow imagery dataset,”ISPRS Journal of Photogrammetry and remote sensing, vol. 167, pp. 443–457, 2020

  58. [58]

    Cloud removal for remote sensing imagery via spatial attention generative adversarial network,

    H. Pan, “Cloud removal for remote sensing imagery via spatial attention generative adversarial network,”arXiv preprint arXiv:2009.13015, 2020

  59. [59]

    Uncertainty-based thin cloud removal network via conditional variational autoencoders,

    H. Ding, Y . Zi, and F. Xie, “Uncertainty-based thin cloud removal network via conditional variational autoencoders,” inProceedings of the Asian Conference on Computer Vision, 2022, pp. 469–485

  60. [60]

    Recovering realistic texture in image super-resolution by deep spatial feature transform,

X. Wang, K. Yu, C. Dong, and C. C. Loy, “Recovering realistic texture in image super-resolution by deep spatial feature transform,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 606–615
