pith. machine review for the scientific record.

arxiv: 2603.27979 · v2 · submitted 2026-03-30 · 💻 cs.CV

Recognition: 2 Lean theorem links

RetinexDualV2: Physically-Grounded Dual Retinex for Generalized UHD Image Restoration

Jun Chen, Mohab Kishawy


Pith reviewed 2026-05-14 22:21 UTC · model grok-4.3

classification 💻 cs.CV
keywords UHD image restoration · Retinex decomposition · physical priors · dual-branch network · degradation removal · attention mechanism · generalized restoration

The pith

A single dual Retinex network restores UHD images across rain, low-light and noise degradations by conditioning on extracted physical priors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a framework that extracts degradation-specific physical cues such as rain masks and dark channels from an input image and uses them to steer a dual-branch Retinex decomposition. This conditioning occurs through a specialized attention mechanism inside the network, so the same model structure can separate reflection and illumination layers for several different problems without redesign. A reader would care because most current restoration systems demand a separate trained model for each degradation type, which becomes impractical when real UHD photos contain mixed issues like rain combined with poor lighting. If the approach holds, deployment in cameras or editing tools could become simpler and more efficient while still delivering competitive results on benchmark challenges.

Core claim

RetinexDualV2 is a unified, physically grounded dual-branch framework for diverse Ultra-High-Definition (UHD) image restoration. Unlike generic models, the method employs a Task-Specific Physical Grounding Module (TS-PGM) to extract degradation-aware priors (e.g., rain masks and dark channels). These priors explicitly guide a Retinex decomposition network via a novel Physical-Conditioned Multi-head Self-Attention (PC-MSA) mechanism, enabling robust reflection and illumination correction. This physical conditioning allows a single architecture to handle various complex degradations seamlessly, without task-specific structural modifications.
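Of the two priors named above, the dark channel is directly computable from the input image. As a concrete anchor, here is a minimal sketch of dark-channel extraction in the style of the classic dehazing prior (per-pixel channel minimum followed by a local minimum filter). This is an illustration of what such a prior looks like, not the paper's TS-PGM, whose exact extraction procedure is unspecified; the function name and `patch` parameter are ours.

```python
import numpy as np

def dark_channel(img, patch=15):
    """Dark channel of an RGB image: the per-pixel minimum over the
    three color channels, followed by a minimum filter over a
    patch x patch neighborhood. Low values indicate haze-free,
    well-exposed regions; degradations push the channel upward."""
    # img: H x W x 3 float array, values in [0, 1]
    min_rgb = img.min(axis=2)                      # H x W channel minimum
    pad = patch // 2
    padded = np.pad(min_rgb, pad, mode="edge")     # replicate borders
    h, w = min_rgb.shape
    out = np.empty_like(min_rgb)
    for i in range(h):
        for j in range(w):
            # local minimum over the window centered at (i, j)
            out[i, j] = padded[i:i + patch, j:j + patch].min()
    return out
```

A dark pixel in any channel pulls the whole neighborhood's dark-channel value down, which is what makes the map a cheap, dense degradation cue.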

What carries the argument

Task-Specific Physical Grounding Module that supplies degradation-aware priors to condition Physical-Conditioned Multi-head Self-Attention inside the dual Retinex branches for reflection and illumination separation.
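The paper publishes no equations for PC-MSA, so the exact conditioning mechanism is unknown (the referee report below flags this). Purely as an illustrative guess at one plausible form, a per-token prior score (e.g., a flattened rain mask or dark channel) could bias the attention logits additively; everything in this sketch, including the pairwise-outer-product bias and the `alpha` weight, is our assumption.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # stabilized
    return e / e.sum(axis=axis, keepdims=True)

def prior_conditioned_attention(x, prior, w_q, w_k, w_v, alpha=1.0):
    """Single-head self-attention over N tokens (x: N x d) whose
    logits are biased by a per-token physical prior (prior: N,).
    One simple choice: an additive pairwise bias so tokens the prior
    flags as degraded attend to each other more (or less) strongly."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d)
    logits = logits + alpha * np.outer(prior, prior)  # prior conditioning
    return softmax(logits, axis=-1) @ v
```

With `alpha = 0` this reduces to plain self-attention, which is exactly the ablation the referee asks for: compare the conditioned and unconditioned variants to isolate what the prior contributes.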

If this is right

  • One trained architecture can address raindrop removal and joint noise low-light enhancement without any structural changes between tasks.
  • Explicit physical conditioning improves both restoration quality and computational efficiency at ultra-high resolutions.
  • The same model generalizes to day and night conditions through the shared grounding and attention pathway.
  • Reflection and illumination layers are corrected more reliably when guided by measurable image-formation cues rather than learned features alone.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Practical camera pipelines could adopt the model for on-device correction of mixed degradations without storing multiple specialized networks.
  • Extending the grounding module to extract priors for blur or compression artifacts might allow further task unification.
  • The explicit tie to physical image formation suggests restoration outputs could become more interpretable for downstream analysis.

Load-bearing premise

The physical priors extracted by the grounding module remain accurate and sufficiently informative to guide Retinex decomposition for any new degradation without requiring changes to the network architecture.

What would settle it

Apply the trained model to UHD images containing a previously unseen degradation such as combined haze and motion blur; if it underperforms dedicated task-specific methods or needs architectural adjustments, the claim does not hold.

Figures

Figures reproduced from arXiv: 2603.27979 by Jun Chen, Mohab Kishawy.

Figure 1
Figure 1: Visual results of our proposed method on multiple image restoration tasks for NTIRE 2026 test sets, demonstrating its generalizability. view at source ↗
Figure 2
Figure 2: Inputs and corresponding priors for each task. view at source ↗
Figure 3
Figure 3: RetinexDualV2 overview. Based on Retinex theory, it decomposes the UHD image into reflection and illumination components. view at source ↗
Figure 4
Figure 4: Visual comparison between different architectures on UHD-LL. view at source ↗
Figure 5
Figure 5: Visual comparison between different architectures on the 4K-Rain13K dataset. view at source ↗
Figure 6
Figure 6: Visual results of our proposed method on multiple image restoration tasks from NTIRE 2026 test datasets. view at source ↗
read the original abstract

We propose RetinexDualV2, a unified, physically grounded dual-branch framework for diverse Ultra-High-Definition (UHD) image restoration. Unlike generic models, our method employs a Task-Specific Physical Grounding Module (TS-PGM) to extract degradation-aware priors (e.g., rain masks and dark channels). These explicitly guide a Retinex decomposition network via a novel Physical-Conditioned Multi-head Self-Attention (PC-MSA) mechanism, enabling robust reflection and illumination correction. This physical conditioning allows a single architecture to handle various complex degradations seamlessly, without task-specific structural modifications. RetinexDualV2 demonstrates exceptional generalizability, securing 4th place in the NTIRE 2026 Day and Night Raindrop Removal Challenge and 5th place in the Joint Noise Low-light Enhancement (JNLLIE) Challenge. Extensive experiments confirm the state-of-the-art performance and efficiency of our physically motivated approach. Code is available at https://github.com/ErrorLogic1211/RetinexDual/tree/master/RetinexDualV2

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes RetinexDualV2, a unified dual-branch Retinex framework for generalized UHD image restoration. A Task-Specific Physical Grounding Module (TS-PGM) extracts degradation-aware priors (rain masks, dark channels) that condition Retinex decomposition through a novel Physical-Conditioned Multi-head Self-Attention (PC-MSA) mechanism. This enables a single architecture to address multiple degradations without task-specific structural changes. The method reports 4th place in the NTIRE 2026 Day and Night Raindrop Removal Challenge and 5th place in the Joint Noise Low-light Enhancement (JNLLIE) Challenge, with claims of SOTA performance and efficiency; code is released.

Significance. If the physical priors reliably guide decomposition, the work could advance unified UHD restoration models that avoid per-task redesigns, offering efficiency gains for real-world applications. The challenge rankings provide evidence of competitiveness, and open-sourcing the code supports reproducibility and further validation.

major comments (2)
  1. [Experimental results] Experimental results section: The central claim that TS-PGM priors enable unified generalizability without architectural changes is load-bearing, yet no quantitative validation of prior accuracy (e.g., rain-mask IoU against ground truth) or ablation isolating PC-MSA conditioning from the dual-branch backbone is reported. Challenge rankings alone do not establish that performance stems from physical guidance rather than model capacity.
  2. [Methods] Methods section, TS-PGM and PC-MSA description: The paper states that priors are extracted from the input and injected via PC-MSA, but provides no equations or diagrams detailing the exact conditioning mechanism (e.g., how rain masks modulate attention weights). This omission prevents verification that the priors remain effective across degradations without task-specific tuning.
minor comments (2)
  1. [Abstract] Abstract: The summary of 'extensive experiments' and 'SOTA performance' would be strengthened by including at least one key quantitative metric (PSNR/SSIM) or reference to an ablation table.
  2. [Figures] Figure captions: Ensure all figures showing prior extraction (e.g., rain masks, dark channels) include scale bars or explicit comparison to input images for clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address each major comment below and will revise the manuscript accordingly to improve clarity and strengthen the evidence for our claims.

read point-by-point responses
  1. Referee: Experimental results section: The central claim that TS-PGM priors enable unified generalizability without architectural changes is load-bearing, yet no quantitative validation of prior accuracy (e.g., rain-mask IoU against ground truth) or ablation isolating PC-MSA conditioning from the dual-branch backbone is reported. Challenge rankings alone do not establish that performance stems from physical guidance rather than model capacity.

    Authors: We agree that the current experimental section relies primarily on challenge rankings and overall performance metrics, which do not fully isolate the contribution of the physical priors. In the revised manuscript, we will add (1) quantitative evaluation of prior accuracy (e.g., rain-mask IoU on available synthetic data) and (2) a controlled ablation study comparing the full model against a variant without PC-MSA conditioning. These additions will provide direct evidence that performance gains stem from the physical guidance rather than model capacity alone. We note that ground-truth priors are not always available for real-world data, but synthetic benchmarks will be used where possible. revision: yes

  2. Referee: Methods section, TS-PGM and PC-MSA description: The paper states that priors are extracted from the input and injected via PC-MSA, but provides no equations or diagrams detailing the exact conditioning mechanism (e.g., how rain masks modulate attention weights). This omission prevents verification that the priors remain effective across degradations without task-specific tuning.

    Authors: We acknowledge the lack of explicit mathematical formulation and visual details for the conditioning process in PC-MSA. In the revision, we will include the precise equations describing how degradation-aware priors (from TS-PGM) modulate the attention weights within PC-MSA, as well as an expanded diagram of the module. This will make the mechanism verifiable and demonstrate its applicability across different degradations without architectural changes. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

full rationale

The paper grounds its dual-branch Retinex framework in established Retinex theory and extracts standard physical priors (rain masks, dark channels) directly from the input image via the TS-PGM module. These priors condition the PC-MSA mechanism as an explicit design choice rather than a fitted or self-defined quantity. No equations reduce claimed performance, generalizability, or challenge rankings to parameters defined only inside the paper, and no self-citation chain is invoked as a uniqueness theorem or load-bearing premise. The architecture is presented as a physically motivated extension without tautological reduction to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Only the abstract is available, so the ledger is limited to the core modeling assumptions stated there; no free parameters or invented entities are explicitly quantified.

axioms (1)
  • domain assumption: Retinex theory provides a valid decomposition of degraded images into reflection and illumination components that can be corrected independently once guided by physical priors.
    The entire framework is built on applying Retinex decomposition to diverse degradations after conditioning with extracted physical signals.
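Stated compactly, the classical Retinex assumption factors an observed image into a reflectance term and an illumination term (a textbook formulation, not notation from the paper itself):

```latex
I(x) \;=\; R(x)\cdot L(x), \qquad 0 \le R(x) \le 1,\quad L(x) \ge 0,
```

where $R$ carries scene content and $L$ carries lighting. Restoration under this axiom amounts to estimating $R$ and $L$ separately, correcting each, and recombining them, which is what the dual branches of RetinexDualV2 are organized around.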

pith-pipeline@v0.9.0 · 5482 in / 1385 out tokens · 38727 ms · 2026-05-14T22:21:37.164721+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Reference graph

Works this paper leans on

38 extracted references · 38 canonical work pages · 1 internal anchor

  1. [1]

    Ntire 2026 night-time image dehazing challenge report

    Radu P. Ancuti, Alexandru Brateanu, Raul Balmez, Ciprian Orhei, Florin-Alexandru Vasluianu, Codruta O. Ancuti, Radu Timofte, Cosmin Ancuti, et al. Ntire 2026 night-time image dehazing challenge report. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026.

  2. [2]

    Retinexmamba: Retinex-based mamba for low-light image enhancement

    Jiesong Bai, Yuhao Yin, Qiyuan He, Yuanxian Li, and Xiaofeng Zhang. Retinexmamba: Retinex-based mamba for low-light image enhancement. In Neural Information Processing, pages 427–442, Singapore, 2025. Springer Nature Singapore.

  3. [3]

    Retinexformer: One-stage retinex-based transformer for low-light image enhancement

    Yuanhao Cai, Hao Bian, Jing Lin, Haoqian Wang, Radu Timofte, and Yulun Zhang. Retinexformer: One-stage retinex-based transformer for low-light image enhancement. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 12504–12513, 2023.

  4. [4]

    Drsformer: Learning a dense-residual spatial transformer for image restoration

    Hongming Chen, Xiang Chen, and Xianping Fu. Drsformer: Learning a dense-residual spatial transformer for image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023.

  5. [5]

    Udr-s2former: Sparse spectral-spatial transformer for ultra-high-definition deraining

    Hongming Chen, Xiang Chen, and Xianping Fu. Udr-s2former: Sparse spectral-spatial transformer for ultra-high-definition deraining. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023.

  6. [6]

    Towards ultra-high-definition image deraining: A benchmark and an efficient method

    Hongming Chen, Xiang Chen, Chen Wu, Zhuoran Zheng, Jinshan Pan, and Xianping Fu. Towards ultra-high-definition image deraining: A benchmark and an efficient method. arXiv preprint arXiv:2405.17074, 2024.

  7. [7]

    Dehazenerf: Multi-image haze removal and 3d shape reconstruction using neural radiance fields

    Wei-Ting Chen, Wang Yifan, Sy-Yen Kuo, and Gordon Wetzstein. Dehazenerf: Multi-image haze removal and 3d shape reconstruction using neural radiance fields. In 2024 International Conference on 3D Vision (3DV), pages 247–256, 2024.

  8. [8]

    Low Light Image Enhancement Challenge at NTIRE 2026

    George Ciubotariu, Sharif S M A, Abdur Rehman, Fayaz Ali, Rizwan Ali Naqvi, Marcos Conde, Radu Timofte, et al. Low Light Image Enhancement Challenge at NTIRE 2026. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026.

  9. [9]

    Towards scale-aware low-light enhancement via structure-guided transformer design

    W. Dong, Y. Min, H. Zhou, and J. Chen. Towards scale-aware low-light enhancement via structure-guided transformer design. In Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), pages 1459–1468, Nashville, TN, USA, 2025.

  10. [10]

    A physics-informed low-rank deep neural network for blind and universal lens aberration correction

    Jin Gong, Runzhao Yang, Weihang Zhang, Jinli Suo, and Qionghai Dai. A physics-informed low-rank deep neural network for blind and universal lens aberration correction. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 24861–24870, 2024.

  11. [11]

    Eretinex: Event camera meets retinex theory for low-light image enhancement

    Xuejian Guo, Zhiqiang Tian, Yuehang Wang, Siqi Li, Yu Jiang, Shaoyi Du, and Yue Gao. Eretinex: Event camera meets retinex theory for low-light image enhancement. arXiv preprint arXiv:2503.02484, 2025.

  12. [12]

    Diffll: Diffusion models for low-light image enhancement

    Yuchen Jiang, Cheng Li, and John Smith. Diffll: Diffusion models for low-light image enhancement. ACM Transactions on Graphics (TOG), 2023.

  13. [13]

    Retinexdual: Retinex-based dual nature approach for generalized ultra-high-definition image restoration

    Mohab Kishawy, Ali A. Hussein, and Jun Chen. Retinexdual: Retinex-based dual nature approach for generalized ultra-high-definition image restoration, 2025.

  14. [14]

    Imagenet classification with deep convolutional neural networks

    Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems. Curran Associates, Inc., 2012.

  15. [15]

    Fast and accurate image super-resolution with deep Laplacian pyramid networks

    Wei-Sheng Lai, Jia-Bin Huang, Narendra Ahuja, and Ming-Hsuan Yang. Fast and accurate image super-resolution with deep laplacian pyramid networks. CoRR, abs/1710.01992, 2017.

  16. [16]

    Embedding fourier for ultra-high-definition low-light image enhancement

    Chongyi Li, Chun-Le Guo, Man Zhou, Zhexin Liang, Shangchen Zhou, Ruicheng Feng, and Chen Change Loy. Embedding fourier for ultra-high-definition low-light image enhancement. In ICLR, 2023.

  17. [17]

    NTIRE 2026 Challenge on Short-form UGC Video Restoration in the Wild with Generative Models: Datasets, Methods and Results

    Xin Li, Jiachao Gong, Xijun Wang, Shiyao Xiong, Bingchen Li, Suhang Yao, Chao Zhou, Zhibo Chen, Radu Timofte, et al. NTIRE 2026 Challenge on Short-form UGC Video Restoration in the Wild with Generative Models: Datasets, Methods and Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026.

  18. [18]

    Retinexformer+: Retinex-based dual-channel transformer for low-light image enhancement

    Song Liu, Hongying Zhang, Xue Li, and Xi Yang. Retinexformer+: Retinex-based dual-channel transformer for low-light image enhancement. Computers, Materials and Continua, 82(2):1969–1984, 2025.

  19. [19]

    Decoupled weight decay regularization

    Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization, 2019.

  20. [20]

    Fourier priors-guided diffusion for zero-shot joint low-light enhancement and deblurring

    Xiaoqian Lv, Shengping Zhang, Chenyang Wang, Yichen Zheng, Bineng Zhong, Chongyi Li, and Liqiang Nie. Fourier priors-guided diffusion for zero-shot joint low-light enhancement and deblurring. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 25378–25388, 2024.

  21. [21]

    Ntire 2026 single image shadow removal challenge report

    Florin-Alexandru Vasluianu, Tim Seizinger, Zhuyun Zhou, Zongwei WU, Radu Timofte, et al. Ntire 2026 single image shadow removal challenge report. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026.

  22. [22]

    Correlation matching transformation transformers for uhd image restoration

    Cong Wang, Jinshan Pan, Wei Wang, Gang Fu, Siyuan Liang, Mengzhu Wang, Xiao-Ming Wu, and Jun Liu. Correlation matching transformation transformers for uhd image restoration. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 5336–5344, 2024.

  23. [23]

    Uformer: A u-shaped transformer for image restoration

    Shikun Wang, Peng Zhang, and Hao Li. Uformer: A u-shaped transformer for image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022.

  24. [24]

    Ultra-high-definition low-light image enhancement: A benchmark and transformer-based method

    Tao Wang, Kaihao Zhang, Tianrun Shen, Wenhan Luo, Bjorn Stenger, and Tong Lu. Ultra-high-definition low-light image enhancement: A benchmark and transformer-based method. Proceedings of the AAAI Conference on Artificial Intelligence, 37(3):2654–2662, 2023.

  25. [25]

    Zero-reference low-light enhancement via physical quadruple priors

    Wenjing Wang, Huan Yang, Jianlong Fu, and Jiaying Liu. Zero-reference low-light enhancement via physical quadruple priors, 2024.

  26. [26]

    Rcdnet: Rain removal via conditional degradation modeling

    Ying Wang, Jun Li, and Kai Xu. Rcdnet: Rain removal via conditional degradation modeling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020.

  27. [27]

    Image quality assessment: from error visibility to structural similarity

    Zhou Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4):600–612, 2004.

  28. [28]

    Mamballie: Implicit retinex-aware low light enhancement with global-then-local state space

    Jiangwei Weng, Zhiqiang Yan, Ying Tai, Jianjun Qian, Jian Yang, and Jun Li. Mamballie: Implicit retinex-aware low light enhancement with global-then-local state space. In Advances in Neural Information Processing Systems, pages 27440–27462. Curran Associates, Inc., 2024.

  29. [29]

    Image deraining transformer (idt) for high-resolution rain removal

    Zhen Xiao, Tian Qiao, and Rui Zhang. Image deraining transformer (idt) for high-resolution rain removal. IEEE Transactions on Pattern Analysis and Machine Intelligence.

  30. [30]

    Snr-aware low-light image enhancement

    Ming Xu, Lei Zhang, and Wei Zhou. Snr-aware low-light image enhancement. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022.

  31. [31]

    Jorder-e: Joint rain detection and removal from a single image

    Wen Yang, Xue Tan, Xi Li, and Hao Wang. Jorder-e: Joint rain detection and removal from a single image. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019.

  32. [32]

    Structure-preserving deraining with residue channel prior guidance

    Qiaosi Yi, Juncheng Li, Qinyan Dai, Faming Fang, Guixu Zhang, and Tieyong Zeng. Structure-preserving deraining with residue channel prior guidance. In 2021 IEEE/CVF International Conference on Computer Vision (ICCV), pages 4218–4227, 2021.

  33. [33]

    Empowering resampling operation for ultra-high-definition image enhancement with model-aware guidance

    Wei Yu, Jie Huang, Bing Li, Kaiwen Zheng, Qi Zhu, Man Zhou, and Feng Zhao. Empowering resampling operation for ultra-high-definition image enhancement with model-aware guidance. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 25722–25731, 2024.

  34. [34]

    Towards efficient and scale-robust ultra-high-definition image demoiréing

    Xin Yu, Peng Dai, Wenbo Li, Lan Ma, Jiajun Shen, Jia Li, and Xiaojuan Qi. Towards efficient and scale-robust ultra-high-definition image demoiréing. In European Conference on Computer Vision, pages 646–662. Springer, 2022.

  35. [35]

    Restormer: Efficient transformer for high-resolution image restoration

    S. Zamir, A. Arora, S. Khan, M. Hayat, F. S. Khan, and M. Shah. Restormer: Efficient transformer for high-resolution image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022.

  36. [36]

    The unreasonable effectiveness of deep features as a perceptual metric

    Richard Zhang, Phillip Isola, Alexei A. Efros, Eli Shechtman, and Oliver Wang. The unreasonable effectiveness of deep features as a perceptual metric. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 586–595, 2018.

  37. [37]

    From zero to detail: Deconstructing ultra-high-definition image restoration from progressive spectral perspective

    Chen Zhao, Zhizhou Chen, Yunzhe Xu, Enxuan Gu, Jian Li, Zili Yi, Qian Wang, Jian Yang, and Ying Tai. From zero to detail: Deconstructing ultra-high-definition image restoration from progressive spectral perspective. In Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), pages 17935–17946, 2025.

  38. [38]

    Wave-mamba: Wavelet state space model for ultra-high-definition low-light image enhancement

    Wenbin Zou, Hongxia Gao, Weipeng Yang, and Tongtong Liu. Wave-mamba: Wavelet state space model for ultra-high-definition low-light image enhancement. In ACM Multimedia 2024, 2024.