PixIE: Prompted Pixel-Space Low-Light Image Enhancement

David Bull; Guoxi Huang; Nantheera Anantrasirichai; Ruirui Lin

arxiv: 2605.23531 · v1 · pith:RYI3V5QBnew · submitted 2026-05-22 · 💻 cs.CV

PixIE: Prompted Pixel-Space Low-Light Image Enhancement

Ruirui Lin , Guoxi Huang , David Bull , Nantheera Anantrasirichai This is my paper

Pith reviewed 2026-05-25 05:08 UTC · model grok-4.3

classification 💻 cs.CV

keywords low-light image enhancementpixel-space processingsemantic promptingDINO featuresimage denoisingdetail recoverycomputer vision

0 comments

The pith

PixIE enhances low-light images by injecting DINOv3 features into per-pixel modulation blocks after cross-scale denoising.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces PixIE, a feed-forward pixel-space method for low-light image enhancement that first applies cross-scale denoising to suppress noise and keep structure, then refines output with DINO-Prompted Pixel Blocks. These blocks use patch-conditioned per-pixel modulation driven by intermediate features from DINOv3, supported by Spatial-Channel Compaction for efficient attention and Multi-Receptive-Field Pixel Embedding for neighborhood context. The approach targets noise, contrast loss, and semantic ambiguity together. If the method works as described, it would produce images with higher reconstruction fidelity and perceptual quality than prior techniques on standard benchmarks.

Core claim

PixIE performs cross-scale denoising followed by refinement in DINO-Prompted Pixel Blocks that inject intermediate DINOv3 features via patch-conditioned, spatially continuous per-pixel modulation; Spatial-Channel Compaction folds features into a compact grid to bound attention cost across scales, while Multi-Receptive-Field Pixel Embedding supplies neighborhood-aware representations before prompting, yielding average PSNR gains of 1.9-15.0 percent and LPIPS reductions of 8.5-44.4 percent on LLIE benchmarks along with sharper details and stable textures.

What carries the argument

DINO-Prompted Pixel Blocks (DPPB) that inject intermediate DINOv3 features via patch-conditioned per-pixel modulation to refine details after initial denoising.

If this is right

Cross-scale denoising suppresses noise while preserving structure before semantic refinement.
Spatial-Channel Compaction enables pixel-attention computation with bounded cost across multiple scales.
Multi-Receptive-Field Pixel Embedding increases robustness to signal-dependent noise compared with point-wise embeddings.
The overall pipeline recovers sharper details and more stable textures than recent state-of-the-art methods.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same prompting structure could be tested on related restoration tasks such as dehazing or low-light video to check if foundation-model features transfer.
Efficiency from compaction might allow the framework to run on mobile hardware if the modulation blocks are further quantized.
Replacing DINOv3 with a different foundation model could reveal whether the gains depend on specific semantic properties of that model.

Load-bearing premise

The assumption that DINOv3 features can be injected via patch-conditioned per-pixel modulation to improve detail recovery without introducing semantic errors or artifacts in noisy low-light inputs.

What would settle it

A set of low-light test images where the enhanced outputs exhibit semantic artifacts, such as invented textures or misidentified object boundaries, traceable to mismatched DINOv3 feature injection.

Figures

Figures reproduced from arXiv: 2605.23531 by David Bull, Guoxi Huang, Nantheera Anantrasirichai, Ruirui Lin.

**Figure 1.** Figure 1: DINOv3 ViT-S/16 features under low-light degradation. Left: Mean tokenwise cosine similarity across the 12 transformer layers for three comparisons: Clean vs. Low-Match (noise), Low-Match vs. Low-Raw (illumination), and Clean vs. Low-Raw (noise and illumination). Similarity increases with depth, indicating progressively more stable representations; dotted vertical lines mark the layers {2, 5, 8, 11} used … view at source ↗

**Figure 2.** Figure 2: Visual comparison of RetinexFormer [2], CIDNet [49], and PixIE (Ours) on an example LoLv2-Real test image. The low-light input is histogram-stretched to match the ground truth for better noise visualization. PixIE demonstrates superior noise suppression and detail restoration. hoods. Relying solely on single-pixel statistics makes it difficult to distinguish true signals from noise [7, 8, 46], motivating … view at source ↗

**Figure 3.** Figure 3: The overall pipeline of our proposed PixIE. Given a low-light input, the crossscale denoising stream suppresses noise fine-to-coarse, followed by Multi-ReceptiveField Pixel Embedding (MRPE), and DINO-Prompted Pixel Blocks (DPPBs) inject DINO semantic guidance at each scale, and multi-scale fusion aggregates refined features to predict a residual correction Iˆ = I + R. Transformer blocks with fine-to-coa… view at source ↗

**Figure 4.** Figure 4: Patch-conditioned modulation strategy comparison: (A) Global broadcast applies one shared vector to all pixels. (B) Patch-wise constant uses one token per patch, yielding piecewise-constant modulation and grid seams at patch boundaries. (C) Perpixel MLP predicts pixel-wise parameters within each patch, but independent per-patch prediction may introduce boundary discontinuities. (D) Ours upsamples the tok… view at source ↗

**Figure 5.** Figure 5: Qualitative comparison of PixIE with recent state-of-the-art methods on LoLv1 (top row), LoLv2-Real (second row), and LSRW (bottom row) test sets, respectively. 0.902. These results demonstrate that our pixel-space enhancement effectively balances noise suppression with structural fidelity, even in datasets like LSRW and LOLv2 that are captured under real, heavy low-light noise. Furthermore, PixIE achieves… view at source ↗

**Figure 6.** Figure 6: Qualitative comparison of PixIE with recent state-of-the-art methods on five unpaired datasets to show generalization. where they are easily suppressed by aggressive denoising or global illumination correction. In the top row (LOLv1), most methods fail to recover subtle clothing textures, whereas PixIE better preserves these high-frequency patterns. PixIE restores the fine details of the ping-pong table wh… view at source ↗

**Figure 7.** Figure 7: Qualitative ablation results of PixIE on the LoLv2-Real dataset. ‘full mod’ denotes spatially continuous modulation in full resolution. We compare (i) w/o MDTA and w/o full mod, (ii) w/o MDTA and w full mod, and (iii) w MDTA and w full mod (full PixIE). As shown in [PITH_FULL_IMAGE:figures/full_fig_p014_7.png] view at source ↗

read the original abstract

Low-light images exhibit severe noise, contrast loss, and semantic ambiguity, making enhancement a joint problem of denoising and detail recovery. We propose PixIE, a feed-forward pixel-space LLIE framework semantically-prompted by a vision foundation model. PixIE first performs a cross-scale denoising to suppress noise and preserve structure, then refines details with DINO-Prompted Pixel Blocks (DPPB) that inject intermediate DINOv3 features via patch-conditioned, spatially continuous per-pixel modulation. We introduce a Spatial-Channel Compaction (SCC), which folds features into a compact spatial grid and compresses in the channel dimension, so pixel-attention is computed efficiently with bounded cost across scales. We further propose Multi-Receptive-Field Pixel Embedding (MRPE) to provide neighborhood-aware pixel representations before semantic prompting, improving robustness to signal-dependent noise beyond point-wise embeddings. Experiments on LLIE benchmarks show that PixIE improves the average PSNR by 1.9-15.0% over recent state-of-the-art methods and reduces LPIPS by 8.5-44.4%. Qualitative comparisons further demonstrate that PixIE recovers sharper details and more stable textures, resulting in improved reconstruction fidelity and perceptual quality.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

PixIE tries to inject DINOv3 features into a pixel-space low-light pipeline via patch modulation, but the abstract gives no evidence that this avoids artifacts or that the claimed PSNR/LPIPS gains are robust.

read the letter

The paper's main move is a feed-forward pixel-space network for low-light enhancement that starts with cross-scale denoising, then applies DINO-Prompted Pixel Blocks to modulate pixels using intermediate DINOv3 features through patch-conditioned per-pixel operations. It adds Spatial-Channel Compaction to keep attention cheap across scales and Multi-Receptive-Field Pixel Embedding to give each pixel some local neighborhood context before the prompting step. Those three pieces are the concrete additions over plain pixel-space baselines. The efficiency trick in SCC and the local context in MRPE look like practical engineering choices that could matter for real-time use. The quantitative claims are average PSNR lifts of 1.9-15% and LPIPS drops of 8.5-44.4% on standard LLIE benchmarks, with qualitative notes on sharper textures. That is the extent of what is shown. The soft spot is exactly the one flagged in the stress test: DINOv3 was trained on normal images, and low-light inputs carry signal-dependent noise. Nothing in the abstract isolates whether the patch modulation actually transfers useful semantics or just hallucinates detail, and there are no ablations, error bars, or failure-case examples. Without those, the metric numbers are hard to trust as more than post-hoc tuning. This is incremental work aimed at the LLIE community that already follows prompting and efficiency tricks. A reader already deep in that literature might pick up the SCC and MRPE ideas for their own pipelines, but the paper does not reorganize the area. It deserves a serious referee to check whether the full experiments close the artifact gap and whether the comparisons are fair on the same splits and training regimes. I would send it to review rather than desk reject.

Referee Report

2 major / 2 minor

Summary. The paper proposes PixIE, a feed-forward pixel-space low-light image enhancement (LLIE) framework that first applies cross-scale denoising and then refines details using DINO-Prompted Pixel Blocks (DPPB) to inject intermediate DINOv3 features via patch-conditioned per-pixel modulation. It introduces Spatial-Channel Compaction (SCC) for efficient pixel-attention and Multi-Receptive-Field Pixel Embedding (MRPE) for neighborhood-aware representations. Experiments claim average PSNR gains of 1.9-15.0% and LPIPS reductions of 8.5-44.4% over recent SOTA methods on LLIE benchmarks, with qualitative improvements in detail and texture stability.

Significance. If the reported gains hold under rigorous validation and the DPPB modulation proves robust to signal-dependent noise without semantic artifacts, the work could meaningfully advance LLIE by showing how vision foundation model features can be efficiently integrated into pixel-space processing. The SCC mechanism for bounded-cost attention across scales is a potentially useful efficiency contribution if its implementation details are fully specified.

major comments (2)

[DPPB and experiments sections] The central empirical claim of consistent PSNR/LPIPS gains rests on the assumption that DINOv3 features (trained on standard lighting) can be injected via DPPB without introducing semantic mismatches or artifacts in noisy low-light inputs; the manuscript provides no quantitative ablation isolating DPPB's contribution or failure-case analysis on signal-dependent noise, which is load-bearing for attributing improvements to the prompting mechanism rather than the denoising or embedding stages.
[Experiments and results] The abstract and method description report average metric improvements but the manuscript lacks per-dataset tables with standard deviations, dataset splits, or statistical tests; without these, it is impossible to determine whether the 1.9-15.0% PSNR range reflects robust gains or is driven by particular benchmarks or post-hoc tuning.

minor comments (2)

[Method] Notation for patch-conditioned modulation and the exact form of per-pixel modulation in DPPB should be formalized with equations to allow reproducibility.
[Method] The paper should clarify the cross-scale denoising architecture (e.g., number of scales, loss terms) to distinguish its contribution from the novel DPPB component.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and will incorporate revisions to strengthen the empirical support for PixIE.

read point-by-point responses

Referee: [DPPB and experiments sections] The central empirical claim of consistent PSNR/LPIPS gains rests on the assumption that DINOv3 features (trained on standard lighting) can be injected via DPPB without introducing semantic mismatches or artifacts in noisy low-light inputs; the manuscript provides no quantitative ablation isolating DPPB's contribution or failure-case analysis on signal-dependent noise, which is load-bearing for attributing improvements to the prompting mechanism rather than the denoising or embedding stages.

Authors: We agree that a dedicated quantitative ablation isolating DPPB is necessary to attribute gains specifically to the prompting mechanism. The manuscript presents the full framework results but does not isolate DPPB from the cross-scale denoising and MRPE stages. In the revised version we will add an ablation study that removes or substitutes the DPPB module and reports the resulting metric changes. We will also include a failure-case analysis on low-light images exhibiting strong signal-dependent noise to examine potential semantic mismatches or artifacts introduced by DINOv3 features. revision: yes
Referee: [Experiments and results] The abstract and method description report average metric improvements but the manuscript lacks per-dataset tables with standard deviations, dataset splits, or statistical tests; without these, it is impossible to determine whether the 1.9-15.0% PSNR range reflects robust gains or is driven by particular benchmarks or post-hoc tuning.

Authors: We concur that per-dataset breakdowns with variability measures would improve transparency. The reported ranges are averages across the evaluated LLIE benchmarks, but the manuscript does not tabulate individual dataset results or standard deviations. In revision we will add per-dataset tables that include PSNR and LPIPS for each benchmark, along with standard deviations computed over multiple runs where available, and explicit details on the train/test splits employed. We will also explore the inclusion of statistical significance tests to support the observed improvements. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical method with independent experimental validation

full rationale

The paper describes a proposed architecture (cross-scale denoising + DPPB with DINOv3 injection via SCC and MRPE) and validates it via standard benchmark metrics (PSNR, LPIPS) on LLIE datasets. No equations or claims reduce by construction to fitted parameters or self-referential definitions; performance numbers are external measurements, not tautological. No load-bearing self-citations or uniqueness theorems are invoked in the provided text. This is a standard empirical ML paper whose central claims rest on reproducible experiments rather than internal redefinitions.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no equations, free parameters, axioms, or invented scientific entities are described.

pith-pipeline@v0.9.0 · 5754 in / 1155 out tokens · 26789 ms · 2026-05-25T05:08:56.187105+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

DINO-Prompted Pixel Block (DPPB) ... patch-conditioned, spatially continuous per-pixel modulation ... Spatial-Channel Compaction (SCC) ... Multi-Receptive-Field Pixel Embedding (MRPE)
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

PixIE ... feed-forward pixel-space LLIE framework semantically-prompted by a vision foundation model

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

58 extracted references · 58 canonical work pages · 8 internal anchors

[1]

In: International Conference on Neural Informa- tion Processing (2024),https://api.semanticscholar.org/CorpusID:269604860

Bai, J., Yin, Y., He, Q., Li, Y., Zhang, X.: Retinexmamba: Retinex-based mamba for low-light image enhancement. In: International Conference on Neural Informa- tion Processing (2024),https://api.semanticscholar.org/CorpusID:269604860

work page 2024
[2]

In: IEEE/CVF ICCV

Cai, Y., Bian, H., Lin, J., Wang, H., Timofte, R., Zhang, Y.: Retinexformer: One- stage retinex-based transformer for low-light image enhancement. In: IEEE/CVF ICCV. pp. 12504–12513 (October 2023)

work page 2023
[3]

In: ECCV

Chang, M., Li, Q., Feng, H., Xu, Z.: Spatial-adaptive network for single image de- noising. In: ECCV. p. 171–187. Springer-Verlag, Berlin, Heidelberg (2020).https: //doi.org/10.1007/978-3-030-58577-8_11,https://doi.org/10.1007/978-3- 030-58577-8_11

work page doi:10.1007/978-3-030-58577-8_11 2020
[4]

arXiv preprint arXiv:2504.07963 (2025)

Chen, S., Ge, C., Zhang, S., Sun, P., Luo, P.: Pixelflow: Pixel-space generative models with flow. arXiv preprint arXiv:2504.07963 (2025)

work page arXiv 2025
[5]

arXiv preprint arXiv:2511.18822 (2025)

Chen, Z., Zhu, J., Chen, X., Zhang, J., Hu, X., Zhao, H., Wang, C., Yang, J., Tai, Y.: Dip: Taming diffusion models in pixel space. arXiv preprint arXiv:2511.18822 (2025)

work page arXiv 2025
[6]

In: ICML

Crowson, K., Baumann, S.A., Birch, A., Abraham, T.M., Kaplan, D.Z., Shippole, E.: Scalable high-resolution pixel-space image synthesis with hourglass diffusion transformers. In: ICML. JMLR.org (2024)

work page 2024
[7]

IEEE TIP17(10), 1737–1754 (2008).https://doi.org/10.1109/TIP.2008.2001399

Foi, A., Trimeche, M., Katkovnik, V., Egiazarian, K.: Practical poissonian-gaussian noise modeling and fitting for single-image raw-data. IEEE TIP17(10), 1737–1754 (2008).https://doi.org/10.1109/TIP.2008.2001399

work page doi:10.1109/tip.2008.2001399 2008
[8]

In: NeurIPS

Gou, Y., Hu, P., Lv, J., Zhou, J.T., Peng, X.: Multi-scale adaptive network for single image denoising. In: NeurIPS. Curran Associates Inc. (2022)

work page 2022
[9]

In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR)

Guo, C.G., Li, C., Guo, J., Loy, C.C., Hou, J., Kwong, S., Cong, R.: Zero-reference deep curve estimation for low-light image enhancement. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). pp. 1780– 1789 (June 2020)

work page 2020
[10]

IEEE Transactions on image processing26(2), 982–993 (2016)

Guo, X., Li, Y., Ling, H.: Lime: Low-light image enhancement via illumination map estimation. IEEE Transactions on image processing26(2), 982–993 (2016)

work page 2016
[11]

Journal of Visual Communication and Image Representation90, 103712 (2023)

Hai, J., Xuan, Z., Yang, R., Hao, Y., Zou, F., Lin, F., Han, S.: R2rnet: Low- light image enhancement via real-low to real-normal network. Journal of Visual Communication and Image Representation90, 103712 (2023)

work page 2023
[12]

In: ICML

Hoogeboom, E., Heek, J., Salimans, T.: Simple diffusion: end-to-end diffusion for high resolution images. In: ICML. ICML’23, JMLR.org (2023)

work page 2023
[13]

Advances in Neural Information Processing Systems36, 79734–79747 (2023)

Hou, J., Zhu, Z., Hou, J., Liu, H., Zeng, H., Yuan, H.: Global structure-aware diffusion process for low-light image enhancement. Advances in Neural Information Processing Systems36, 79734–79747 (2023)

work page 2023
[14]

In: Machine Learning from Challenging Data 2025

Huang, G., Lin, R., Li, Y., Bull, D., Anantrasirichai, N.: Bvi-mamba: video en- hancement using a visual state-space model for low-light and underwater environ- ments. In: Machine Learning from Challenging Data 2025. vol. 13460, pp. 74–81. SPIE (2025)

work page 2025
[15]

Proceedings of the AAAI Conference on Artificial Intelligence (2026)

Huang, G., Yang, Q., Qi, Z., Lin, R., Bull, D., Anantrasirichai, N.: Bayesian neu- ral networks for one-to-many mapping in image enhancement. Proceedings of the AAAI Conference on Artificial Intelligence (2026)

work page 2026
[16]

IEEE/CVF TCE53(4), 1752–1758 (2007)

Ibrahim, H., Pik Kong, N.S.: Brightness Preserving Dynamic Histogram Equaliza- tion for Image Contrast Enhancement. IEEE/CVF TCE53(4), 1752–1758 (2007)

work page 2007
[17]

ACM TOG42(6), 1–14 (2023) 16 Ruirui Lin, Guoxi Huang, David Bull, Nantheera Anantrasirichai

Jiang, H., Luo, A., Fan, H., Han, S., Liu, S.: Low-light image enhancement with wavelet-based diffusion models. ACM TOG42(6), 1–14 (2023) 16 Ruirui Lin, Guoxi Huang, David Bull, Nantheera Anantrasirichai

work page 2023
[18]

In: ECCV (2024)

Jiang, H., Luo, A., Liu, X., Han, S., Liu, S.: Lightendiffusion: Unsupervised low- light image enhancement with latent-retinex diffusion models. In: ECCV (2024)

work page 2024
[19]

IEEE TIP30, 2340–2349 (2021)

Jiang, Y., Gong, X., Liu, D., Cheng, Y., Fang, C., Shen, X., Yang, J., Zhou, P., Wang, Z.: Enlightengan: Deep light enhancement without paired supervision. IEEE TIP30, 2340–2349 (2021)

work page 2021
[20]

arXiv preprint arXiv:2306.15870 (2023)

Jin, Z., Chen, S., Chen, Y., Xu, Z., Feng, H.: Let segment anything help image dehaze. arXiv preprint arXiv:2306.15870 (2023)

work page arXiv 2023
[21]

Adam: A Method for Stochastic Optimization

Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. CoRR abs/1412.6980(2014),https://api.semanticscholar.org/CorpusID:6628106

work page internal anchor Pith review Pith/arXiv arXiv 2014
[22]

Auto-Encoding Variational Bayes

Kingma, D.P., Welling, M.: Auto-encoding variational bayes. In: ICLR (2014), http://arxiv.org/abs/1312.6114

work page internal anchor Pith review Pith/arXiv arXiv 2014
[23]

Segment Anything

Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C., Gustafson, L., Xiao, T., Whitehead, S., Berg, A.C., Lo, W.Y., Dollár, P., Girshick, R.: Segment anything. arXiv:2304.02643 (2023)

work page internal anchor Pith review Pith/arXiv arXiv 2023
[24]

Scientific American237 6, 108–28 (1977)

Land, E.H.: The retinex theory of color vision. Scientific American237 6, 108–28 (1977)

work page 1977
[25]

IEEE transactions on image processing22(12), 5372–5384 (2013)

Lee, C., Lee, C., Kim, C.S.: Contrast enhancement based on layered difference representation of 2d histograms. IEEE transactions on image processing22(12), 5372–5384 (2013)

work page 2013
[26]

Li, S., Liu, M., Zhang, Y., Chen, S., Li, H., Dou, Z., Chen, H.: Sam-deblur: Let segmentanythingboostimagedeblurring.In:ICASSP.pp.2445–2449.IEEE(2024)

work page 2024
[27]

In: IEEE ICASSP

Lin, J., Anantrasirichai, N., Bull, D.: Multi-scale denoising in the feature space for low-light instance segmentation. In: IEEE ICASSP. pp. 1–5 (2025)

work page 2025
[28]

arXiv preprint arXiv:2312.01677 (2023)

Lin, X., Yue, J., Chan, K.C., Qi, L., Ren, C., Pan, J., Yang, M.H.: Multi-task image restoration guided by robust dino features. arXiv preprint arXiv:2312.01677 (2023)

work page arXiv 2023
[29]

Pattern Recognition61, 650–662 (2017)

Lore,K.G.,Akintayo,A.,Sarkar,S.:Llnet:Adeepautoencoderapproachtonatural low-light image enhancement. Pattern Recognition61, 650–662 (2017)

work page 2017
[30]

In: ICLR (2023),https://api

Luo, Z., Gustafsson, F.K., Zhao, Z., Sjolund, J., Schon, T.B.: Controlling vision- language models for multi-task image restoration. In: ICLR (2023),https://api. semanticscholar.org/CorpusID:263605463

work page 2023
[31]

In: BMVC (2018)

Lv, F., Lu, F., Wu, J., Lim, C.: Mbllen: Low-light image/video enhancement using cnns. In: BMVC (2018)

work page 2018
[32]

IEEE Transactions on Image Processing24(11), 3345–3356 (2015)

Ma, K., Zeng, K., Wang, Z.: Perceptual quality assessment for multi-exposure image fusion. IEEE Transactions on Image Processing24(11), 3345–3356 (2015)

work page 2015
[33]

DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation

Ma, Z., Wei, L., Wang, S., Zhang, S., Tian, Q.: Deco: Frequency-decoupled pixel diffusion for end-to-end image generation. arXiv preprint arXiv:2511.19365 (2025)

work page internal anchor Pith review Pith/arXiv arXiv 2025
[34]

completely blind

Mittal, A., Soundararajan, R., Bovik, A.C.: Making a “completely blind” image quality analyzer. IEEE Signal Processing Letters20(3), 209–212 (2013).https: //doi.org/10.1109/LSP.2012.2227726

work page doi:10.1109/lsp.2012.2227726 2013
[35]

Oquab, M., Darcet, T., Moutakanni, T., Vo, H.V., Szafraniec, M., Khalidov, V., Fernandez,P.,Haziza,D.,Massa,F.,El-Nouby,A.,Howes,R.,Huang,P.Y.,Xu,H., Sharma, V., Li, S.W., Galuba, W., Rabbat, M., Assran, M., Ballas, N., Synnaeve, G., Misra, I., Jegou, H., Mairal, J., Labatut, P., Joulin, A., Bojanowski, P.: Dinov2: Learning robust visual features without ...

work page 2023
[36]

In: 2023 IEEE/CVF International Conference on Computer Vision (ICCV)

Peebles, W., Xie, S.: Scalable diffusion models with transformers. In: IEEE/CVF ICCV. pp. 4172–4182 (2023).https://doi.org/10.1109/ICCV51070.2023.00387

work page doi:10.1109/iccv51070.2023.00387 2023
[37]

In: ECCV

Qi, C., Tu, Z., Ye, K., Delbracio, M., Milanfar, P., Chen, Q., Talebi, H.: Spire: Se- mantic prompt-driven image restoration. In: ECCV. pp. 446–464. Springer (2024) Abbreviated paper title 17

work page 2024
[38]

In: ICML (2021),https://api

Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., Sutskever, I.: Learning transferable visual models from natural language supervision. In: ICML (2021),https://api. semanticscholar.org/CorpusID:231591445

work page 2021
[39]

SAM 2: Segment Anything in Images and Videos

Ravi, N., Gabeur, V., Hu, Y.T., Hu, R., Ryali, C., Ma, T., Khedr, H., Rädle, R., Rolland, C., Gustafson, L., Mintun, E., Pan, J., Alwala, K.V., Carion, N., Wu, C.Y., Girshick, R., Dollár, P., Feichtenhofer, C.: Sam 2: Segment anything in images and videos. arXiv preprint arXiv:2408.00714 (2024),https://arxiv.org/ abs/2408.00714

work page internal anchor Pith review Pith/arXiv arXiv 2024
[40]

Rombach, R., Blattmann, A., Lorenz, D., Esser, P., Ommer, B.: High-resolution image synthesis with latent diffusion models. CVPR pp. 10674–10685 (2021), https://api.semanticscholar.org/CorpusID:245335280

work page 2021
[41]

Siméoni, O., Vo, H.V., Seitzer, M., Baldassarre, F., Oquab, M., Jose, C., Khalidov, V., Szafraniec, M., Yi, S., Ramamonjisoa, M., Massa, F., Haziza, D., Wehrstedt, L., Wang, J., Darcet, T., Moutakanni, T., Sentana, L., Roberts, C., Vedaldi, A., Tolan, J., Brandt, J., Couprie, C., Mairal, J., Jégou, H., Labatut, P., Bojanowski, P.: DINOv3 (2025),https://ar...

work page internal anchor Pith review Pith/arXiv arXiv 2025
[42]

Multimedia Tools and Applications77, 9211–9231 (2018)

Vonikakis, V., Kouskouridas, R., Gasteratos, A.: On the evaluation of illumina- tion compensation algorithms. Multimedia Tools and Applications77, 9211–9231 (2018)

work page 2018
[43]

In: AAAI (2023)

Wang, J., Chan, K.C., Loy, C.C.: Exploring clip for assessing the look and feel of images. In: AAAI (2023)

work page 2023
[44]

IEEE transactions on image processing 22(9), 3538–3548 (2013)

Wang, S., Zheng, J., Hu, H.M., Li, B.: Naturalness preserved enhancement algo- rithm for non-uniform illumination images. IEEE transactions on image processing 22(9), 3538–3548 (2013)

work page 2013
[45]

In: BMVC (2018)

Wei, C., Wang, W., Yang, W., Liu, J.: Deep retinex decomposition for low-light enhancement. In: BMVC (2018)

work page 2018
[46]

IEEE TPAMI44(11), 8520–8537 (2021)

Wei, K., Fu, Y., Zheng, Y., Yang, J.: Physics-based noise modeling for extreme low-light photography. IEEE TPAMI44(11), 8520–8537 (2021)

work page 2021
[47]

In: CVPR (2022)

Xu, X., Wang, R., Fu, C.W., Jia, J.: Snr-aware low-light image enhancement. In: CVPR (2022)

work page 2022
[48]

ACM TOMM21(11) (Nov 2025).https://doi.org/10

Xue, M., He, J., Wang, W., Zhou, M.: Low-light image enhancement via clip-fourier guided wavelet diffusion. ACM TOMM21(11) (Nov 2025).https://doi.org/10. 1145/3764933,https://doi.org/10.1145/3764933

work page doi:10.1145/3764933 2025
[49]

Yan, Q., Feng, Y., Zhang, C., Pang, G., Shi, K., Wu, P., Dong, W., Sun, J., Zhang, Y.:Hvi:Anewcolorspaceforlow-lightimageenhancement.In:IEEE/CVFCVPR. pp. 5678–5687 (2025).https://doi.org/10.1109/CVPR52734.2025.00533

work page doi:10.1109/cvpr52734.2025.00533 2025
[50]

IEEE Transactions on Image Processing30, 2072–2086 (2021)

Yang, W., Wang, W., Huang, H., Wang, S., Liu, J.: Sparse gradient regularized deep retinex network for robust low-light image enhancement. IEEE Transactions on Image Processing30, 2072–2086 (2021)

work page 2072
[51]

Yu, Y., Xiong, W., Nie, W., Sheng, Y., Liu, S., Luo, J.: Pixeldit: Pixel diffusion transformers for image generation (2025),https://arxiv.org/abs/2511.20645

work page internal anchor Pith review Pith/arXiv arXiv 2025
[52]

In: IEEE/CVF CVPR (2023)

Yuhui, W., Chen, P., Guoqing, W., Yang, Y., Jiwei, W., Chongyi, L., Shen, H.T.: Learning semantic-aware knowledge guidance for low-light image enhancement. In: IEEE/CVF CVPR (2023)

work page 2023
[53]

In: CVPR (2022)

Zamir, S.W., Arora, A., Khan, S., Hayat, M., Khan, F.S., Yang, M.H.: Restormer: Efficient transformer for high-resolution image restoration. In: CVPR (2022)

work page 2022
[54]

IEEE/CVF CVPR pp

Zhang, Q., Liu, X., Li, W., Chen, H., Liu, J., Hu, J., Xiong, Z., Yuan, C., Wang, Y.: Distilling semantic priors from sam to efficient image restoration models. IEEE/CVF CVPR pp. 25409–25419 (2024),https://api.semanticscholar.org/ CorpusID:268681086 18 Ruirui Lin, Guoxi Huang, David Bull, Nantheera Anantrasirichai

work page 2024
[55]

In: IEEE/CVF CVPR (2018)

Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: IEEE/CVF CVPR (2018)

work page 2018
[56]

Diffusion Transformers with Representation Autoencoders

Zheng, B., Ma, N., Tong, S., Xie, S.: Diffusion transformers with representation autoencoders. arXiv preprint arXiv:2510.11690 (2025)

work page internal anchor Pith review Pith/arXiv arXiv 2025
[57]

In: Proceedings of the IEEE/CVF Winter conference on applications of computer vision

Zheng, S., Gupta, G.: Semantic-guided zero-shot learning for low-light image/video enhancement. In: Proceedings of the IEEE/CVF Winter conference on applications of computer vision. pp. 581–590 (2022)

work page 2022
[58]

In: ACM MM (2024),https: //openreview.net/forum?id=oQahsz6vWe

Zou, W., Gao, H., Yang, W., Liu, T.: Wave-mamba: Wavelet state space model for ultra-high-definition low-light image enhancement. In: ACM MM (2024),https: //openreview.net/forum?id=oQahsz6vWe

work page 2024

[1] [1]

In: International Conference on Neural Informa- tion Processing (2024),https://api.semanticscholar.org/CorpusID:269604860

Bai, J., Yin, Y., He, Q., Li, Y., Zhang, X.: Retinexmamba: Retinex-based mamba for low-light image enhancement. In: International Conference on Neural Informa- tion Processing (2024),https://api.semanticscholar.org/CorpusID:269604860

work page 2024

[2] [2]

In: IEEE/CVF ICCV

Cai, Y., Bian, H., Lin, J., Wang, H., Timofte, R., Zhang, Y.: Retinexformer: One- stage retinex-based transformer for low-light image enhancement. In: IEEE/CVF ICCV. pp. 12504–12513 (October 2023)

work page 2023

[3] [3]

In: ECCV

Chang, M., Li, Q., Feng, H., Xu, Z.: Spatial-adaptive network for single image de- noising. In: ECCV. p. 171–187. Springer-Verlag, Berlin, Heidelberg (2020).https: //doi.org/10.1007/978-3-030-58577-8_11,https://doi.org/10.1007/978-3- 030-58577-8_11

work page doi:10.1007/978-3-030-58577-8_11 2020

[4] [4]

arXiv preprint arXiv:2504.07963 (2025)

Chen, S., Ge, C., Zhang, S., Sun, P., Luo, P.: Pixelflow: Pixel-space generative models with flow. arXiv preprint arXiv:2504.07963 (2025)

work page arXiv 2025

[5] [5]

arXiv preprint arXiv:2511.18822 (2025)

Chen, Z., Zhu, J., Chen, X., Zhang, J., Hu, X., Zhao, H., Wang, C., Yang, J., Tai, Y.: Dip: Taming diffusion models in pixel space. arXiv preprint arXiv:2511.18822 (2025)

work page arXiv 2025

[6] [6]

In: ICML

Crowson, K., Baumann, S.A., Birch, A., Abraham, T.M., Kaplan, D.Z., Shippole, E.: Scalable high-resolution pixel-space image synthesis with hourglass diffusion transformers. In: ICML. JMLR.org (2024)

work page 2024

[7] [7]

IEEE TIP17(10), 1737–1754 (2008).https://doi.org/10.1109/TIP.2008.2001399

Foi, A., Trimeche, M., Katkovnik, V., Egiazarian, K.: Practical poissonian-gaussian noise modeling and fitting for single-image raw-data. IEEE TIP17(10), 1737–1754 (2008).https://doi.org/10.1109/TIP.2008.2001399

work page doi:10.1109/tip.2008.2001399 2008

[8] [8]

In: NeurIPS

Gou, Y., Hu, P., Lv, J., Zhou, J.T., Peng, X.: Multi-scale adaptive network for single image denoising. In: NeurIPS. Curran Associates Inc. (2022)

work page 2022

[9] [9]

In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR)

Guo, C.G., Li, C., Guo, J., Loy, C.C., Hou, J., Kwong, S., Cong, R.: Zero-reference deep curve estimation for low-light image enhancement. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). pp. 1780– 1789 (June 2020)

work page 2020

[10] [10]

IEEE Transactions on image processing26(2), 982–993 (2016)

Guo, X., Li, Y., Ling, H.: Lime: Low-light image enhancement via illumination map estimation. IEEE Transactions on image processing26(2), 982–993 (2016)

work page 2016

[11] [11]

Journal of Visual Communication and Image Representation90, 103712 (2023)

Hai, J., Xuan, Z., Yang, R., Hao, Y., Zou, F., Lin, F., Han, S.: R2rnet: Low- light image enhancement via real-low to real-normal network. Journal of Visual Communication and Image Representation90, 103712 (2023)

work page 2023

[12] [12]

In: ICML

Hoogeboom, E., Heek, J., Salimans, T.: Simple diffusion: end-to-end diffusion for high resolution images. In: ICML. ICML’23, JMLR.org (2023)

work page 2023

[13] [13]

Advances in Neural Information Processing Systems36, 79734–79747 (2023)

Hou, J., Zhu, Z., Hou, J., Liu, H., Zeng, H., Yuan, H.: Global structure-aware diffusion process for low-light image enhancement. Advances in Neural Information Processing Systems36, 79734–79747 (2023)

work page 2023

[14] [14]

In: Machine Learning from Challenging Data 2025

Huang, G., Lin, R., Li, Y., Bull, D., Anantrasirichai, N.: Bvi-mamba: video en- hancement using a visual state-space model for low-light and underwater environ- ments. In: Machine Learning from Challenging Data 2025. vol. 13460, pp. 74–81. SPIE (2025)

work page 2025

[15] [15]

Proceedings of the AAAI Conference on Artificial Intelligence (2026)

Huang, G., Yang, Q., Qi, Z., Lin, R., Bull, D., Anantrasirichai, N.: Bayesian neu- ral networks for one-to-many mapping in image enhancement. Proceedings of the AAAI Conference on Artificial Intelligence (2026)

work page 2026

[16] [16]

IEEE/CVF TCE53(4), 1752–1758 (2007)

Ibrahim, H., Pik Kong, N.S.: Brightness Preserving Dynamic Histogram Equaliza- tion for Image Contrast Enhancement. IEEE/CVF TCE53(4), 1752–1758 (2007)

work page 2007

[17] [17]

ACM TOG42(6), 1–14 (2023) 16 Ruirui Lin, Guoxi Huang, David Bull, Nantheera Anantrasirichai

Jiang, H., Luo, A., Fan, H., Han, S., Liu, S.: Low-light image enhancement with wavelet-based diffusion models. ACM TOG42(6), 1–14 (2023) 16 Ruirui Lin, Guoxi Huang, David Bull, Nantheera Anantrasirichai

work page 2023

[18] [18]

In: ECCV (2024)

Jiang, H., Luo, A., Liu, X., Han, S., Liu, S.: Lightendiffusion: Unsupervised low- light image enhancement with latent-retinex diffusion models. In: ECCV (2024)

work page 2024

[19] [19]

IEEE TIP30, 2340–2349 (2021)

Jiang, Y., Gong, X., Liu, D., Cheng, Y., Fang, C., Shen, X., Yang, J., Zhou, P., Wang, Z.: Enlightengan: Deep light enhancement without paired supervision. IEEE TIP30, 2340–2349 (2021)

work page 2021

[20] [20]

arXiv preprint arXiv:2306.15870 (2023)

Jin, Z., Chen, S., Chen, Y., Xu, Z., Feng, H.: Let segment anything help image dehaze. arXiv preprint arXiv:2306.15870 (2023)

work page arXiv 2023

[21] [21]

Adam: A Method for Stochastic Optimization

Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. CoRR abs/1412.6980(2014),https://api.semanticscholar.org/CorpusID:6628106

work page internal anchor Pith review Pith/arXiv arXiv 2014

[22] [22]

Auto-Encoding Variational Bayes

Kingma, D.P., Welling, M.: Auto-encoding variational bayes. In: ICLR (2014), http://arxiv.org/abs/1312.6114

work page internal anchor Pith review Pith/arXiv arXiv 2014

[23] [23]

Segment Anything

Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C., Gustafson, L., Xiao, T., Whitehead, S., Berg, A.C., Lo, W.Y., Dollár, P., Girshick, R.: Segment anything. arXiv:2304.02643 (2023)

work page internal anchor Pith review Pith/arXiv arXiv 2023

[24] [24]

Scientific American237 6, 108–28 (1977)

Land, E.H.: The retinex theory of color vision. Scientific American237 6, 108–28 (1977)

work page 1977

[25] [25]

IEEE transactions on image processing22(12), 5372–5384 (2013)

Lee, C., Lee, C., Kim, C.S.: Contrast enhancement based on layered difference representation of 2d histograms. IEEE transactions on image processing22(12), 5372–5384 (2013)

work page 2013

[26] [26]

Li, S., Liu, M., Zhang, Y., Chen, S., Li, H., Dou, Z., Chen, H.: Sam-deblur: Let segmentanythingboostimagedeblurring.In:ICASSP.pp.2445–2449.IEEE(2024)

work page 2024

[27] [27]

In: IEEE ICASSP

Lin, J., Anantrasirichai, N., Bull, D.: Multi-scale denoising in the feature space for low-light instance segmentation. In: IEEE ICASSP. pp. 1–5 (2025)

work page 2025

[28] [28]

arXiv preprint arXiv:2312.01677 (2023)

Lin, X., Yue, J., Chan, K.C., Qi, L., Ren, C., Pan, J., Yang, M.H.: Multi-task image restoration guided by robust dino features. arXiv preprint arXiv:2312.01677 (2023)

work page arXiv 2023

[29] [29]

Pattern Recognition61, 650–662 (2017)

Lore,K.G.,Akintayo,A.,Sarkar,S.:Llnet:Adeepautoencoderapproachtonatural low-light image enhancement. Pattern Recognition61, 650–662 (2017)

work page 2017

[30] [30]

In: ICLR (2023),https://api

Luo, Z., Gustafsson, F.K., Zhao, Z., Sjolund, J., Schon, T.B.: Controlling vision- language models for multi-task image restoration. In: ICLR (2023),https://api. semanticscholar.org/CorpusID:263605463

work page 2023

[31] [31]

In: BMVC (2018)

Lv, F., Lu, F., Wu, J., Lim, C.: Mbllen: Low-light image/video enhancement using cnns. In: BMVC (2018)

work page 2018

[32] [32]

IEEE Transactions on Image Processing24(11), 3345–3356 (2015)

Ma, K., Zeng, K., Wang, Z.: Perceptual quality assessment for multi-exposure image fusion. IEEE Transactions on Image Processing24(11), 3345–3356 (2015)

work page 2015

[33] [33]

DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation

Ma, Z., Wei, L., Wang, S., Zhang, S., Tian, Q.: Deco: Frequency-decoupled pixel diffusion for end-to-end image generation. arXiv preprint arXiv:2511.19365 (2025)

work page internal anchor Pith review Pith/arXiv arXiv 2025

[34] [34]

completely blind

Mittal, A., Soundararajan, R., Bovik, A.C.: Making a “completely blind” image quality analyzer. IEEE Signal Processing Letters20(3), 209–212 (2013).https: //doi.org/10.1109/LSP.2012.2227726

work page doi:10.1109/lsp.2012.2227726 2013

[35] [35]

Oquab, M., Darcet, T., Moutakanni, T., Vo, H.V., Szafraniec, M., Khalidov, V., Fernandez,P.,Haziza,D.,Massa,F.,El-Nouby,A.,Howes,R.,Huang,P.Y.,Xu,H., Sharma, V., Li, S.W., Galuba, W., Rabbat, M., Assran, M., Ballas, N., Synnaeve, G., Misra, I., Jegou, H., Mairal, J., Labatut, P., Joulin, A., Bojanowski, P.: Dinov2: Learning robust visual features without ...

work page 2023

[36] [36]

In: 2023 IEEE/CVF International Conference on Computer Vision (ICCV)

Peebles, W., Xie, S.: Scalable diffusion models with transformers. In: IEEE/CVF ICCV. pp. 4172–4182 (2023).https://doi.org/10.1109/ICCV51070.2023.00387

work page doi:10.1109/iccv51070.2023.00387 2023

[37] [37]

In: ECCV

Qi, C., Tu, Z., Ye, K., Delbracio, M., Milanfar, P., Chen, Q., Talebi, H.: Spire: Se- mantic prompt-driven image restoration. In: ECCV. pp. 446–464. Springer (2024) Abbreviated paper title 17

work page 2024

[38] [38]

In: ICML (2021),https://api

Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., Sutskever, I.: Learning transferable visual models from natural language supervision. In: ICML (2021),https://api. semanticscholar.org/CorpusID:231591445

work page 2021

[39] [39]

SAM 2: Segment Anything in Images and Videos

Ravi, N., Gabeur, V., Hu, Y.T., Hu, R., Ryali, C., Ma, T., Khedr, H., Rädle, R., Rolland, C., Gustafson, L., Mintun, E., Pan, J., Alwala, K.V., Carion, N., Wu, C.Y., Girshick, R., Dollár, P., Feichtenhofer, C.: Sam 2: Segment anything in images and videos. arXiv preprint arXiv:2408.00714 (2024),https://arxiv.org/ abs/2408.00714

work page internal anchor Pith review Pith/arXiv arXiv 2024

[40] [40]

Rombach, R., Blattmann, A., Lorenz, D., Esser, P., Ommer, B.: High-resolution image synthesis with latent diffusion models. CVPR pp. 10674–10685 (2021), https://api.semanticscholar.org/CorpusID:245335280

work page 2021

[41] [41]

Siméoni, O., Vo, H.V., Seitzer, M., Baldassarre, F., Oquab, M., Jose, C., Khalidov, V., Szafraniec, M., Yi, S., Ramamonjisoa, M., Massa, F., Haziza, D., Wehrstedt, L., Wang, J., Darcet, T., Moutakanni, T., Sentana, L., Roberts, C., Vedaldi, A., Tolan, J., Brandt, J., Couprie, C., Mairal, J., Jégou, H., Labatut, P., Bojanowski, P.: DINOv3 (2025),https://ar...

work page internal anchor Pith review Pith/arXiv arXiv 2025

[42] [42]

Multimedia Tools and Applications77, 9211–9231 (2018)

Vonikakis, V., Kouskouridas, R., Gasteratos, A.: On the evaluation of illumina- tion compensation algorithms. Multimedia Tools and Applications77, 9211–9231 (2018)

work page 2018

[43] [43]

In: AAAI (2023)

Wang, J., Chan, K.C., Loy, C.C.: Exploring clip for assessing the look and feel of images. In: AAAI (2023)

work page 2023

[44] [44]

IEEE transactions on image processing 22(9), 3538–3548 (2013)

Wang, S., Zheng, J., Hu, H.M., Li, B.: Naturalness preserved enhancement algo- rithm for non-uniform illumination images. IEEE transactions on image processing 22(9), 3538–3548 (2013)

work page 2013

[45] [45]

In: BMVC (2018)

Wei, C., Wang, W., Yang, W., Liu, J.: Deep retinex decomposition for low-light enhancement. In: BMVC (2018)

work page 2018

[46] [46]

IEEE TPAMI44(11), 8520–8537 (2021)

Wei, K., Fu, Y., Zheng, Y., Yang, J.: Physics-based noise modeling for extreme low-light photography. IEEE TPAMI44(11), 8520–8537 (2021)

work page 2021

[47] [47]

In: CVPR (2022)

Xu, X., Wang, R., Fu, C.W., Jia, J.: Snr-aware low-light image enhancement. In: CVPR (2022)

work page 2022

[48] [48]

ACM TOMM21(11) (Nov 2025).https://doi.org/10

Xue, M., He, J., Wang, W., Zhou, M.: Low-light image enhancement via clip-fourier guided wavelet diffusion. ACM TOMM21(11) (Nov 2025).https://doi.org/10. 1145/3764933,https://doi.org/10.1145/3764933

work page doi:10.1145/3764933 2025

[49] [49]

Yan, Q., Feng, Y., Zhang, C., Pang, G., Shi, K., Wu, P., Dong, W., Sun, J., Zhang, Y.:Hvi:Anewcolorspaceforlow-lightimageenhancement.In:IEEE/CVFCVPR. pp. 5678–5687 (2025).https://doi.org/10.1109/CVPR52734.2025.00533

work page doi:10.1109/cvpr52734.2025.00533 2025

[50] [50]

IEEE Transactions on Image Processing30, 2072–2086 (2021)

Yang, W., Wang, W., Huang, H., Wang, S., Liu, J.: Sparse gradient regularized deep retinex network for robust low-light image enhancement. IEEE Transactions on Image Processing30, 2072–2086 (2021)

work page 2072

[51] [51]

Yu, Y., Xiong, W., Nie, W., Sheng, Y., Liu, S., Luo, J.: Pixeldit: Pixel diffusion transformers for image generation (2025),https://arxiv.org/abs/2511.20645

work page internal anchor Pith review Pith/arXiv arXiv 2025

[52] [52]

In: IEEE/CVF CVPR (2023)

Yuhui, W., Chen, P., Guoqing, W., Yang, Y., Jiwei, W., Chongyi, L., Shen, H.T.: Learning semantic-aware knowledge guidance for low-light image enhancement. In: IEEE/CVF CVPR (2023)

work page 2023

[53] [53]

In: CVPR (2022)

Zamir, S.W., Arora, A., Khan, S., Hayat, M., Khan, F.S., Yang, M.H.: Restormer: Efficient transformer for high-resolution image restoration. In: CVPR (2022)

work page 2022

[54] [54]

IEEE/CVF CVPR pp

Zhang, Q., Liu, X., Li, W., Chen, H., Liu, J., Hu, J., Xiong, Z., Yuan, C., Wang, Y.: Distilling semantic priors from sam to efficient image restoration models. IEEE/CVF CVPR pp. 25409–25419 (2024),https://api.semanticscholar.org/ CorpusID:268681086 18 Ruirui Lin, Guoxi Huang, David Bull, Nantheera Anantrasirichai

work page 2024

[55] [55]

In: IEEE/CVF CVPR (2018)

Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: IEEE/CVF CVPR (2018)

work page 2018

[56] [56]

Diffusion Transformers with Representation Autoencoders

Zheng, B., Ma, N., Tong, S., Xie, S.: Diffusion transformers with representation autoencoders. arXiv preprint arXiv:2510.11690 (2025)

work page internal anchor Pith review Pith/arXiv arXiv 2025

[57] [57]

In: Proceedings of the IEEE/CVF Winter conference on applications of computer vision

Zheng, S., Gupta, G.: Semantic-guided zero-shot learning for low-light image/video enhancement. In: Proceedings of the IEEE/CVF Winter conference on applications of computer vision. pp. 581–590 (2022)

work page 2022

[58] [58]

In: ACM MM (2024),https: //openreview.net/forum?id=oQahsz6vWe

Zou, W., Gao, H., Yang, W., Liu, T.: Wave-mamba: Wavelet state space model for ultra-high-definition low-light image enhancement. In: ACM MM (2024),https: //openreview.net/forum?id=oQahsz6vWe

work page 2024