pith. machine review for the scientific record.

arxiv: 2604.06161 · v2 · submitted 2026-04-07 · 💻 cs.CV · cs.AI · cs.GR

Recognition: 2 theorem links · Lean Theorem

DiffHDR: Re-Exposing LDR Videos with Video Diffusion Models

Authors on Pith · no claims yet

Pith reviewed 2026-05-10 18:58 UTC · model grok-4.3

classification 💻 cs.CV · cs.AI · cs.GR
keywords LDR-to-HDR conversion · video diffusion models · generative inpainting · radiance recovery · HDR video · temporal consistency · re-exposure · synthetic training data

The pith

Video diffusion models restore realistic high-dynamic-range radiance to standard low-dynamic-range videos by inpainting overexposed and underexposed regions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that LDR-to-HDR video conversion can be solved by treating missing highlight and shadow information as a generative inpainting problem inside a pretrained video diffusion model. Standard 8-bit videos lose much of the original scene radiance through saturation and quantization, which prevents accurate re-exposure or display on HDR screens. By working in Log-Gamma space and drawing on the model's learned spatio-temporal priors, the method synthesizes plausible radiance values where the input is clipped while keeping the measurable parts continuous. This produces temporally stable output that offers wide latitude for later brightness adjustments. The approach also incorporates optional text or image guidance and creates its own training pairs from static HDRI sources to compensate for scarce real paired data.
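The Log-Gamma color space is the hinge of the formulation: it must compress HDR radiance into a range a pretrained video VAE can encode faithfully. The paper's exact transform is not reproduced on this page; the sketch below shows one plausible construction of this family, a gamma curve over the LDR-covered range spliced continuously to a logarithmic segment for radiance above the clip point. All constants here (`gamma`, `l_max`, the splice at 1.0) are illustrative assumptions, not the authors' values.

```python
import numpy as np

def log_gamma_encode(radiance, gamma=2.2, l_max=100.0):
    """Illustrative Log-Gamma encoding (not the paper's exact transform).

    Linear radiance in [0, 1] (the range an LDR camera can capture) is
    gamma-encoded; radiance above 1 is compressed logarithmically so that
    highlights many stops brighter than the clip point still fit in a
    bounded code range suitable for a pretrained video VAE.
    """
    radiance = np.asarray(radiance, dtype=np.float64)
    below = np.clip(radiance, 0.0, 1.0) ** (1.0 / gamma)              # gamma segment
    above = 1.0 + np.log(np.maximum(radiance, 1.0)) / np.log(l_max)   # log segment
    encoded = np.where(radiance <= 1.0, below, above)
    return encoded / 2.0  # map the full [0, l_max] radiance range into [0, 1]

def log_gamma_decode(code, gamma=2.2, l_max=100.0):
    """Inverse of the illustrative encoding above."""
    code = np.asarray(code, dtype=np.float64) * 2.0
    below = np.clip(code, 0.0, 1.0) ** gamma
    above = np.exp((code - 1.0) * np.log(l_max))
    return np.where(code <= 1.0, below, above)

# Round trip: radiance 40x above the LDR clip point survives encoding.
x = np.array([0.05, 0.5, 1.0, 8.0, 40.0])
assert np.allclose(log_gamma_decode(log_gamma_encode(x)), x)
```

The design point such a transform illustrates: quantized but unclipped pixels keep a near-gamma mapping the diffusion backbone already understands, while clipped highlights get a bounded, invertible code the model can plausibly fill in.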

Core claim

DiffHDR formulates LDR-to-HDR conversion as a generative radiance inpainting task within the latent space of a video diffusion model. Operating in Log-Gamma color space, it leverages spatio-temporal generative priors from a pretrained video diffusion model to synthesize plausible HDR radiance in over- and underexposed regions while recovering the continuous scene radiance of the quantized pixels. The framework supports controllable conversion guided by text prompts or reference images and trains on synthetic HDR video data generated from static HDRI maps.

What carries the argument

Generative radiance inpainting task performed in the latent space of a pretrained video diffusion model, operating in Log-Gamma color space to leverage spatio-temporal priors.
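Mechanically, "radiance inpainting in latent space" means masking the clipped regions and, at each denoising step, re-imposing the trusted (unclipped) latents while the model synthesizes the masked ones. The sketch below is a generic masked-diffusion loop in the style of RePaint-like inpainting, with a stub standing in for the finetuned video model; it is a schematic of the mechanism, not the authors' implementation, and every numeric choice in it is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def stub_denoiser(z_t, t):
    """Stand-in for the finetuned video diffusion model's noise prediction."""
    return 0.1 * z_t  # a real model predicts noise from (z_t, t, conditioning)

def inpaint_latents(z_known, mask, steps=50):
    """Schematic masked-diffusion inpainting over video latents.

    z_known : (T, C, H, W) latents of the Log-Gamma-encoded input video
    mask    : same shape, 1 where radiance was clipped (to be synthesized),
              0 where the input pixel values are trusted
    """
    z = rng.standard_normal(z_known.shape)           # start from pure noise
    for t in np.linspace(1.0, 0.0, steps):
        eps = stub_denoiser(z, t)
        z = z - (1.0 / steps) * eps                  # one (schematic) update step
        # Re-impose the trusted latents at the current noise level so the
        # generated content stays consistent with the unclipped regions.
        z_ref = t * rng.standard_normal(z_known.shape) + (1 - t) * z_known
        z = mask * z + (1 - mask) * z_ref
    return z

latents = rng.standard_normal((8, 4, 16, 16))        # 8 frames of toy latents
clip_mask = (rng.random(latents.shape) > 0.8).astype(np.float64)
restored = inpaint_latents(latents, clip_mask)
print(restored.shape)  # (8, 4, 16, 16)
```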

If this is right

  • HDR videos exhibit higher radiance fidelity and greater temporal stability than outputs from prior LDR-to-HDR techniques.
  • The same model accepts text prompts or reference images to steer the appearance of restored highlights and shadows.
  • Generated videos retain enough dynamic range to support large brightness shifts during post-production re-exposure.
  • Synthetic training data derived from static HDRI maps enables effective learning despite the absence of large paired HDR video datasets; a minimal sketch of such a pipeline follows this list.
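For intuition on how paired video data can come from a single static HDRI: pan a virtual crop across the panorama to fabricate camera motion, keep the HDR crop as ground truth, then clip and quantize it to 8 bits to produce the LDR input. A minimal sketch under those assumptions (the crop path, exposure, and bit depth are invented for illustration; the paper's pipeline is likely more elaborate):

```python
import numpy as np

def synthesize_pair(hdri, frames=16, crop=64, exposure=1.0):
    """Fabricate an (LDR, HDR) training clip from one static HDRI panorama.

    A crop window pans horizontally across the panorama to simulate camera
    motion; the HDR crop is the target, and clipping + 8-bit quantization
    of the exposed crop yields the LDR input.
    """
    h, w, _ = hdri.shape
    xs = np.linspace(0, w - crop, frames).astype(int)
    hdr_clip = np.stack([hdri[:crop, x:x + crop] for x in xs])  # (T, crop, crop, 3)
    exposed = exposure * hdr_clip
    ldr = np.clip(exposed, 0.0, 1.0)                            # saturation
    ldr = np.round(ldr * 255.0) / 255.0                         # 8-bit quantization
    return ldr, hdr_clip

# Toy panorama with a bright "sun" region that saturates in the LDR clip.
pano = np.full((64, 256, 3), 0.2)
pano[10:20, 100:120] = 50.0
ldr_video, hdr_video = synthesize_pair(pano)
print(ldr_video.max(), hdr_video.max())  # 1.0 vs 50.0
```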

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The inpainting framing could transfer to other video restoration problems, such as removing motion blur or compression artifacts, through the same latent-space mechanism.
  • If the recovered radiance proves consistent across different diffusion backbones, the method might serve as a general post-capture upgrade layer for any existing video archive.
  • Real-time deployment would require distilling the diffusion steps or replacing them with a feed-forward network while preserving the same radiance accuracy.

Load-bearing premise

A video diffusion model trained on ordinary footage can reliably invent accurate bright and dark details for overexposed and underexposed areas without creating visible inconsistencies across frames.

What would settle it

Simultaneous capture of the same real-world scene with both a standard LDR camera and a calibrated HDR camera, followed by direct comparison of radiance values in the regions that were saturated or crushed in the LDR input.
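If such simultaneously captured footage existed, the decisive metric would be radiance error restricted to the pixels the LDR camera saturated or crushed. A minimal sketch of that comparison (log-domain relative error over the clipped mask; the thresholds are illustrative, not from the paper):

```python
import numpy as np

def clipped_region_error(hdr_pred, hdr_gt, ldr, lo=1/255, hi=254/255):
    """Log-domain radiance error restricted to pixels the LDR input lost.

    hdr_pred, hdr_gt : linear radiance, same shape
    ldr              : the 8-bit input in [0, 1]; pixels at or beyond the
                       thresholds are treated as crushed / saturated
    """
    mask = (ldr <= lo) | (ldr >= hi)
    if not mask.any():
        return 0.0
    eps = 1e-6
    err = np.abs(np.log(hdr_pred[mask] + eps) - np.log(hdr_gt[mask] + eps))
    return float(err.mean())

gt = np.array([0.001, 0.5, 30.0])
pred = np.array([0.002, 0.5, 20.0])
ldr = np.clip(gt, 0.0, 1.0)
print(clipped_region_error(pred, gt, ldr))  # error over the two clipped pixels
```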

Figures

Figures reproduced from arXiv: 2604.06161 by Dao Mi, Dmitriy Smirnov, Julien Philip, Leo Isikdogan, Li Ma, Mingming He, Ning Yu, Pablo Delgado, Pablo Salamanca, Paul Debevec, Wenping Wang, Xin Li, Yuancheng Xu, Zhengming Yu.

Figure 1
Figure 1: DiffHDR reconstructs lost radiance to convert LDR videos into faithful HDR while maintaining temporal coherence (top). DiffHDR further enables controllable HDR synthesis guided by text prompts or reference images, facilitating realistic hallucination of saturated regions (bottom). view at source ↗
Figure 2
Figure 2: Framework of DiffHDR. Given an input LDR video, we first detect its clipped regions and map it into the proposed Log-Gamma color space. A finetuned video diffusion model reconstructs missing radiance in over- and underexposed regions. A mask detector and context-focused prompting module support controllable detail synthesis. The final output HDR video supports faithful re-exposure, accurate reproduction of… view at source ↗
Figure 3
Figure 3: Qualitative comparison on the SI-HDR dataset. view at source ↗
Figure 4
Figure 4: Qualitative comparison on in-the-wild video dataset. view at source ↗
Figure 5
Figure 5: Comparison of different color mappings. We compare different color mapping methods by feeding the mapped images into the VAE and evaluating the reconstruction quality. We also visualize per-pixel error maps, where brighter regions indicate larger reconstruction errors. Our method achieves the best performance. view at source ↗
Figure 6
Figure 6: Ablation study on data augmentation strategy and mask detection. view at source ↗
Figure 7
Figure 7: Ablation study on context-focused prompting. view at source ↗
Figure 8
Figure 8: Text- and image-guided generation. DiffHDR supports both text and image controls for guiding the generation in reconstructed regions. view at source ↗
Figure 10
Figure 10: Across diverse scenes with challenging illumination conditions, DiffHDR… view at source ↗
Figure 9
Figure 9: Qualitative comparison on the SI-HDR dataset. view at source ↗
Figure 10
Figure 10: Qualitative comparison on the in-the-wild videos. view at source ↗
Figure 11
Figure 11: Effect of α in context-focused cross-attention. By adjusting the control coefficient α, our method enables controllable manipulation of the dynamic range in overexposed and underexposed regions. As α increases, the recovered radiance in the corresponding regions becomes progressively stronger, leading to larger dynamic range as illustrated by the intensity visualization. view at source ↗
Figure 12
Figure 12: Effect of finetuning the video VAE. The finetuned VAE produces smoother reconstructions and suppresses high-frequency details. The spectrum analysis on the right (radially averaged power spectrum) shows reduced high-frequency energy after finetuning, indicating over-smoothing of the representation. In contrast, the non-finetuned VAE preserves more high-frequency information and yields better reconstruction… view at source ↗
Figure 13
Figure 13: Banding artifacts caused by BF16 inference in the video VAE. view at source ↗
read the original abstract

Most digital videos are stored in 8-bit low dynamic range (LDR) formats, where much of the original high dynamic range (HDR) scene radiance is lost due to saturation and quantization. This loss of highlight and shadow detail precludes mapping accurate luminance to HDR displays and limits meaningful re-exposure in post-production workflows. Although techniques have been proposed to convert LDR images to HDR through dynamic range expansion, they struggle to restore realistic detail in the over- and underexposed regions. To address this, we present DiffHDR, a framework that formulates LDR-to-HDR conversion as a generative radiance inpainting task within the latent space of a video diffusion model. By operating in Log-Gamma color space, DiffHDR leverages spatio-temporal generative priors from a pretrained video diffusion model to synthesize plausible HDR radiance in over- and underexposed regions while recovering the continuous scene radiance of the quantized pixels. Our framework further enables controllable LDR-to-HDR video conversion guided by text prompts or reference images. To address the scarcity of paired HDR video data, we develop a pipeline that synthesizes high-quality HDR video training data from static HDRI maps. Extensive experiments demonstrate that DiffHDR significantly outperforms state-of-the-art approaches in radiance fidelity and temporal stability, producing realistic HDR videos with considerable latitude for re-exposure.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript introduces DiffHDR, a framework for LDR-to-HDR video conversion formulated as generative radiance inpainting in the latent space of a pretrained video diffusion model. Operating in Log-Gamma color space, it leverages spatio-temporal priors to synthesize plausible HDR radiance in over- and underexposed regions while recovering continuous scene radiance. A synthesis pipeline generates paired training data from static HDRI maps, and the method supports controllable conversion via text prompts or reference images. Extensive experiments are claimed to show significant outperformance over state-of-the-art methods in radiance fidelity and temporal stability.

Significance. If the results hold under real-world conditions, this approach could advance practical HDR video workflows by enabling high-quality re-exposure and detail recovery in post-production using generative video priors. The data synthesis strategy addresses a key scarcity issue, and the emphasis on temporal stability via diffusion models represents a timely application of recent generative techniques.

major comments (1)
  1. [§3.3] The training data synthesis pipeline generates all paired HDR video data by warping static HDRI environment maps onto video sequences. This construction cannot reproduce temporally correlated real-world effects such as moving specular highlights, object-motion-dependent cast shadows, or varying inter-reflections. Because the central claims of improved radiance fidelity and temporal stability depend on the video diffusion priors learning plausible inpainting from this data distribution, the absence of validation on real paired HDR-LDR video sequences or targeted ablations on dynamic lighting directly threatens the generalization of the reported gains.
minor comments (2)
  1. [Abstract] The abstract asserts outperformance without any numerical results or error metrics; moving key quantitative highlights (e.g., PSNR, temporal consistency scores) from the experiments section into the abstract would improve readability.
  2. [§3] Notation for the Log-Gamma color space transformation and the precise conditioning mechanism (text vs. reference image) should be defined explicitly with equations in §3 or §4 to avoid ambiguity for readers.
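For readers wondering what the requested definition might look like, a transform of this family can be written compactly. The following is illustrative only; the splice point, γ, and L_max are assumptions consistent with the code sketch earlier on this page, not the paper's actual constants:

```latex
% Illustrative Log-Gamma transform of linear radiance L (not the paper's formula):
% a gamma segment over the LDR-covered range [0, 1], spliced continuously
% to a logarithmic segment for radiance above the clip point.
\[
  \phi(L) \;=\;
  \begin{cases}
    \tfrac{1}{2}\, L^{1/\gamma}, & 0 \le L \le 1,\\[4pt]
    \tfrac{1}{2}\!\left(1 + \dfrac{\ln L}{\ln L_{\max}}\right), & 1 < L \le L_{\max},
  \end{cases}
  \qquad \gamma \approx 2.2,\ L_{\max}\ \text{a dataset-dependent ceiling.}
\]
```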

Simulated Author's Rebuttal

1 response · 1 unresolved

We thank the referee for the constructive comment on the data synthesis pipeline. We address the concern directly below and propose a partial revision to strengthen the manuscript.

read point-by-point responses
  1. Referee: [§3.3] The training data synthesis pipeline generates all paired HDR video data by warping static HDRI environment maps onto video sequences. This construction cannot reproduce temporally correlated real-world effects such as moving specular highlights, object-motion-dependent cast shadows, or varying inter-reflections. Because the central claims of improved radiance fidelity and temporal stability depend on the video diffusion priors learning plausible inpainting from this data distribution, the absence of validation on real paired HDR-LDR video sequences or targeted ablations on dynamic lighting directly threatens the generalization of the reported gains.

    Authors: We acknowledge that warping static HDRI maps onto video sequences cannot fully reproduce dynamic lighting effects such as moving specular highlights, motion-dependent cast shadows, or varying inter-reflections. This is an inherent limitation of the synthesis approach. However, the video diffusion model is pretrained on large-scale real-world video corpora that contain natural temporal lighting dynamics; by performing generative inpainting in latent space, the model leverages these priors to produce plausible, motion-consistent HDR radiance even when the paired training examples are synthesized from static maps. Our experiments evaluate on real LDR videos (qualitative results and user studies), where DiffHDR exhibits improved temporal stability and detail recovery over baselines. Real paired HDR-LDR video data remains scarce, which is why synthesis was necessary; we therefore cannot provide quantitative metrics on such pairs. We will revise the paper to add an explicit limitations paragraph on the data synthesis strategy, clarify the mitigating role of pretrained priors, and include any feasible targeted analysis of dynamic lighting consistency.

    revision: partial

standing simulated objections not resolved
  • Quantitative validation on real paired HDR-LDR video sequences, as no sufficiently large public dataset of this kind exists.

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper's core formulation treats LDR-to-HDR conversion as generative radiance inpainting inside a pretrained video diffusion model's latent space, with a separate synthetic data pipeline built from static HDRI maps. These elements are constructed independently of the final empirical claims. Performance assertions rest on comparative experiments for radiance fidelity and temporal stability rather than any quantity defined by fitting parameters to the target outputs or by self-referential equations. No load-bearing self-citations, uniqueness theorems, or ansatzes reduce the results to the inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based solely on the abstract, the central claim rests on the effectiveness of pretrained video diffusion model priors for plausible radiance synthesis and on the fidelity of the HDRI-based synthetic training data pipeline. No explicit free parameters, axioms, or invented entities are detailed in the provided text.

pith-pipeline@v0.9.0 · 5574 in / 1196 out tokens · 68678 ms · 2026-05-10T18:58:56.494212+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Generating HDR Video from SDR Video

    cs.CV · 2026-05 · unverdicted · novelty 7.0

    A multi-exposure video model predicts bracketed linear SDR sequences from single nonlinear SDR input, which a merging model combines into HDR video preserving shadow and highlight detail.

  2. Single-Shot HDR Recovery via a Video Diffusion Prior

    cs.CV · 2026-05 · unverdicted · novelty 7.0

    Single-shot HDR is achieved by conditioning a video diffusion model on an LDR input to generate an exposure bracket and fusing the bracket with per-pixel weights from a lightweight UNet.

Reference graph

Works this paper leans on

113 extracted references · 35 canonical work pages · cited by 2 Pith papers · 13 internal anchors
