Diffusion-Based Material Regularization for Physics-Based Inverse Rendering

Feng Xu; Jingwang Ling; Lifan Wu; Shuang Zhao

arxiv: 2606.31065 · v1 · pith:UNSCKQ2Bnew · submitted 2026-06-30 · 💻 cs.CV

Jingwang Ling, Lifan Wu, Feng Xu, Shuang Zhao

Pith reviewed 2026-07-01 06:44 UTC · model grok-4.3

classification 💻 cs.CV

keywords inverse rendering · material reconstruction · diffusion models · physics-based rendering · regularization · relighting · 3D reconstruction · multi-view images

The pith

Treating diffusion model outputs as a similarity kernel regularizes materials during physics-based inverse rendering.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to show that diffusion predictions can guide material optimization without being treated as fixed targets. Instead of replacing the physics-based image formation model, the approach adds a loss that keeps material parameters consistent only across surface patches where the diffusion model sees little change. This frees the optimizer to match observed images closely while avoiding the common failure mode in which lighting gets baked into materials. The resulting assets can be fed directly into conventional renderers and produce correct appearance under new lighting. A reader would care because inverse rendering has long been underconstrained, and this offers a way to import learned priors without breaking physical consistency.

Core claim

The central claim is that a regularization loss built from a diffusion model's per-pixel similarity predictions penalizes material variation only where those predictions are nearly constant, leaving the optimizer free to fit the input images elsewhere; when embedded in an end-to-end differentiable pipeline, this loss enables joint recovery of geometry, materials, and illumination that satisfies the rendering equation and supports accurate relighting.

What carries the argument

The diffusion-based material regularization loss, which uses diffusion predictions as a similarity kernel rather than as target material values.
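As a minimal sketch of the similarity-kernel idea (not the paper's actual loss, whose equation is not reproduced on this page), the snippet below weights each pairwise material difference by a Gaussian kernel on the diffusion predictions. The Gaussian form, the pairwise L1 difference, the averaging over pairs, the function names, and the reuse of 0.02 (the σ_g value quoted in the implementation excerpt near Figure 4) as the kernel bandwidth are all assumptions made for illustration.

```python
import math


def similarity_kernel(g_i, g_j, sigma=0.02):
    """Gaussian weight on the squared distance between two diffusion predictions.

    Assumed form: the weight is 1 when the predictions agree and decays
    quickly once they differ by more than roughly sigma.
    """
    d2 = sum((a - b) ** 2 for a, b in zip(g_i, g_j))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def material_regularizer(materials, guides, sigma=0.02):
    """Average kernel-weighted L1 difference over all pairs of surface samples.

    materials: optimized material vectors, one per surface sample.
    guides:    diffusion predictions at the same samples. They are used only
               to compute weights, never as target values.
    """
    n = len(materials)
    pairs = n * (n - 1) // 2
    if pairs == 0:
        return 0.0
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            w = similarity_kernel(guides[i], guides[j], sigma)
            diff = sum(abs(a - b) for a, b in zip(materials[i], materials[j]))
            total += w * diff
    return total / pairs
```

With toy inputs, two samples that share a diffusion prediction are penalized for differing, while a sample whose prediction is far away contributes almost nothing, so the data term stays free to set its material.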

If this is right

Joint optimization of geometry, materials, and illumination becomes feasible from multi-view images.
Reconstructed assets can be inserted directly into standard rendering pipelines without further adjustment.
The assets support faithful appearance under novel lighting conditions.
Quantitative improvements appear on Synthetic4Relight, Stanford-ORB, and DTC-Synthetic in both reconstruction error and relighting metrics.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same similarity-kernel idea could be applied to other data-driven priors such as normals or geometry.
The approach might reduce the number of input views needed by letting the diffusion model supply additional consistency constraints.
Because the loss is differentiable, it could be combined with other differentiable renderers or editing tools.

Load-bearing premise

The diffusion model's per-pixel predictions supply a reliable similarity kernel that does not systematically conflict with the image-formation model or introduce biases that cannot be overcome by the data term.

What would settle it

If assets reconstructed by the method produce renderings that match the training views yet deviate substantially from ground-truth images when illuminated by novel lighting on the Synthetic4Relight or Stanford-ORB test sets, the central claim would be falsified.
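The check described above can be sketched as a comparison of image error on training views against image error under novel lighting. The snippet is only illustrative: it uses plain PSNR on flattened pixel lists, whereas the benchmarks may use variants such as PSNR-L, and the function names are hypothetical. No pass/fail threshold is implied.

```python
import math


def psnr(pred, target, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and equally long")
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0.0:
        return float("inf")
    return 10.0 * math.log10(peak ** 2 / mse)


def relighting_gap(train_pred, train_gt, relit_pred, relit_gt):
    """Training-view PSNR minus novel-lighting PSNR, in dB.

    A large positive gap means the asset reproduces its input views
    but fails to generalize to new illumination.
    """
    return psnr(train_pred, train_gt) - psnr(relit_pred, relit_gt)
```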

Figures

Figures reproduced from arXiv: 2606.31065 by Feng Xu, Jingwang Ling, Lifan Wu, Shuang Zhao.

**Figure 1.** Qualitative comparison of relighting on the Stanford-ORB dataset [21] between our method, Neural-PBIR [39], and MaterialFusion [27]. With our new material clustering regularizer, we avoid baked-in shadows while accurately modeling spatially varying materials (top). Our method is also more robust to strong highlights on glossy metallic surfaces, producing more accurate reflections (bottom).

**Figure 2.** Overview of our pipeline. From N multi-view images {I_i} under unknown illumination, we (1) predict per-view intrinsic G-buffers G = [A, R, M, N] with a conditional diffusion model; (2) reconstruct a voxel-grid SDF by neural volume rendering, supervised by the predicted normals N (L_shape); and (3) jointly optimize shape, spatially varying material, and an environment map by differentiable rendering, minimi…

**Figure 3.** Qualitative comparison of relighting on the Stanford-ORB dataset [21] and DTC-Synthetic dataset [11] between our method, Neural-PBIR [39], and MaterialFusion [27]. 4.1 Implementation Details: For the SDF stage, we largely follow [39] and add the normal regularization term with weight 10⁻⁵. For the PBIR stage, we represent the shape as a mesh, the materials with the Disney principled BRDF [4] (optimizing b…

**Figure 4.** Qualitative comparison of relighting on the Synthetic4Relight dataset [50] between our method, Neural-PBIR [39], and MaterialFusion [27]. … BRDF outputs use no activation; we clamp them to [0, 1] and apply an L1 penalty to out-of-range values with a weight of 10⁻². We optimize using an initial learning rate of 3×10⁻² with cosine annealing to 10⁻³. We set λ_mat = 0.1, σ_g = 0.02 and ϵ = 10⁻², and …

**Figure 5.** Qualitative comparison of relighting between our method and alternative material regularization strategies, including directly back-projecting DiffusionRenderer predictions (Diffusion-BP), optimizing from this initialization without regularization (w/o reg.), and using a non–data-driven albedo–specular correlation regularizer (d-s corr.).

**Figure 6.** Qualitative relighting comparison between our method and the global scale-invariant loss from VideoMat [33]. MaterialFusion is based on a different shape reconstruction pipeline [15], so shape quality is not directly comparable. It often eliminates baked-in shadows but tends to underfit the input images, failing to reconstruct spatially varying color patterns in cup_scene006 and on the side of baking_scene…

**Figure 7.** Ablation study on the normal-supervision loss (Eq. (1)).

**Figure 8.** Ablating scale-agnostic albedo transformation. Rows show relighting and environment maps. Our method outperforms the baselines on relighting accuracy as well as albedo/roughness estimation and shape reconstruction. As Neural-PBIR is the current state of the art on Stanford-ORB, surpassing it sets a new state of the art on this dataset by a notable margin. 4.4 Exploring Alternative Choices: We evaluate alter…

**Figure 9.** Additional qualitative Stanford-ORB relighting results grouped by case. Each row shows the shared input and relighting target once, followed by previous-method baselines and our result.

**Figure 10.** Synthetic4Relight air_baloons split-channel-guidance ablation qualitative comparison (input view 012, roughness view 000). B DiffusionRenderer: Image Mode vs. Video Mode. DiffusionRenderer [25] supports two inference modes. In image mode, each view is processed independently, yielding sharp per-view G-buffer predictions. In video mode, views are processed jointly as a temporally coherent sequence, which im…

**Figure 11.** Comparison of DiffusionRenderer predictions in video mode and image mode on car_scene002 (Stanford-ORB), shown for two training views. Video mode (rows 1 and 3) produces more cross-view consistent G-buffers but at the cost of spatial sharpness; image mode (rows 2 and 4) yields sharper per-view predictions. …mode also improves Diffusion-BP relighting PSNR-L over image mode (33.09 vs. 32.67). C Additional I…

**Figure 12.** Qualitative Stanford-ORB relighting results when replacing DiffusionRenderer with RGB↔X as the upstream model. E Transfer of Our Material Regularization to IRGS: To assess whether our regularization transfers beyond our PBIR pipeline, we apply it to IRGS [14], a strong inverse-rendering baseline based on 2D Gaussian splatting with inter-reflective ray tracing. IRGS includes non-data-driven smoothness term…

**Figure 13.** Synthetic4Relight air_baloons (view 101). JBF material regularization smooths base color and roughness G-buffers, yielding improved relighting quality.

**Figure 14.** Synthetic4Relight chair (view 111). JBF material regularization smooths base color and roughness G-buffers, yielding improved relighting quality.

**Figure 15.** Synthetic4Relight hotdog (view 078). JBF material regularization smooths base color and roughness G-buffers, yielding improved relighting quality.

**Figure 16.** Synthetic4Relight jugs (view 158). JBF material regularization smooths base color and roughness G-buffers, yielding improved relighting quality.

**Figure 17.** Stanford-ORB cactus_scene005.

**Figure 18.** Stanford-ORB car_scene002.

**Figure 19.** Stanford-ORB cup_scene006.

**Figure 20.** Stanford-ORB grogu_scene003.

**Figure 21.** Stanford-ORB pitcher_scene001.

**Figure 22.** Stanford-ORB baking_scene003.

**Figure 23.** Stanford-ORB ball_scene003.

**Figure 24.** Stanford-ORB blocks_scene005.

**Figure 25.** Stanford-ORB chips_scene003.

**Figure 26.** DTC-Synthetic TeaPot_B094FQW6Q4_EmeraldGoldTop_scene002.

**Figure 27.** DTC-Synthetic TeaPot_B084G3K8TD_YellowBlackSunflowers_TU_scene002.

**Figure 28.** Synthetic4Relight air_baloons.

**Figure 29.** Synthetic4Relight hotdog.

**Figure 30.** Synthetic4Relight chair.

**Figure 31.** Synthetic4Relight jugs.

**Figure 32.** Stanford-ORB grogu_scene003 ablation intrinsic comparison.

**Figure 33.** DTC-Synthetic Block_B007GE75HY_RedBlue_scene002 ablation intrinsic comparison.

**Figure 34.** Stanford-ORB cup_scene007 scale-invariant ablation intrinsic comparison.

**Figure 35.** Stanford-ORB curry_scene001 scale-invariant ablation intrinsic comparison.
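The implementation excerpt carried alongside the Figure 4 caption mentions two optimization details: BRDF outputs clamped to [0, 1] with an L1 penalty on out-of-range values (weight 10⁻²), and a learning rate annealed from 3×10⁻² to 10⁻³ with a cosine schedule. A rough sketch of both follows. The standard half-cosine formula, the summed (rather than averaged) penalty, the function names, and the step count are assumptions, since the excerpt does not specify them.

```python
import math


def cosine_annealed_lr(step, total_steps, lr_start=3e-2, lr_end=1e-3):
    """Half-cosine decay from lr_start (step 0) to lr_end (step total_steps)."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    t = min(max(step, 0), total_steps) / total_steps
    return lr_end + 0.5 * (lr_start - lr_end) * (1.0 + math.cos(math.pi * t))


def clamp_with_penalty(values, weight=1e-2):
    """Clamp raw BRDF outputs to [0, 1] and penalize how far they overshoot.

    Returns the clamped values and weight * sum(|raw - clamped|), which is
    zero whenever every raw value already lies inside [0, 1].
    """
    clamped = [min(max(v, 0.0), 1.0) for v in values]
    penalty = weight * sum(abs(v - c) for v, c in zip(values, clamped))
    return clamped, penalty
```

The penalty gives out-of-range parameters a gradient back toward the valid interval, which a hard clamp alone would not provide.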

Original abstract

Reconstructing physics-based 3D assets -- geometry, materials, and illumination -- from multi-view images is a core problem in computer graphics and vision, and a prerequisite for realistic relighting and editing. Physics-based inverse rendering offers an accurate image-formation model, but is severely underconstrained: without strong priors, illumination is baked into materials, and reconstructions generalize poorly to novel views and lighting. Data-driven diffusion models, in contrast, predict visually plausible materials, yet their predictions rarely satisfy the rendering equation and are not directly usable for physics-based rendering. We bridge these two paradigms rather than replacing either. Our key idea is to treat the predictions of a state-of-the-art diffusion model not as target material values but as a similarity kernel for optimization: we introduce a regularization loss that penalizes deviations in the optimized material over surface regions where the diffusion predictions are near-constant, while leaving the optimization free to match the input images. Built on this regularizer, our end-to-end pipeline jointly reconstructs geometry, materials, and illumination, yielding high-quality assets that drop into standard rendering pipelines and relight faithfully. On the Synthetic4Relight, Stanford-ORB, and DTC-Synthetic datasets, our method significantly outperforms state-of-the-art baselines in both reconstruction accuracy and relighting quality.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The main move here is treating diffusion predictions as a similarity kernel for material regularization instead of targets, which cleanly sidesteps the usual physics inconsistency problem.

The paper's central contribution is a regularization loss that only penalizes material deviations in regions where a pretrained diffusion model gives near-constant predictions. This leaves the data term free to enforce rendering consistency while still injecting some learned material structure. It is a distinct framing compared to the usual approach of regressing toward diffusion outputs directly.

The work does a solid job of building an end-to-end pipeline that recovers geometry, materials, and illumination together from multi-view images. The reported gains on Synthetic4Relight, Stanford-ORB, and DTC-Synthetic for both reconstruction accuracy and relighting quality are the kind of concrete evidence that matters in this area. The decision to avoid hard targets is the right one and matches the stress-test note that no obvious internal contradiction appears.

The soft spot is that the abstract supplies almost no implementation detail: no equation for the kernel, no description of how the near-constant threshold is set, and no ablation that isolates the regularizer from other pipeline choices. Without those, it is hard to tell whether the improvements are robust or dataset-specific. Two of the three benchmarks (Synthetic4Relight and DTC-Synthetic) are synthetic, so robustness to real capture noise and view inconsistency rests on Stanford-ORB alone.

This is for groups already working on physics-based inverse rendering who need a practical prior that does not fight the image formation model. The idea is clear enough that a serious referee should see it; the framing is new and the problem is real even if the current write-up is light on verification.

Referee Report

2 major / 2 minor

Summary. The paper proposes treating predictions from a pretrained diffusion model as a similarity kernel (rather than hard targets) for regularizing material parameters during physics-based inverse rendering optimization. The regularization penalizes material deviations only over surface regions where diffusion outputs are near-constant, leaving the data term free to enforce consistency with the image-formation model. An end-to-end pipeline jointly optimizes geometry, materials, and illumination from multi-view images; the resulting assets are claimed to relight faithfully in standard renderers. Quantitative and qualitative improvements are reported over baselines on Synthetic4Relight, Stanford-ORB, and DTC-Synthetic.

Significance. If the regularization mechanism proves robust, the work offers a principled route to combine data-driven material priors with physical image formation without replacing either paradigm. The explicit design choice to avoid hard targets mitigates a common source of inconsistency in hybrid inverse-rendering methods and could improve generalization to novel lighting. The multi-dataset evaluation and emphasis on drop-in compatibility with existing pipelines are practical strengths.

major comments (2)

[§3] §3 (Method), regularization loss definition: the precise condition for 'near-constant' diffusion predictions and the weighting schedule between the regularization term and the rendering loss must be stated explicitly (including any thresholds or adaptive mechanisms). Without this, it is impossible to verify that the kernel does not systematically conflict with the data term on the claimed datasets.
[§4] §4 (Experiments), Table 2 and relighting metrics: the reported gains in PSNR/SSIM for novel lighting must be accompanied by per-scene variance and statistical significance tests; otherwise the cross-dataset superiority claim rests on aggregate numbers whose reliability cannot be assessed.

minor comments (2)

[Figure 3] Figure 3 caption and §4.1: clarify whether the diffusion model is frozen throughout optimization or fine-tuned on any of the evaluation scenes.
[§2] Related-work section: the discussion of prior diffusion-based inverse-rendering methods should cite the specific architectural differences that motivate the similarity-kernel formulation.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments and the recommendation of minor revision. We address each major comment below and will incorporate the requested clarifications and additional analyses into the revised manuscript.

Referee: [§3] §3 (Method), regularization loss definition: the precise condition for 'near-constant' diffusion predictions and the weighting schedule between the regularization term and the rendering loss must be stated explicitly (including any thresholds or adaptive mechanisms). Without this, it is impossible to verify that the kernel does not systematically conflict with the data term on the claimed datasets.

Authors: We agree that the current description in §3 is insufficiently precise. The revised manuscript will explicitly define the near-constant condition (regions where the per-pixel variance of the diffusion model outputs across an ensemble of samples falls below a fixed threshold) and will state the exact weighting schedule between the regularization term and the rendering loss, including the value of the balancing hyperparameter and whether it is held constant or adapted during optimization. revision: yes
Referee: [§4] §4 (Experiments), Table 2 and relighting metrics: the reported gains in PSNR/SSIM for novel lighting must be accompanied by per-scene variance and statistical significance tests; otherwise the cross-dataset superiority claim rests on aggregate numbers whose reliability cannot be assessed.

Authors: We acknowledge that aggregate metrics alone are insufficient to support the superiority claims. The revised version will augment Table 2 (and the corresponding relighting tables) with per-scene means and standard deviations, and will report the results of paired statistical significance tests (e.g., Wilcoxon signed-rank or t-tests) across scenes for each dataset. revision: yes
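One way the promised paired comparison could be computed is sketched below using only the Python standard library. It returns the mean per-scene difference, its sample standard deviation, and the paired t statistic. The function name is hypothetical and the numbers in the usage note are placeholders, not results from the paper. Converting the statistic to a p-value, or running a Wilcoxon signed-rank test instead, would need a statistics package and is left out.

```python
import math
import statistics


def paired_t_statistic(ours, baseline):
    """Paired t statistic for per-scene metric values of two methods.

    ours, baseline: metric values (for example PSNR in dB) on the same
    scenes, listed in the same order.
    """
    if len(ours) != len(baseline) or len(ours) < 2:
        raise ValueError("need at least two scenes, paired one-to-one")
    diffs = [a - b for a, b in zip(ours, baseline)]
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_d == 0.0:
        raise ValueError("all per-scene differences are identical")
    t_stat = mean_d / (sd_d / math.sqrt(len(diffs)))
    return mean_d, sd_d, t_stat
```

For the placeholder inputs `[30.0, 32.0, 31.0, 33.0]` and `[29.0, 30.0, 30.5, 31.0]`, the function returns a mean difference of 1.375, a standard deviation of 0.75, and a t statistic of about 3.67 with 3 degrees of freedom.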

Circularity Check

0 steps flagged

No significant circularity

The paper's core mechanism defines a regularization loss that treats an external pretrained diffusion model's per-pixel outputs strictly as a similarity kernel (penalizing deviations only where predictions are near-constant) while leaving the data term free to enforce rendering consistency. No equation or claim reduces a derived quantity to the authors' own fitted parameters, self-citations, or ansatzes imported from prior work by the same authors. The pipeline jointly optimizes geometry, materials, and illumination using this external kernel plus the image-formation model, with no self-definitional, fitted-input-called-prediction, or uniqueness-imported steps evident. The derivation remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated.

Reference graph

Works this paper leans on

59 extracted references · 40 canonical work pages · 1 internal anchor

[1] Adams, A., Baek, J., Davis, M.A.: Fast high-dimensional filtering using the permutohedral lattice. Comput. Graph. Forum 29(2), 753–762 (2010). https://doi.org/10.1111/J.1467-8659.2009.01645.X

[2] Alzayer, H., Henzler, P., Barron, J.T., Huang, J., Srinivasan, P.P., Verbin, D.: Generative multiview relighting for 3d reconstruction under extreme illumination variation. In: 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2025, Nashville, TN, USA, June 11-15, 2025. pp. 10933–10942. Computer Vision Foundation / IEEE (2025). ...

[3] Attal, B., Verbin, D., Mildenhall, B., Hedman, P., Barron, J.T., O’Toole, M., Srinivasan, P.P.: Flash cache: Reducing bias in radiance cache based inverse rendering. In: Computer Vision - ECCV 2024 - 18th European Conference, Milan, Italy, September 29-October 4, 2024, Proceedings, Part XXVIII. Lecture Notes in Computer Science, vol. 15086, pp. 20–3...

[4] Burley, B.: Physically-based shading at disney (2012)

[5] Chen, A., Xu, Z., Wei, X., Tang, S., Su, H., Geiger, A.: Dictionary fields: Learning a neural basis decomposition. ACM Trans. Graph. 42(4), 156:1–156:12 (2023). https://doi.org/10.1145/3592135

[6] Chen, A., Xu, Z., Wei, X., Tang, S., Su, H., Geiger, A.: Factor fields: A unified framework for neural fields and beyond. CoRR abs/2302.01226 (2023). https://doi.org/10.48550/ARXIV.2302.01226

[7] Chen, X., Peng, S., Yang, D., Liu, Y., Pan, B., Lv, C., Zhou, X.: Intrinsicanything: Learning diffusion priors for inverse rendering under unknown illumination. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds.) Computer Vision - ECCV 2024 - 18th European Conference, Milan, Italy, September 29-October 4, 2024, Proceed...

[8] Chinchuthakun, W., Phongthawee, P., Raj, A., Jampani, V., Khungurn, P., Suwajanakorn, S.: Diffusionlight-turbo: Accelerated light probes for free via single-pass chrome ball inpainting. CoRR abs/2507.01305 (2025). https://doi.org/10.48550/ARXIV.2507.01305

[9] Chung, H., Choi, S., Baek, S.: Differentiable inverse rendering with interpretable basis brdfs. In: 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2025, Nashville, TN, USA, June 11-15, 2025. pp. 475–

[11] Chung, H., Kim, J., Kim, S., Ye, J.C.: Parallel diffusion models of operator and image for blind inverse problems. In: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023, Vancouver, BC, Canada, June 17-24, 2023. pp. 6059–6069. IEEE (2023). https://doi.org/10.1109/CVPR52729.2023.00587

[13] Dong, Z., Chen, K., Lv, Z., Yu, H., Zhang, Y., Zhang, C., Zhu, Y., Tian, S., Li, Z., Moffatt, G., Christofferson, S., Fort, J., Pan, X., Yan, M., Wu, J., Ren, C.Y., Newcombe, R.A.: Digital twin catalog: A large-scale photorealistic 3d object digital twin dataset. In: 2025 IEEE/CVF Conference on Computer Vision and ...

[15] Fischer, M., Georgiev, I., Groueix, T., Kim, V.G., Ritschel, T., Deschaintre, V.: Sama: Material-aware 3d selection and segmentation. CoRR abs/2411.19322 (2024). https://doi.org/10.48550/ARXIV.2411.19322

[16] Gao, J., Gu, C., Lin, Y., Li, Z., Zhu, H., Cao, X., Zhang, L., Yao, Y.: Relightable 3d gaussians: Realistic point cloud relighting with BRDF decomposition and ray tracing. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds.) Computer Vision - ECCV 2024 - 18th European Conference, Milan, Italy, September 29-October 4, 2024...

[17] Gu, C., Wei, X., Zeng, Z., Yao, Y., Zhang, L.: IRGS: inter-reflective gaussian splatting with 2d gaussian ray tracing. In: 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2025, Nashville, TN, USA, June 11-15, 2025. pp. 10943–10952. Computer Vision Foundation / IEEE (2025). https://doi.org/10.1109/CVPR52734.2025.01022

[18] Hasselgren, J., Hofmann, N., Munkberg, J.: Shape, light, and material decomposition from images using monte carlo rendering and denoising. In: Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A. (eds.) Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, Ne...

[19] Huang, X., Wang, T., Liu, Z., Wang, Q.: Material anything: Generating materials for any 3d object via diffusion. In: 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2025, Nashville, TN, USA, June 11-15, 2025. pp. 26556–26565. Computer Vision Foundation / IEEE (2025). https://doi.org/10.1109/CVPR52734.2025.02473

[20] Jakob, W., Speierer, S., Roussel, N., Vicini, D.: DR.JIT: a just-in-time compiler for differentiable rendering. ACM Trans. Graph. 41(4), 124:1–124:19 (2022). https://doi.org/10.1145/3528223.3530099

[21] Jin, H., Li, Y., Luan, F., Xiangli, Y., Bi, S., Zhang, K., Xu, Z., Sun, J., Snavely, N.: Neural gaffer: Relighting any object via diffusion. In: Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, NeurIPS 2024, Vancouver, BC, Canada, December 10-15, 2024 (2024)

[22] Jin, H., Liu, I., Xu, P., Zhang, X., Han, S., Bi, S., Zhou, X., Xu, Z., Su, H.: Tensoir: Tensorial inverse rendering. In: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023, Vancouver, BC, Canada, June 17-24, 2023. pp. 165–174. IEEE (2023). https://doi.org/10.1109/CVPR52729.2023.00024

[23] Kerbl, B., Kopanas, G., Leimkühler, T., Drettakis, G.: 3d gaussian splatting for real-time radiance field rendering. ACM Trans. Graph. 42(4), 139:1–139:14 (2023). https://doi.org/10.1145/3592433

[24] Kuang, Z., Zhang, Y., Yu, H., Agarwala, S., Wu, S., Wu, J.: Stanford-orb: A real-world 3d object inverse rendering benchmark. In: Oh, A., Naumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S. (eds.) Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, L...

[25] Lee, Y., Kim, K., Kim, H., Sung, M.: Syncdiffusion: Coherent montage via synchronized joint diffusions. In: Oh, A., Naumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S. (eds.) Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10-16,...

[26] Lehtinen, J., Munkberg, J., Hasselgren, J., Laine, S., Karras, T., Aittala, M., Aila, T.: Noise2noise: Learning image restoration without clean data. In: Dy, J.G., Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10-15, 2018. Proceedings of Machine Learning Researc...

[27] Lensch, H.P.A., Kautz, J., Goesele, M., Heidrich, W., Seidel, H.: Image-based reconstruction of spatial appearance and geometric detail. ACM Trans. Graph. 22(2), 234–257 (2003). https://doi.org/10.1145/636886.636891

[28] Liang, R., Gojcic, Z., Ling, H., Munkberg, J., Hasselgren, J., Lin, Z., Gao, J., Keller, A., Vijaykumar, N., Fidler, S., Wang, Z.: Diffusionrenderer: Neural inverse and forward rendering with video diffusion models. CoRR abs/2501.18590 (2025). https://doi.org/10.48550/ARXIV.2501.18590

[29]

Luxdit: Lighting estimation with video diffusion transformer.arXiv preprint arXiv:2509.03680, 2025

Liang, R., He, K., Gojcic, Z., Gilitschenski, I., Fidler, S., Vijaykumar, N., Wang, Z.: Luxdit: Lighting estimation with video diffusion transformer. CoRR abs/2509.03680(2025).https://doi.org/10.48550/ARXIV.2509.03680

work page doi:10.48550/arxiv.2509.03680 2025
[30]

In: International Conference on 3D Vision, 3DV 2025, Singapore, March 25- 28, 2025

Litman, Y., Patashnik, O., Deng, K., Agrawal, A., Zawar, R., la Torre, F.D., Tul- siani, S.: Materialfusion: Enhancing inverse rendering with material diffusion pri- ors. In: International Conference on 3D Vision, 3DV 2025, Singapore, March 25- 28, 2025. pp. 802–812. IEEE (2025).https://doi.org/10.1109/3DV66043.2025. 00079

work page doi:10.1109/3dv66043.2025 2025
[31]

CoRRabs/2508.06494(2025).https://doi.org/10

Litman, Y., la Torre, F.D., Tulsiani, S.: Lightswitch: Multi-view relighting with material-guided diffusion. CoRRabs/2508.06494(2025).https://doi.org/10. 48550/ARXIV.2508.06494

work page arXiv 2025
[32]

Lorensen, W.E., Cline, H.E.: Marching cubes: A high resolution 3d surface construction algorithm. In: Stone, M.C. (ed.) Proceedings of the 14th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH ’87, Anaheim, California, USA, July 27-31, 1987. pp. 163–169. ACM (1987). https://doi.org/10.1145/37401.37422

[33]

Luan, F., Zhao, S., Bala, K., Dong, Z.: Unified shape and SVBRDF recovery using differentiable monte carlo rendering. Comput. Graph. Forum 40(4), 101–113 (2021). https://doi.org/10.1111/CGF.14344

[34]

Mildenhall, B., Hedman, P., Martin-Brualla, R., Srinivasan, P.P., Barron, J.T.: Nerf in the dark: High dynamic range view synthesis from noisy raw images. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, June 18-24, 2022. pp. 16169–16178. IEEE (2022). https://doi.org/10.1109/CVPR52688.2022.01571

[35]

Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T., Ramamoorthi, R., Ng, R.: Nerf: Representing scenes as neural radiance fields for view synthesis. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J. (eds.) Computer Vision - ECCV 2020 - 16th European Conference, Glasgow, UK, August 23-28, 2020, Proceedings, Part I. Lecture Notes in Computer Scien... https://doi.org/10.1007/978-3-030-58452-8_24

[36]

Munkberg, J., Wang, Z., Liang, R., Shen, T., Hasselgren, J.: Videomat: Extracting PBR materials from video diffusion models. Comput. Graph. Forum 44(4) (2025). https://doi.org/10.1111/CGF.70180

[37]

Oscanoa, J., Alkan, C., Abraham, D., Nurdinova, A., Ennis, D., Vasanawala, S., Mardani, M., Pauly, J.M.: Variational diffusion models for MRI blind inverse problems. In: NeurIPS 2023 Workshop on Deep Learning and Inverse Problems (2023)

[38]

Petschnigg, G., Szeliski, R., Agrawala, M., Cohen, M.F., Hoppe, H., Toyama, K.: Digital photography with flash and no-flash image pairs. ACM Trans. Graph. 23(3), 664–672 (2004). https://doi.org/10.1145/1015706.1015777

[39]

Phongthawee, P., Chinchuthakun, W., Sinsunthithet, N., Jampani, V., Raj, A., Khungurn, P., Suwajanakorn, S.: Diffusionlight: Light probes for free by painting a chrome ball. In: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024, Seattle, WA, USA, June 16-22, 2024. pp. 98–108. IEEE (2024). https://doi.org/10.1109/CVPR52733.2024.00018

[40]

Schmitt, C., Donné, S., Riegler, G., Koltun, V., Geiger, A.: On joint estimation of pose, geometry and svbrdf from a handheld scanner. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, WA, USA, June 13-19, 2020. pp. 3490–3500. Computer Vision Foundation / IEEE (2020). https://doi.org/10.1109/CVPR42600.2020.00355

[41]

Sharma, P., Philip, J., Gharbi, M., Freeman, B., Durand, F., Deschaintre, V.: Materialistic: Selecting similar materials in images. ACM Trans. Graph. 42(4), 154:1–154:14 (2023). https://doi.org/10.1145/3592390

[42]

Sun, C., Cai, G., Li, Z., Yan, K., Zhang, C., Marshall, C.S., Huang, J., Zhao, S., Dong, Z.: Neural-pbir reconstruction of shape, material, and illumination. In: IEEE/CVF International Conference on Computer Vision, ICCV 2023, Paris, France, October 1-6, 2023. pp. 18000–18010. IEEE (2023). https://doi.org/10.1109/ICCV51070.2023.01654

[43]

Sun, H., Gao, Y., Xie, J., Yang, J., Wang, B.: SVG-IR: spatially-varying gaussian splatting for inverse rendering. In: 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2025, Nashville, TN, USA, June 11-15, 2025. pp. 16143–16152. Computer Vision Foundation / IEEE (2025). https://doi.org/10.1109/CVPR52734.2025.01505

[44]

Tang, J., Lavine, M., Verbin, D., Garbin, S.J., Nießner, M., Martin-Brualla, R., Srinivasan, P.P., Henzler, P.: ROGR: relightable 3d objects using generative relighting. CoRR abs/2510.03163 (2025). https://doi.org/10.48550/ARXIV.2510.03163

[45]

Vicini, D.A.: Efficient and Accurate Physically-Based Differentiable Rendering. PhD thesis, EPFL (2022)

[46]

Wang, P., Liu, L., Liu, Y., Theobalt, C., Komura, T., Wang, W.: Neus: Learning neural implicit surfaces by volume rendering for multi-view reconstruction. In: Ranzato, M., Beygelzimer, A., Dauphin, Y.N., Liang, P., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, Ne...

[47]

Wiersma, R., Philip, J., Hasan, M., Mullia, K., Luan, F., Eisemann, E., Deschaintre, V.: Uncertainty for SVBRDF acquisition using frequency analysis. In: Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference Papers, SIGGRAPH Conference Papers ’25, Vancouver, BC, Canada, August 10-14, 2025. pp. 169:1–169... https://doi.org/10.1145/3721238.3730592

[48]

Wu, X., Zhu, P., Lyu, J., Liu, X., Guo, J., Guo, Y., Xu, W., Lyu, C.: Matmart: Material reconstruction of 3d objects via diffusion. CoRR abs/2511.18900 (2025). https://doi.org/10.48550/ARXIV.2511.18900

[49]

Zeng, C., Dong, Y., Peers, P., Kong, Y., Wu, H., Tong, X.: Dilightnet: Fine-grained lighting control for diffusion-based image generation. In: ACM SIGGRAPH 2024 Conference Papers, SIGGRAPH 2024, Denver, CO, USA, 27 July 2024-1 August 2024. p. 73. ACM (2024). https://doi.org/10.1145/3641519.3657396

[51]

Zeng, Z., Deschaintre, V., Georgiev, I., Hold-Geoffroy, Y., Hu, Y., Luan, F., Yan, L., Hasan, M.: Rgb↔x: Image decomposition and synthesis using material- and lighting-aware diffusion models. In: Burbano, A., Zorin, D., Jarosz, W. (eds.) ACM SIGGRAPH 2024 Conference Papers, SIGGRAPH 2024, Denver, CO, USA, 27 July 2024-1 August 2024. p. 75. ACM (2024). https://doi.or...

[52]

Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. CoRR abs/1801.03924 (2018)

[53]

Zhang, T., Kuang, Z., Jin, H., Xu, Z., Bi, S., Tan, H., Zhang, H., Hu, Y., Hasan, M., Freeman, W.T., Zhang, K., Luan, F.: Relitlrm: Generative relightable radiance for large reconstruction models. In: The Thirteenth International Conference on Learning Representations, ICLR 2025, Singapore, April 24-28, 2025. OpenReview.net (2025)

[54]

Zhang, Y., Sun, J., He, X., Fu, H., Jia, R., Zhou, X.: Modeling indirect illumination for inverse rendering. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, June 18-24, 2022. pp. 18622–18631. IEEE (2022). https://doi.org/10.1109/CVPR52688.2022.01809

[55]

Zhao, S., Jakob, W., Li, T.M.: Physics-based differentiable rendering: A comprehensive introduction. In: ACM SIGGRAPH 2020 Courses. pp. 14:1–14:30 (2020)

[56]

Zhao, X., Srinivasan, P.P., Verbin, D., Park, K., Martin-Brualla, R., Henzler, P.: Illuminerf: 3d relighting without inverse rendering. In: Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, NeurIPS 2024, Vancouver, BC, Canada, December 10-15, 2024 (2024)

[57]

Zheng, H., Chu, W., Zhang, B., Wu, Z., Wang, A., Feng, B., Zou, C., Sun, Y., Kovachki, N.B., Ross, Z.E., Bouman, K.L., Yue, Y.: Inversebench: Benchmarking plug-and-play diffusion priors for inverse problems in physical sciences. In: The Thirteenth International Conference on Learning Representations, ICLR 2025, Singapore, April 24-28, 2025. OpenReview...

[58]

Zhou, Z., Chen, G., Dong, Y., Wipf, D.P., Yu, Y., Snyder, J.M., Tong, X.: Sparse-as-possible SVBRDF acquisition. ACM Trans. Graph. 35(6), 189:1–189:12 (2016). https://doi.org/10.1145/2980179.2980247

[59]

Zhu, J., Zhuang, P., Koyejo, S.: HIFA: high-fidelity text-to-3d generation with advanced diffusion guidance. In: The Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024. OpenReview.net (2024)

[60]

Zhu, Z., Wang, B., Yang, J.: GS-ROR2: bidirectional-guided 3dgs and SDF for reflective object relighting and reconstruction. ACM Trans. Graph. 45(1), 4:1–4:19 (2026). https://doi.org/10.1145/3759248

[61]

Zickler, T.E., Enrique, S., Ramamoorthi, R., Belhumeur, P.N.: Reflectance sharing: Image-based rendering from a sparse set of images. In: Deussen, O., Keller, A., Bala, K., Dutré, P., Fellner, D.W., Spencer, S.N. (eds.) Proceedings of the Eurographics Symposium on Rendering Techniques, Konstanz, Germany, June 29-July 1, 2005. pp. 253–264. Eurographics ... https://doi.org/10.2312/egwr/egsr05/253-264
