Pith · machine review for the scientific record

arXiv: 2605.09688 · v1 · submitted 2026-05-10 · 💻 cs.CV


ConFixGS: Learning to Fix Feedforward 3D Gaussian Splatting with Confidence-Aware Diffusion Priors in Driving Scenes

Jiaqi Ma, Markus Gross, Olaf Wysocki, Rui Song, Tianhui Cai, Xingcheng Zhou, Zewei Zhou, Zhiyu Huang


Pith reviewed 2026-05-12 03:31 UTC · model grok-4.3

classification 💻 cs.CV
keywords: 3D Gaussian Splatting · diffusion priors · novel view synthesis · driving scenes · confidence maps · feedforward reconstruction · sparse views · reprojection consistency

The pith

ConFixGS repairs feedforward 3D Gaussian Splatting in driving scenes by validating diffusion enhancements against support-view consistency.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents ConFixGS as a plug-and-play fix for feedforward 3D Gaussian Splatting models that produce poor results when only sparse trajectory views are available in driving data. It generates local pseudo-targets enhanced by diffusion models and then applies reprojection cross-checking from neighboring support views to create dense confidence maps. These maps direct which enhanced details to incorporate during refinement, keeping consistent content while discarding hallucinations. A sympathetic reader would care because accurate novel view synthesis from limited moving-camera feeds directly affects the reliability of 3D scene understanding in real-world navigation tasks.
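The paper describes this pipeline only at this level of detail. As a minimal, illustrative reading of how a reprojection cross-check could score each diffusion-enhanced pixel against support views, the sketch below assumes known intrinsics, camera-to-world poses, and a depth map rendered from the initial 3DGS; all function and variable names are ours, not the authors'.

```python
import numpy as np

def reprojection_confidence(enhanced, depth, K, T_tgt,
                            support_imgs, support_poses, tau=0.1):
    """Toy confidence map: warp each pixel of the diffusion-enhanced
    pseudo-target into every support view via the rendered depth and
    score it by photometric agreement. Pixels that no support view can
    corroborate receive low confidence. Shapes: enhanced (H, W, 3) in
    [0, 1], depth (H, W), K (3, 3); poses are 4x4 world-from-camera."""
    H, W, _ = enhanced.shape
    v, u = np.mgrid[0:H, 0:W]
    pix = np.stack([u, v, np.ones_like(u)], -1).reshape(-1, 3).T  # 3 x HW
    # Back-project target pixels to world space using the rendered depth.
    cam_pts = (np.linalg.inv(K) @ pix) * depth.reshape(1, -1)
    world = T_tgt @ np.vstack([cam_pts, np.ones((1, cam_pts.shape[1]))])
    conf = np.zeros(H * W)
    for img, T_sup in zip(support_imgs, support_poses):
        # Project the world points into the support camera.
        cam = (np.linalg.inv(T_sup) @ world)[:3]
        z = np.clip(cam[2], 1e-6, None)
        uv = (K @ (cam / z))[:2]
        ui = np.clip(np.round(uv[0]).astype(int), 0, W - 1)
        vi = np.clip(np.round(uv[1]).astype(int), 0, H - 1)
        err = np.abs(img[vi, ui] - enhanced.reshape(-1, 3)).mean(-1)
        valid = (cam[2] > 0).astype(float)  # ignore points behind the camera
        conf += valid * np.exp(-err / tau)  # high agreement -> high confidence
    return (conf / len(support_imgs)).reshape(H, W)
```

Occlusions, depth-test gating, and sub-pixel sampling would all matter in a real implementation; this sketch only shows the cross-checking idea itself.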

Core claim

ConFixGS begins with a pretrained feedforward 3DGS model, produces diffusion-enhanced local pseudo-targets, and validates them through reprojection-based cross-checking against support views to build dense confidence maps. The maps then guide refinement so that reliable details from the priors are kept while hallucinated or inconsistent evidence is suppressed. On Waymo, nuScenes, and KITTI, this yields improved novel view synthesis, with PSNR gains of up to 3.68 dB and FID reduced by nearly half.

What carries the argument

The confidence-aware fusion pipeline that creates diffusion-enhanced pseudo-targets and filters them via reprojection cross-checking to produce dense maps that control refinement.
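As a rough illustration of what "maps that control refinement" could mean in practice, here is a hedged sketch of a confidence-modulated photometric objective; the mixing weight `lam` and the small unweighted term are our assumptions, not details taken from the paper.

```python
import torch

def confidence_weighted_loss(rendered, pseudo_target, confidence, lam=0.8):
    """Hypothetical confidence-modulated repair objective. Pixels whose
    diffusion-enhanced pseudo-target is corroborated by support views
    (confidence ~ 1) pull the rendering toward the enhanced target;
    uncorroborated pixels (confidence ~ 0) contribute little, so
    hallucinated detail cannot dominate the refinement. Tensors are
    (B, 3, H, W), except confidence, which is (B, 1, H, W)."""
    l1 = (rendered - pseudo_target).abs()
    weighted = (confidence * l1).mean()
    # A small unweighted term keeps gradients alive in low-confidence
    # regions instead of freezing them entirely (our assumption; the
    # abstract does not specify this).
    return lam * weighted + (1 - lam) * l1.mean()
```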

If this is right

  • Feedforward 3DGS models become usable for challenging sparse-view driving reconstructions without per-scene optimization.
  • Diffusion priors can be safely integrated into geometric reconstruction pipelines when filtered by view-consistency checks.
  • Novel view synthesis quality improves measurably on standard autonomous-driving benchmarks.
  • The same confidence-guided principle can be applied to other generative priors beyond diffusion.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach may generalize to indoor or handheld sparse-view settings if the cross-checking remains robust to different motion patterns.
  • Reducing hallucinations in this way could lower the camera density required for acceptable 3D driving maps.
  • If the refinement step is made efficient, the method could support online map updates from vehicle fleets.

Load-bearing premise

Reprojection cross-checking against support views can reliably separate useful diffusion-enhanced details from hallucinated or inconsistent content in trajectory-based sparse-view driving scenes.

What would settle it

Running the refinement step without the confidence maps and measuring whether PSNR on held-out novel views drops below the reported gains or whether visual artifacts increase in regions the maps previously down-weighted.
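A minimal harness for that ablation might look as follows; `refine` and `render` are hypothetical stand-ins for the paper's (unreleased) refinement and rendering steps.

```python
import numpy as np

def psnr(img, ref):
    """PSNR in dB for images in [0, 1]."""
    mse = np.mean((img - ref) ** 2)
    return float("inf") if mse == 0 else -10.0 * np.log10(mse)

def ablate_confidence(scene, refine, render, heldout_views):
    """Compare refinement with real confidence maps against refinement
    with the maps forced to all-ones (i.e., trusting every enhanced
    pixel equally)."""
    full = refine(scene, use_confidence=True)
    flat = refine(scene, use_confidence=False)  # confidence == 1 everywhere
    gains = []
    for view in heldout_views:
        gains.append(psnr(render(full, view.pose), view.image) -
                     psnr(render(flat, view.pose), view.image))
    return np.mean(gains)  # > 0 would support the load-bearing premise
```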

Figures

Figures reproduced from arXiv: 2605.09688 by Jiaqi Ma, Markus Gross, Olaf Wysocki, Rui Song, Tianhui Cai, Xingcheng Zhou, Zewei Zhou, Zhiyu Huang.

Figure 1. ConFixGS: a plug-and-play repair of feedforward 3DGS in sparse driving scenes. Left: the confidence-guided method enhances state-of-the-art feedforward backbones, yielding better novel view rendering and consistent gains across metrics. Right: the repaired 3D Gaussians generalize well beyond the original camera trajectory, supporting novel view synthesis under large lateral offsets and elevated drone-like …
Figure 2. ConFixGS Framework Overview. ConFixGS consists of three stages: (i) initial feedforward reconstruction with local pseudo-view episodes, (ii) input-observation-guided confidence estimation through reprojection, and (iii) confidence-modulated global repair optimization. (Body text spilled into this caption notes that $c_i \in \mathbb{R}^3$ is the RGB color and that the repair stage adopts the standard logit/log reparameterization $\hat{o}_i = \sigma^{-1}(o_i)$, $\hat{s}_i = \log s_i$.)
Figure 3. Visual ablation study. The effects of individual components are visualized by removing one module at a time, with WorldMirror [71] as the feedforward backbone.
Figure 4. Visual comparison. DepthSplat [69] is used as the feedforward backbone; more results are shown in Figs. 7 and 8, with additional backbone comparisons in Figs. 9, 10, and 11.
Figure 5. Comparison of global and local feedforward 3DGS rendering on Waymo, nuScenes, and KITTI. For each scene, the global feedforward (FW) rendering of G0, which has to reconstruct the entire trajectory from sparse, weakly overlapping support views, is compared against the local FW rendering produced from a small subset around each novel view, together with the diffusion-enhanced version. Across all three datasets, …
Figure 6. Additional novel view synthesis results under more challenging viewpoint shifts, including lateral offsets of 1 m and 3 m, as well as a drone-style viewpoint at approximately 2.5 m height with a 20° downward pitch.
Figure 7. Qualitative comparison of DepthSplat [69] before and after ConFixGS enhancement – I.
Figure 8. Qualitative comparison of DepthSplat [69] before and after ConFixGS enhancement – II.
Figure 9. Qualitative comparison of WorldMirror [71] before and after ConFixGS enhancement.
Figure 10. Qualitative comparison of AnySplat [70] before and after ConFixGS enhancement.
Figure 11. Qualitative comparison of DrivingForward [72] before and after ConFixGS enhancement.
Figure 12. Qualitative comparison with Difix3D+ [17]. DepthSplat [69] is used as the feedforward backbone.
Original abstract

Feedforward 3D Gaussian Splatting (3DGS) often struggles in trajectory-based sparse-view driving scenes. Existing Gaussian repair methods mainly target optimization-based 3DGS, while diffusion-based repair is typically restricted to iterative refinement near observed viewpoints, leaving feedforward 3DGS repair underexplored. We propose ConFixGS, a plug-and-play method that learns to fix feedforward 3DGS with confidence-aware diffusion priors. Starting from a pretrained feedforward model, ConFixGS generates diffusion-enhanced local pseudo-targets and validates them through reprojection-based cross-checking against support views. The resulting dense confidence maps guide refinement, enhancing reliable details while suppressing hallucinated or inconsistent evidence. On Waymo, nuScenes, and KITTI, ConFixGS improves challenging novel view synthesis, with PSNR gains of up to 3.68 dB and FID reduced by nearly half. Our results highlight confidence-aware fusion of generative priors and support-view consistency as a key principle for robust feedforward 3D driving scene reconstruction.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript introduces ConFixGS, a plug-and-play refinement method for feedforward 3D Gaussian Splatting in trajectory-based sparse-view driving scenes. It generates diffusion-enhanced local pseudo-targets from pretrained diffusion models and validates them via reprojection-based cross-checking against support views to produce dense confidence maps. These maps then guide refinement of the initial 3DGS output, aiming to retain reliable details while suppressing hallucinations. The authors report quantitative gains on Waymo, nuScenes, and KITTI, with PSNR improvements up to 3.68 dB and FID reduced by nearly half for novel view synthesis.

Significance. If the confidence-aware filtering step proves reliable, the work would offer a practical advance for feedforward 3D reconstruction in autonomous driving by integrating generative priors without per-scene optimization. The focus on geometric consistency checks to control diffusion outputs addresses a relevant limitation in sparse-view settings and could influence hybrid reconstruction pipelines. The reported metrics indicate potential utility, though the absence of supporting implementation details and targeted validation limits immediate assessment of broader impact.

major comments (1)
  1. [Method (confidence map generation and refinement)] The core claim rests on the reprojection cross-checking step (described in the method overview and confidence map generation) reliably distinguishing useful diffusion-enhanced content from hallucinations. In the low-parallax, near-collinear trajectory regimes of the evaluated datasets, this check may fail to expose geometrically coherent but incorrect syntheses (e.g., fabricated lane markings or foliage) that reproject consistently across the limited support views. This is load-bearing for the reported PSNR/FID gains, as high-confidence erroneous regions would be incorporated into the refined 3DGS. The manuscript provides no dedicated analysis, ablation on baseline distance, or controlled hallucination tests to substantiate the filtering efficacy. (A back-of-envelope illustration of this low-parallax insensitivity follows the minor comments.)
minor comments (2)
  1. [Abstract] The abstract states concrete PSNR and FID numbers without specifying the exact baseline feedforward model, dataset splits, or comparison methods used to compute the 'up to 3.68 dB' gain.
  2. [Method and Experiments] No implementation details, diffusion model architecture, or training procedure for the confidence predictor are provided, which hinders reproducibility of the plug-and-play claim.
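To make the referee's low-parallax point concrete: under a pinhole model, a depth error ΔZ at depth Z displaces a reprojected pixel by roughly f·B·ΔZ/Z² for baseline B and focal length f, so small baselines make the cross-check weakly sensitive to depth-inconsistent content. A toy calculation with illustrative numbers (none taken from the paper):

```python
# Reprojection displacement caused by a depth error, pinhole model:
# delta_uv ≈ f * B * dZ / Z**2  (small-baseline approximation).
f, B = 1000.0, 0.5        # focal length (px), baseline between support views (m)
Z, dZ = 30.0, 3.0         # true depth (m) and a 10% depth error (m)
shift = f * B * dZ / Z**2
print(f"{shift:.2f} px")  # ~1.67 px: a 10% depth error barely moves the
                          # reprojection, so a geometrically wrong but
                          # plausible texture can still pass the check.
```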

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback. We address the major comment below and will incorporate additional validation in the revised manuscript.

Point-by-point responses
  1. Referee: [Method (confidence map generation and refinement)] The core claim rests on the reprojection cross-checking step (described in the method overview and confidence map generation) reliably distinguishing useful diffusion-enhanced content from hallucinations. In the low-parallax, near-collinear trajectory regimes of the evaluated datasets, this check may fail to expose geometrically coherent but incorrect syntheses (e.g., fabricated lane markings or foliage) that reproject consistently across the limited support views. This is load-bearing for the reported PSNR/FID gains, as high-confidence erroneous regions would be incorporated into the refined 3DGS. The manuscript provides no dedicated analysis, ablation on baseline distance, or controlled hallucination tests to substantiate the filtering efficacy.

    Authors: We agree that the reliability of the reprojection-based cross-checking is central to our approach, particularly in the challenging low-parallax conditions typical of driving trajectories. While our experiments on Waymo, nuScenes, and KITTI demonstrate consistent improvements in PSNR and FID, indicating that the confidence maps effectively filter hallucinations in practice, we acknowledge the lack of targeted ablations. In the revision, we will add an analysis of the confidence map generation, including ablations varying the baseline distance between support views and controlled tests using synthetic hallucinations to validate the filtering efficacy. This will strengthen the evidence for the method's robustness. revision: yes

Circularity Check

0 steps flagged

No circularity in derivation; method is externally grounded

Full rationale

The paper presents ConFixGS as a plug-and-play refinement pipeline that starts from an independently pretrained feedforward 3DGS model, applies an external diffusion prior to generate pseudo-targets, and uses geometric reprojection against support views to produce confidence maps. No equations, fitted parameters, or self-referential definitions appear in the abstract or description; the central claims rest on the empirical performance of these independent components rather than on any construction that reduces outputs back to its own inputs. The approach is grounded in external benchmarks and does not invoke load-bearing self-citations or ansatzes.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no free parameters, axioms, or invented entities are specified in the provided text.

pith-pipeline@v0.9.0 · 5512 in / 1179 out tokens · 50729 ms · 2026-05-12T03:31:01.655502+00:00 · methodology



Reference graph

Works this paper leans on

110 extracted references · 110 canonical work pages · 3 internal anchors

[1] Hongyu Zhou, Longzhong Lin, Jiabao Wang, Yichong Lu, Dongfeng Bai, Bingbing Liu, Yue Wang, Andreas Geiger, and Yiyi Liao. Hugsim: A real-time, photo-realistic and closed-loop simulator for autonomous driving. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025.
[2] Wei Cao, Marcel Hallgarten, Tianyu Li, Daniel Dauner, Xunjiang Gu, Caojun Wang, Yakov Miron, Marco Aiello, Hongyang Li, Igor Gilitschenski, Boris Ivanovic, Marco Pavone, Andreas Geiger, and Kashyap Chitta. Pseudo-simulation for autonomous driving. In Conference on Robot Learning (CoRL), 2025.
[3] Seth Z Zhao, Luobin Wang, Hongwei Ruan, Yuxin Bao, Yilan Chen, Ziyang Leng, Abhijit Ravichandran, Honglin He, Zewei Zhou, Xu Han, et al. BridgeSim: Unveiling the OL-CL gap in end-to-end autonomous driving. arXiv preprint arXiv:2604.10856, 2026.
[4] Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, and George Drettakis. 3d gaussian splatting for real-time radiance field rendering. ACM Transactions on Graphics, 42(4), July 2023.
[5] David Charatan, Sizhe Li, Andrea Tagliasacchi, and Vincent Sitzmann. pixelsplat: 3d gaussian splats from image pairs for scalable generalizable 3d reconstruction. In CVPR, 2024.
[6] Yuedong Chen, Haofei Xu, Chuanxia Zheng, Bohan Zhuang, Marc Pollefeys, Andreas Geiger, Tat-Jen Cham, and Jianfei Cai. Mvsplat: Efficient 3d gaussian splatting from sparse multi-view images. In European Conference on Computer Vision, pages 370–386. Springer, 2024.
[7] Yunsong Wang, Tianxin Huang, Hanlin Chen, and Gim Hee Lee. Freesplat: Generalizable 3d gaussian splatting towards free view synthesis of indoor scenes. Advances in Neural Information Processing Systems, 37:107326–107349, 2024.
[8] Sunghwan Hong, Jaewoo Jung, Heeseong Shin, Jisang Han, Jiaolong Yang, Chong Luo, and Seungryong Kim. PF3plat: Pose-free feed-forward 3d gaussian splatting for novel view synthesis. In Forty-second International Conference on Machine Learning, 2025.
[9] Botao Ye, Sifei Liu, Haofei Xu, Xueting Li, Marc Pollefeys, Ming-Hsuan Yang, and Songyou Peng. No pose, no problem: Surprisingly simple 3d gaussian splats from sparse unposed images. arXiv preprint arXiv:2410.24207, 2024.
[10] Christopher Wewer, Kevin Raj, Eddy Ilg, Bernt Schiele, and Jan Eric Lenssen. latentsplat: Autoencoding variational gaussians for fast generalizable 3d reconstruction. In European Conference on Computer Vision, pages 456–473. Springer, 2024.
[11] Brandon Smart, Chuanxia Zheng, Iro Laina, and Victor Adrian Prisacariu. Splatt3r: Zero-shot gaussian splatting from uncalibrated image pairs. arXiv preprint arXiv:2408.13912, 2024.
[12] Botao Ye, Boqi Chen, Haofei Xu, Daniel Barath, and Marc Pollefeys. Yonosplat: You only need one model for feedforward 3d gaussian splatting. arXiv preprint arXiv:2511.07321, 2025.
[13] Qirui Hou, Wenzhang Sun, Chang Zeng, Chunfeng Wang, Hao Li, and Jianxun Cui. Drivingscene: A multi-task online feed-forward 3d gaussian splatting method for dynamic driving scenes. In ICASSP 2026 - 2026 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 10287–10291. IEEE, 2026.
[14] Rui Song, Tianhui Cai, Markus Gross, Yun Zhang, Walter Zimmer, Zhiyu Huang, Olaf Wysocki, and Jiaqi Ma. Energs: Energy-based gaussian splatting with partial geometric priors. arXiv preprint arXiv:2604.26238, 2026.
[15] Junhong Lin, Kangli Wang, Shunzhou Wang, Songlin Fan, Ge Li, and Wei Gao. Vgd: Visual geometry gaussian splatting for feed-forward surround-view driving reconstruction. arXiv preprint arXiv:2510.19578, 2025.
[16] Huasong Han, Kaixuan Zhou, Xiaoxiao Long, Yusen Wang, and Chunxia Xiao. Ggs: Generalizable gaussian splatting for lane switching in autonomous driving. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 3329–3337, 2025.
[17] Jay Zhangjie Wu, Yuxuan Zhang, Haithem Turki, Xuanchi Ren, Jun Gao, Mike Zheng Shou, Sanja Fidler, Zan Gojcic, and Huan Ling. Difix3d+: Improving 3d reconstructions with single-step diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 26024–26035, 2025.
[18] Xi Liu, Chaoyi Zhou, and Siyu Huang. 3dgs-enhancer: Enhancing unbounded 3d gaussian splatting with view-consistent 2d diffusion priors. Advances in Neural Information Processing Systems, 37:133305–133327, 2024.
[19] Zhaorui Wang, Yi Gu, Deming Zhou, and Renjing Xu. Fixinggs: Enhancing 3d gaussian splatting via training-free score distillation. arXiv preprint arXiv:2509.18759, 2025.
[20] Jiaxin Wei, Stefan Leutenegger, and Simon Schaefer. Gsfix3d: Diffusion-guided repair of novel views in gaussian splatting. arXiv preprint arXiv:2508.14717, 2025.
[21] Xingyilang Yin, Qi Zhang, Jiahao Chang, Ying Feng, Qingnan Fan, Xi Yang, Chi-Man Pun, Huaqi Zhang, and Xiaodong Cun. Gsfixer: Improving 3d gaussian splatting with reference-guided video diffusion priors. arXiv preprint arXiv:2508.09667, 2025.
[22] Avinash Paliwal, Xilong Zhou, Wei Ye, Jinhui Xiong, Rakesh Ranjan, and Nima Khademi Kalantari. Ri3d: Few-shot gaussian splatting with repair and inpainting diffusion priors. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 25094–25103, 2025.
[23] Hanyang Yu, Xiaoxiao Long, and Ping Tan. Lm-gaussian: Boost sparse-view 3d gaussian splatting with large model priors. arXiv preprint arXiv:2409.03456, 2024.
[24] Yunzhi Yan, Zhen Xu, Haotong Lin, Haian Jin, Haoyu Guo, Yida Wang, Kun Zhan, Xianpeng Lang, Hujun Bao, Xiaowei Zhou, et al. Streetcrafter: Street view synthesis with controllable video diffusion models. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 822–832, 2025.
[25] Junyoung Seo, Kazumi Fukuda, Takashi Shibuya, Takuya Narihira, Naoki Murata, Shoukang Hu, Chieh-Hsin Lai, Seungryong Kim, and Yuki Mitsufuji. Genwarp: Single image to novel views with semantic-preserving generative warping. Advances in Neural Information Processing Systems, 37:80220–80243, 2024.
[26] Norman Müller, Katja Schwarz, Barbara Rössle, Lorenzo Porzi, Samuel Rota Bulo, Matthias Nießner, and Peter Kontschieder. Multidiff: Consistent novel view synthesis from a single image. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10258–10268, 2024.
[27] Zeyu Yang, Zijie Pan, Yuankun Yang, Xiatian Zhu, and Li Zhang. Driving view synthesis on free-form trajectories with generative prior. arXiv preprint arXiv:2412.01717, 2024.
[28] Lue Fan, Hao Zhang, Qitai Wang, Hongsheng Li, and Zhaoxiang Zhang. Freesim: Toward free-viewpoint camera simulation in driving scenes. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 12004–12014, 2025.
[29] Qitai Wang, Lue Fan, Yuqi Wang, Yuntao Chen, and Zhaoxiang Zhang. Freevs: Generative view synthesis on free driving trajectory. arXiv preprint arXiv:2410.18079, 2024.
[30] Ben Mildenhall, Pratul P Srinivasan, Matthew Tancik, Jonathan T Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM, 65(1):99–106, 2021.
[31] Jonathan T Barron, Ben Mildenhall, Matthew Tancik, Peter Hedman, Ricardo Martin-Brualla, and Pratul P Srinivasan. Mip-nerf: A multiscale representation for anti-aliasing neural radiance fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 5855–5864, 2021.
[32] Kai Zhang, Gernot Riegler, Noah Snavely, and Vladlen Koltun. Nerf++: Analyzing and improving neural radiance fields. arXiv preprint arXiv:2010.07492, 2020.
[33] Jonathan T Barron, Ben Mildenhall, Dor Verbin, Pratul P Srinivasan, and Peter Hedman. Mip-nerf 360: Unbounded anti-aliased neural radiance fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5470–5479, 2022.
[34] Zehao Yu, Anpei Chen, Binbin Huang, Torsten Sattler, and Andreas Geiger. Mip-splatting: Alias-free 3d gaussian splatting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 19447–19456, 2024.
[35] Tao Lu, Mulin Yu, Linning Xu, Yuanbo Xiangli, Limin Wang, Dahua Lin, and Bo Dai. Scaffold-gs: Structured 3d gaussians for view-adaptive rendering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20654–20664, 2024.
[36] Binbin Huang, Zehao Yu, Anpei Chen, Andreas Geiger, and Shenghua Gao. 2d gaussian splatting for geometrically accurate radiance fields. In SIGGRAPH 2024 Conference Papers. Association for Computing Machinery, 2024.
[37] Danpeng Chen, Hai Li, Weicai Ye, Yifan Wang, Weijian Xie, Shangjin Zhai, Nan Wang, Haomin Liu, Hujun Bao, and Guofeng Zhang. Pgsr: Planar-based gaussian splatting for efficient and high-fidelity surface reconstruction. IEEE Transactions on Visualization and Computer Graphics, 2024.
[38] Antoine Guédon and Vincent Lepetit. Sugar: Surface-aligned gaussian splatting for efficient 3d mesh reconstruction and high-quality mesh rendering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5354–5363, 2024.
[39] Kai Cheng, Xiaoxiao Long, Kaizhi Yang, Yao Yao, Wei Yin, Yuexin Ma, Wenping Wang, and Xuejin Chen. Gaussianpro: 3d gaussian splatting with progressive propagation. In Forty-first International Conference on Machine Learning, 2024.
[40] Saswat Subhajyoti Mallick, Rahul Goel, Bernhard Kerbl, Markus Steinberger, Francisco Vicente Carrasco, and Fernando De La Torre. Taming 3dgs: High-quality radiance fields with limited resources. In SIGGRAPH Asia 2024 Conference Papers, pages 1–11, 2024.
[41] Yanyan Li, Chenyu Lyu, Yan Di, Guangyao Zhai, Gim Hee Lee, and Federico Tombari. Geogaussian: Geometry-aware gaussian splatting for scene rendering. In European Conference on Computer Vision, pages 441–457. Springer, 2024.
[42] Matias Turkulainen, Xuqian Ren, Iaroslav Melekhov, Otto Seiskari, Esa Rahtu, and Juho Kannala. Dn-splatter: Depth and normal priors for gaussian splatting and meshing. In 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 2421–2431. IEEE, 2025.
[43] Jaeyoung Chung, Jeongtaek Oh, and Kyoung Mu Lee. Depth-regularized optimization for 3d gaussian splatting in few-shot images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 811–820, 2024.
[44] Jiahe Li, Jiawei Zhang, Xiao Bai, Jin Zheng, Xin Ning, Jun Zhou, and Lin Gu. Dngaussian: Optimizing sparse-view 3d gaussian radiance fields with global-local depth normalization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20775–20785, 2024.
[45] Han Huang, Yulun Wu, Chao Deng, Ge Gao, Ming Gu, and Yu-Shen Liu. Fatesgs: Fast and accurate sparse-view surface reconstruction using gaussian splatting with depth-feature consistency. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 3644–3652, 2025.
[46] Qilin Zhang, Olaf Wysocki, Steffen Urban, and Boris Jutzi. Cdgs: Confidence-aware depth regularization for 3d gaussian splatting. arXiv preprint arXiv:2502.14684, 2025.
[47] Zexu Huang, Min Xu, and Stuart Perry. Det-gs: Depth- and edge-aware regularization for high-fidelity 3d gaussian splatting. arXiv preprint arXiv:2508.04099, 2025.
[48] Shuangkang Fang, I Shen, Takeo Igarashi, Yufeng Wang, ZeSheng Wang, Yi Yang, Wenrui Ding, Shuchang Zhou, et al. Nerf is a valuable assistant for 3d gaussian splatting. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 26230–26240, 2025.
[49] Jianfei Guo, Nianchen Deng, Xinyang Li, Yeqi Bai, Botian Shi, Chiyu Wang, Chenjing Ding, Dongliang Wang, and Yikang Li. Streetsurf: Extending multi-view implicit surface reconstruction to street views. arXiv preprint arXiv:2306.04988, 2023.
[50] Xiaoyu Zhou, Zhiwei Lin, Xiaojun Shan, Yongtao Wang, Deqing Sun, and Ming-Hsuan Yang. Drivinggaussian: Composite gaussian splatting for surrounding dynamic autonomous driving scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 21634–21643, 2024.
[51] Yunzhi Yan, Haotong Lin, Chenxu Zhou, Weijie Wang, Haiyang Sun, Kun Zhan, Xianpeng Lang, Xiaowei Zhou, and Sida Peng. Street gaussians: Modeling dynamic urban scenes with gaussian splatting. In European Conference on Computer Vision, pages 156–173. Springer, 2024.
[52] Changjian Jiang, Ruilan Gao, Kele Shao, Yue Wang, Rong Xiong, and Yu Zhang. Li-gs: Gaussian splatting with lidar incorporated for accurate large-scale reconstruction. IEEE Robotics and Automation Letters, 2024.
[53] Jian Shen, Huai Yu, Ji Wu, Wen Yang, and Gui-Song Xia. Lidar-enhanced 3d gaussian splatting mapping. arXiv preprint arXiv:2503.05425, 2025.
[54] Jaewon Lee, Mangyu Kong, Minseong Park, and Euntai Kim. Geomgs: Lidar-guided geometry-aware gaussian splatting for robot localization. arXiv preprint arXiv:2501.13417, 2025.
[55] Kejing Xia, Jidong Jia, Ke Jin, Yucai Bai, Li Sun, Dacheng Tao, and Youjian Zhang. D$^2$GS: Dense depth regularization for lidar-free urban scene reconstruction. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2026.
[56] Xiao Cui, Weicai Ye, Yifan Wang, Guofeng Zhang, Wengang Zhou, Tong He, and Houqiang Li. Streetsurfgs: Scalable urban street surface reconstruction with planar-based gaussian splatting. IEEE Transactions on Circuits and Systems for Video Technology, 2025.
[57] Sheng Miao, Jiaxin Huang, Dongfeng Bai, Xu Yan, Hongyu Zhou, Yue Wang, Bingbing Liu, Andreas Geiger, and Yiyi Liao. Evolsplat: Efficient volume-based gaussian splatting for urban view synthesis. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 11286–11296, 2025.
[58] Georg Hess, Carl Lindström, Maryam Fatemi, Christoffer Petersson, and Lennart Svensson. Splatad: Real-time lidar and camera rendering with 3d gaussian splatting for autonomous driving. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 11982–11992, 2025.
[59] Guanjun Wu, Taoran Yi, Jiemin Fang, Lingxi Xie, Xiaopeng Zhang, Wei Wei, Wenyu Liu, Qi Tian, and Xinggang Wang. 4d gaussian splatting for real-time dynamic scene rendering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20310–20320, 2024.
[60] Jiawei Xu, Kai Deng, Zexin Fan, Shenlong Wang, Jin Xie, and Jian Yang. Ad-gs: Object-aware b-spline gaussian splatting for self-supervised autonomous driving. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 24770–24779, 2025.
[61] Yurui Chen, Chun Gu, Junzhe Jiang, Xiatian Zhu, and Li Zhang. Periodic vibration gaussian: Dynamic urban scene reconstruction and real-time rendering. International Journal of Computer Vision, 134(3):83, 2026.
[62] Jiawei Yang, Jiahui Huang, Yuxiao Chen, Yan Wang, Boyi Li, Yurong You, Apoorva Sharma, Maximilian Igl, Peter Karkus, Danfei Xu, et al. Storm: Spatio-temporal reconstruction model for large-scale outdoor scenes. arXiv preprint arXiv:2501.00602, 2024.
[63] Yuedong Chen, Chuanxia Zheng, Haofei Xu, Bohan Zhuang, Andrea Vedaldi, Tat-Jen Cham, and Jianfei Cai. Mvsplat360: Feed-forward 360 scene synthesis from sparse views. Advances in Neural Information Processing Systems, 37:107064–107086, 2024.
[64] Chuanrui Zhang, Yingshuang Zou, Zhuoling Li, Minmin Yi, and Haoqian Wang. Transplat: Generalizable 3d gaussian splatting from sparse multi-view images with transformers. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 9869–9877, 2025.
[65] Zi-Xin Zou, Zhipeng Yu, Yuan-Chen Guo, Yangguang Li, Ding Liang, Yan-Pei Cao, and Song-Hai Zhang. Triplane meets gaussian splatting: Fast and generalizable single-view 3d reconstruction with transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10324–10335, 2024.
[66] Jiaxiang Tang, Zhaoxi Chen, Xiaokang Chen, Tengfei Wang, Gang Zeng, and Ziwei Liu. Lgm: Large multi-view gaussian model for high-resolution 3d content creation. In European Conference on Computer Vision, pages 1–18. Springer, 2024.
[67] Kai Zhang, Sai Bi, Hao Tan, Yuanbo Xiangli, Nanxuan Zhao, Kalyan Sunkavalli, and Zexiang Xu. Gs-lrm: Large reconstruction model for 3d gaussian splatting. In European Conference on Computer Vision, pages 1–19. Springer, 2024.
[68] Sunghwan Hong, Jaewoo Jung, Heeseong Shin, Jisang Han, Jiaolong Yang, Chong Luo, and Seungryong Kim. Pf3plat: Pose-free feed-forward 3d gaussian splatting. arXiv preprint arXiv:2410.22128, 2024.
[69] Haofei Xu, Songyou Peng, Fangjinhua Wang, Hermann Blum, Daniel Barath, Andreas Geiger, and Marc Pollefeys. Depthsplat: Connecting gaussian splatting and depth. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 16453–16463, 2025.
[70] Lihan Jiang, Yucheng Mao, Linning Xu, Tao Lu, Kerui Ren, Yichen Jin, Xudong Xu, Mulin Yu, Jiangmiao Pang, Feng Zhao, et al. Anysplat: Feed-forward 3d gaussian splatting from unconstrained views. ACM Transactions on Graphics (TOG), 44(6):1–16, 2025.
[71] Yifan Liu, Zhiyuan Min, Zhenwei Wang, Junta Wu, Tengfei Wang, Yixuan Yuan, Yawei Luo, and Chunchao Guo. Worldmirror: Universal 3d world reconstruction with any-prior prompting. arXiv preprint arXiv:2510.10726, 2025.
[72] Qijian Tian, Xin Tan, Yuan Xie, and Lizhuang Ma. Drivingforward: Feed-forward 3d gaussian splatting for driving scene reconstruction from flexible surround-view input. In Proceedings of the AAAI Conference on Artificial Intelligence, 2025.
[73] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10684–10695, 2022.
[74] Tim Brooks, Aleksander Holynski, and Alexei A Efros. Instructpix2pix: Learning to follow image editing instructions. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 18392–18402, 2023.
[75] Ruoshi Liu, Rundi Wu, Basile Van Hoorick, Pavel Tokmakov, Sergey Zakharov, and Carl Vondrick. Zero-1-to-3: Zero-shot one image to 3d object. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 9298–9309, 2023.
[76] Yuan Liu, Cheng Lin, Zijiao Zeng, Xiaoxiao Long, Lingjie Liu, Taku Komura, and Wenping Wang. Syncdreamer: Generating multiview-consistent images from a single-view image. arXiv preprint arXiv:2309.03453, 2023.
[77] Yichun Shi, Peng Wang, Jianglong Ye, Long Mai, Kejie Li, and Xiao Yang. MVDream: Multi-view diffusion for 3d generation. In The Twelfth International Conference on Learning Representations, 2024.
[78] Thiemo Alldieck, Nikos Kolotouros, and Cristian Sminchisescu. Score distillation sampling with learned manifold corrective. In European Conference on Computer Vision, pages 1–18. Springer, 2024.
[79] Ayaan Haque, Matthew Tancik, Alexei Efros, Aleksander Holynski, and Angjoo Kanazawa. Instruct-nerf2nerf: Editing 3d scenes with instructions. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023.
[80] Jiatao Gu, Alex Trevithick, Kai-En Lin, Joshua M. Susskind, Christian Theobalt, Lingjie Liu, and Ravi Ramamoorthi. NerfDiff: Single-image view synthesis with NeRF-guided distillation from 3D-aware diffusion. In Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, and Jonathan Scarlett, editors, Proceedings of the 40th Internatio…

Showing first 80 references.