pith. machine review for the scientific record.

arxiv: 2605.05664 · v1 · submitted 2026-05-07 · 💻 cs.CV

Recognition: unknown

Sparse-to-Complete: From Sparse Image Captures to Complete 3D Scenes

Authors on Pith: no claims yet

Pith reviewed 2026-05-08 14:56 UTC · model grok-4.3

classification 💻 cs.CV
keywords sparse-view 3D reconstruction · diffusion-based image restoration · view-consistency sampling · camera trajectory planning · 3D Gaussian splatting · scene completion · multi-view consistency · image repair for reconstruction

The pith

A framework reconstructs complete high-fidelity 3D scenes from only six to eight input images by repairing views with a scene-specific diffusion model.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents S2C-3D as a method to produce complete 3D scenes even when only a handful of photos are available. It finetunes a diffusion model on the sparse inputs and their artificially degraded versions so that the model learns to restore renderings that match the scene's own distribution. A separate sampling step then measures consistency between neighboring restored views and uses that measure as a conditioning signal to generate images that agree across viewpoints. Finally, an iterative camera planning routine builds paths that link each new viewpoint to its nearest neighbors and keeps only those that increase overall scene visibility. Together, these steps yield 3D Gaussians without holes, blur, or conflicting surfaces.
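
As a concreteness aid, the loop that paragraph describes can be sketched as below. The component interfaces (fit_gaussians, plan_camera, render, repair) are placeholders invented here, not the authors' API, and the control flow is only an assumption consistent with the summary above.

from typing import Any, Callable, List

def s2c_loop(images: List[Any], cameras: List[Any],
             fit_gaussians: Callable, plan_camera: Callable,
             render: Callable, repair: Callable, rounds: int = 10):
    # Coarse Gaussians fitted to the handful of real captures.
    gaussians = fit_gaussians(images, cameras, init=None)
    for _ in range(rounds):
        cam = plan_camera(cameras, gaussians)        # coverage-driven viewpoint selection
        if cam is None:                              # no candidate path adds enough visibility
            break
        degraded = render(gaussians, cam)            # incomplete rendering from the new view
        repaired = repair(degraded, cam, cameras)    # scene-specific, view-consistent restoration
        images.append(repaired)                      # treat the repaired image as a pseudo view
        cameras.append(cam)
        gaussians = fit_gaussians(images, cameras, init=gaussians)
    return gaussians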

Core claim

S2C-3D reconstructs high-fidelity and complete 3D scenes from six to eight images by combining (1) a diffusion model finetuned on the input views and their degraded counterparts for scene-specific restoration, (2) a training-free view-consistency conditioned sampling process that quantifies agreement between neighboring repaired images and injects it into the frozen model's sampling, and (3) a trajectory planning scheme that iteratively connects new cameras to their two nearest neighbors while retaining only paths that meaningfully increase coverage. The resulting Gaussians are free of missing regions and artifacts.

What carries the argument

View-consistency conditioned sampling that quantifies agreement between neighboring restored images and injects the resulting consistency score as a conditioning signal into the diffusion sampling process to produce mutually consistent views for Gaussian optimization.
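
The paper does not disclose its consistency metric or the injection mechanism, so the following is only one plausible instantiation: a training-free, energy-guided denoising step in the spirit of the FreeDoM line of work the paper cites, where photometric disagreement with a neighboring repaired view warped into the current camera acts as the conditioning energy. The warping step, the energy form, and the guidance scale are assumptions for illustration, not the authors' method.

import torch

def consistency_energy(x0_hat, neighbor_warped, valid_mask):
    # Photometric disagreement between the current denoised estimate and a
    # neighboring repaired view warped into this camera (warping not shown).
    diff = (x0_hat - neighbor_warped) * valid_mask
    return diff.pow(2).sum() / valid_mask.sum().clamp(min=1.0)

def guided_denoise_step(x_t, t, eps_model, alpha_bar_t,
                        neighbor_warped, valid_mask, guidance_scale=50.0):
    # One DDIM-style prediction from a frozen eps_model, nudged toward agreement.
    # alpha_bar_t is the cumulative noise-schedule value at step t (scalar tensor).
    x_t = x_t.detach().requires_grad_(True)
    eps = eps_model(x_t, t)                                        # frozen network
    x0_hat = (x_t - (1 - alpha_bar_t).sqrt() * eps) / alpha_bar_t.sqrt()
    energy = consistency_energy(x0_hat, neighbor_warped, valid_mask)
    grad = torch.autograd.grad(energy, x_t)[0]                     # d(energy)/d(x_t)
    return (x0_hat - guidance_scale * grad).detach()               # consistency-injected estimate

A full sampler would re-noise the guided estimate to obtain the next latent; only the guidance arithmetic is shown here.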

If this is right

  • 3D Gaussian splatting can operate reliably with far fewer input views than previously required.
  • Scene coverage can be achieved by iteratively selecting cameras that connect to existing ones and measurably increase visibility.
  • Multi-view conflicts can be resolved at sampling time inside a frozen diffusion model rather than through additional training or post-processing.
  • High-fidelity scene models become obtainable from minimal capture sessions without leaving holes or introducing blur.
  • The same restoration and consistency machinery can be applied to refine intermediate renderings during optimization.

Load-bearing premise

Finetuning a pretrained diffusion model on the sparse input views together with their degraded versions lets the model repair Gaussian renderings while the added consistency condition prevents new multi-view conflicts without any extra training.
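
How the degraded counterparts are produced is not specified in the abstract. One assumed recipe is to corrupt each sparse input view with blur, hole masks, and mild noise that mimic the artifacts of an under-constrained Gaussian rendering, then finetune on the (degraded, clean) pairs. The operations and parameter ranges below are illustrative, not the authors' settings.

import numpy as np
from scipy.ndimage import gaussian_filter

def degrade(image, rng, max_sigma=3.0, n_holes=4, hole_frac=0.15, noise_std=0.02):
    # image: float array in [0, 1], shape (H, W, 3). Returns a corrupted copy.
    h, w, _ = image.shape
    out = gaussian_filter(image, sigma=(rng.uniform(0.0, max_sigma),) * 2 + (0.0,))
    for _ in range(n_holes):                        # punch out missing regions
        hh, ww = int(h * hole_frac), int(w * hole_frac)
        y, x = rng.integers(0, h - hh), rng.integers(0, w - ww)
        out[y:y + hh, x:x + ww] = 0.0
    out += rng.normal(0.0, noise_std, out.shape)    # mild rendering noise
    return np.clip(out, 0.0, 1.0)

def make_pairs(sparse_views, n_per_view=8, seed=0):
    rng = np.random.default_rng(seed)
    return [(degrade(v, rng), v) for v in sparse_views for _ in range(n_per_view)]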

What would settle it

Run S2C-3D on a benchmark set of scenes that have both the six-view sparse inputs and dense ground-truth captures, then measure whether the output Gaussians match the dense reference in completeness and absence of artifacts.
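
A minimal version of that protocol, assuming renderings and dense reference images are available as float arrays in [0, 1]: PSNR and SSIM come from scikit-image, while the coverage number is an assumed proxy for completeness (fraction of non-empty pixels), not a metric defined by the paper.

import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def score_view(rendered, reference, empty_thresh=1e-3):
    psnr = peak_signal_noise_ratio(reference, rendered, data_range=1.0)
    ssim = structural_similarity(reference, rendered, channel_axis=-1, data_range=1.0)
    coverage = float((rendered.max(axis=-1) > empty_thresh).mean())  # crude hole detector
    return {"psnr": psnr, "ssim": ssim, "coverage": coverage}

def evaluate(renderings, references):
    per_view = [score_view(r, g) for r, g in zip(renderings, references)]
    return {k: float(np.mean([d[k] for d in per_view])) for k in per_view[0]}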

Figures

Figures reproduced from arXiv: 2605.05664 by Kun Zhou, Tianjia Shao, Yin Yang, Yiyang Shen.

Figure 1: A room reconstructed by the original 3DGS [Kerbl et al.]…
Figure 2: Overview of our method. (a) We input a sparse set of (e.g., 4) unposed images into a feed-forward visual geometry reconstruction model…
Figure 3: Detailed illustrations of the computation of the information gain…
Figure 4: Qualitative comparisons of novel view synthesis on the ScanNet++ dataset.
Figure 5: Qualitative comparisons of novel view synthesis on the S2C-Scene dataset.
Figure 6: Ablation study of reconstructed scenes by different models.
Figure 7: Qualitative comparison between images repaired by the finetuned diffusion model and those produced by the view-consistency conditioned diffusion…
Figure 9: Comparison of 3D scene coverage achieved by virtual…
Original abstract

We introduce S2C-3D, a novel sparse-view 3D reconstruction framework for high-fidelity and complete scene reconstruction from as few as six to eight images. Our framework features three components: a specialized diffusion model for scene-specific image restoration, a training-free view-consistency conditioned sampling process in the diffusion model for refined Gaussian optimization, and a camera trajectory planning scheme to ensure comprehensive scene coverage. The specialized diffusion model is developed by finetuning a pretrained architecture on the input views and their corresponding degraded counterparts. The adaptation to the scene distribution allows the model to repair Gaussian renderings while effectively eliminating domain gaps. Meanwhile, the trajectory planning scheme optimizes scene coverage by connecting each newly sampled camera to its two nearest neighbors. By iteratively constructing paths and retaining only those that significantly enhance visibility, the scheme establishes a trajectory that covers the entire scene. To address multi-view conflicts, the view-consistency conditioned sampling process quantifies the consistency between neighboring repaired images. This information is injected as a condition into the sampling process of the frozen diffusion model, facilitating the generation of view-consistent images without additional training. Consequently, our approach produces high-fidelity 3D Gaussians that are robust to artifacts. Experimental results demonstrate that S2C-3D outperforms state-of-the-art methods, constructing high-quality scenes that are free from missing regions, blurring, or other artifacts with very sparse inputs. The source code and data are available at https://gapszju.github.io/S2C-3D.
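
Read literally, the trajectory planning described in the abstract admits a simple greedy sketch: each candidate camera is linked to its two nearest existing cameras and kept only if it raises scene coverage by more than a threshold. The voxel-set visibility model, the threshold value, and the stopping rule below are assumptions; the paper does not state its visibility measure or "significantly enhance" criterion.

import numpy as np

def plan_trajectory(init_cams, candidates, visible_voxels,
                    gain_thresh=0.02, max_cams=40):
    # init_cams, candidates: 3D camera positions (np.ndarray of shape (3,)).
    # visible_voxels(cam): set of voxel ids observed from that camera.
    cams = list(init_cams)
    covered = set()
    for c in cams:
        covered |= visible_voxels(c)
    edges = []
    for cand in candidates:
        if len(cams) >= max_cams:
            break
        new = visible_voxels(cand) - covered
        if len(new) / max(1, len(covered)) < gain_thresh:    # too little added visibility
            continue
        dists = [np.linalg.norm(cand - c) for c in cams]
        for i in np.argsort(dists)[:2]:                      # link to the two nearest cameras
            edges.append((len(cams), int(i)))
        cams.append(cand)
        covered |= new
    return cams, edges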

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces S2C-3D, a sparse-view 3D reconstruction framework that reconstructs complete, high-fidelity scenes from as few as 6-8 input images. It comprises three components: (1) finetuning a pretrained diffusion model on the sparse input views and their degraded counterparts to create a scene-specific restorer for Gaussian renderings, (2) a training-free view-consistency conditioned sampling process that quantifies consistency between neighboring repaired images and injects this as a conditioning signal into the frozen diffusion model to resolve multi-view conflicts, and (3) an iterative camera trajectory planning scheme that connects new views to nearest neighbors while retaining only those that significantly improve visibility. The paper claims this pipeline yields artifact-free 3D Gaussians that outperform state-of-the-art methods.

Significance. If the empirical claims are substantiated, the work would offer a practical advance in sparse-view 3D reconstruction by adapting diffusion models in a scene-specific manner and enforcing consistency without extra training. The open release of code and data is a clear strength that supports reproducibility. The combination of finetuning for domain adaptation and training-free consistency conditioning could influence downstream applications in robotics, AR/VR, and scene understanding, provided the load-bearing steps are rigorously validated.

major comments (2)
  1. [Abstract] Abstract and method description: The view-consistency conditioned sampling is presented as quantifying consistency between neighboring repaired images and injecting it as a condition into the frozen diffusion model's sampling process. However, no concrete metric (e.g., pixel-wise variance, feature distance, or epipolar error), equation, or injection mechanism is specified. This step is load-bearing for the central claim of eliminating multi-view conflicts and Gaussian artifacts without introducing new inconsistencies, as the entire post-initial-Gaussian pipeline depends on it.
  2. [Abstract] Abstract: The claim that 'S2C-3D outperforms state-of-the-art methods, constructing high-quality scenes that are free from missing regions, blurring, or other artifacts' is asserted without any reported quantitative metrics, baselines, ablation studies, or implementation details in the provided description. This prevents evaluation of whether the finetuning and consistency sampling actually deliver the stated gains.
minor comments (1)
  1. [Abstract] The trajectory planning description mentions 'connecting each newly sampled camera to its two nearest neighbors' and 'retaining only those that significantly enhance visibility,' but lacks pseudocode or a precise optimization criterion for the 'significantly enhance' threshold.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We have prepared detailed point-by-point responses to the major comments and revised the paper to improve clarity and substantiation of the key technical components.

Point-by-point responses
  1. Referee: [Abstract] Abstract and method description: The view-consistency conditioned sampling is presented as quantifying consistency between neighboring repaired images and injecting it as a condition into the frozen diffusion model's sampling process. However, no concrete metric (e.g., pixel-wise variance, feature distance, or epipolar error), equation, or injection mechanism is specified. This step is load-bearing for the central claim of eliminating multi-view conflicts and Gaussian artifacts without introducing new inconsistencies, as the entire post-initial-Gaussian pipeline depends on it.

    Authors: We agree that the abstract and high-level method overview did not provide sufficient technical detail on this critical component. In the revised manuscript, we have expanded the relevant method section to specify the concrete consistency metric, include the corresponding equation, and describe the exact injection mechanism into the frozen diffusion model's sampling process. We have also added supporting analysis to show how this resolves multi-view conflicts without new artifacts. revision: yes

  2. Referee: [Abstract] Abstract: The claim that 'S2C-3D outperforms state-of-the-art methods, constructing high-quality scenes that are free from missing regions, blurring, or other artifacts' is asserted without any reported quantitative metrics, baselines, ablation studies, or implementation details in the provided description. This prevents evaluation of whether the finetuning and consistency sampling actually deliver the stated gains.

    Authors: The abstract serves as a concise summary, but we acknowledge it should better direct readers to the supporting evidence. The full manuscript already reports quantitative comparisons, baselines, ablation studies, and implementation details in the experiments section and tables. We have revised the abstract to explicitly reference these elements (including key metrics and sections) so that the performance claims can be readily evaluated. revision: yes

Circularity Check

0 steps flagged

No circularity in empirical pipeline

full rationale

The paper presents S2C-3D as a three-component empirical framework: finetuning a pretrained diffusion model on input views plus degraded counterparts, a training-free view-consistency conditioned sampling process, and a camera trajectory planning scheme. These steps are described procedurally without equations, fitted parameters renamed as predictions, or derivations that reduce to inputs by construction. No self-citations, uniqueness theorems, or ansatzes from prior author work are invoked to justify core mechanisms. Performance claims rest on experimental comparisons to external SOTA methods, rendering the approach self-contained against benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

Assessment limited to abstract; framework likely rests on standard assumptions about diffusion models and Gaussian splatting but introduces new procedural components whose parameters are unspecified.

free parameters (1)
  • diffusion finetuning hyperparameters
    Finetuning occurs on input views but specific learning rates, epochs, or loss weights are not detailed.
axioms (1)
  • domain assumption: Pretrained diffusion models can be finetuned on limited scene-specific data to repair renderings and close domain gaps.
    Central to the specialized diffusion model component described in the abstract.
invented entities (1)
  • view-consistency conditioned sampling process (no independent evidence)
    purpose: Injects quantified consistency between neighboring repaired images as a condition into diffusion sampling to ensure multi-view agreement.
    Newly described training-free mechanism in the framework.

pith-pipeline@v0.9.0 · 5578 in / 1348 out tokens · 69856 ms · 2026-05-08T14:56:06.872864+00:00 · methodology

