GeometryCrafter: Consistent geometry estimation for open-world videos with diffusion priors
6 Pith papers cite this work.
fields
cs.CV 6
citing papers explorer
- GenRecon: Bridging Generative Priors for Multi-View 3D Scene Reconstruction
  GenRecon lifts object-level generative priors to scene-scale reconstruction by chunking scenes and using projection-based conditioning on multi-view features, claiming 16% better results than prior methods.
- Revisiting Photometric Ambiguity for Accurate Gaussian-Splatting Surface Reconstruction
  AmbiSuR adds intrinsic photometric disambiguation and a self-indication module to Gaussian Splatting to resolve ambiguities and improve surface reconstruction accuracy.
- Pano2World: End-to-End 3D Generation via Unified Multi-View Sequences
  Pano2World generates an explorable 3D Gaussian scene directly from a single indoor panorama via coarse proxy rendering, view-aware joint denoising, and a latent feature adapter.
- UniVidX: A Unified Multimodal Framework for Versatile Video Generation via Diffusion Priors
  UniVidX unifies diverse video generation tasks into one conditional diffusion model using stochastic condition masking, decoupled gated LoRAs, and cross-modal self-attention.
- ViPE: Video Pose Engine for 3D Geometric Perception
  ViPE estimates camera intrinsics, motion, and dense near-metric depth from uncalibrated videos, outperforming baselines on TUM and KITTI while releasing annotations for 96M frames across real and generated videos.
- PAGE-4D: VGGT-4D Perception via Disentangled Pose and Geometry Estimation