A generative video model conditioned on pixel-aligned 3D renderings produces consistent dynamic 3D Gaussian splats from monocular video and sets a new state of the art in 4D reconstruction.
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CV (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- World from Motion: Generative Dynamic Gaussian Reconstruction from Monocular Video
  A generative video model conditioned on pixel-aligned 3D renderings produces consistent dynamic 3D Gaussian splats from monocular video and sets a new state of the art in 4D reconstruction.
- Effective Multi-sensor Conditioning for Street-view Novel-view Synthesis
  StreetNVS presents a multi-sensor conditioned video diffusion framework for street-view novel view synthesis that outperforms baselines with sparse LiDAR and handles extreme out-of-trajectory paths on the Waymo dataset.