Can video diffusion model reconstruct 4d geometry?

Bernard Ghanem · 2025 · arXiv 2503.21082

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

read on arXiv browse 2 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

TrackCraft3R: Repurposing Video Diffusion Transformers for Dense 3D Tracking

cs.CV · 2026-05-12 · unverdicted · novelty 8.0

TrackCraft3R is the first method to repurpose a video diffusion transformer as a feed-forward dense 3D tracker via dual-latent representations and temporal RoPE alignment, achieving SOTA performance with lower compute.

Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling

cs.CV · 2025-07-10 · unverdicted · novelty 6.0

Geometry Forcing aligns video diffusion representations with geometric foundation model features via angular cosine and scale regression objectives to improve 3D consistency in generated videos.

citing papers explorer

Showing 2 of 2 citing papers.

TrackCraft3R: Repurposing Video Diffusion Transformers for Dense 3D Tracking cs.CV · 2026-05-12 · unverdicted · none · ref 51
TrackCraft3R is the first method to repurpose a video diffusion transformer as a feed-forward dense 3D tracker via dual-latent representations and temporal RoPE alignment, achieving SOTA performance with lower compute.
Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling cs.CV · 2025-07-10 · unverdicted · none · ref 48
Geometry Forcing aligns video diffusion representations with geometric foundation model features via angular cosine and scale regression objectives to improve 3D consistency in generated videos.

Can video diffusion model reconstruct 4d geometry?

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer