AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers
3 Pith papers cite this work. Polarity classification is still indexing.
citation summary
fields: cs.CV (3)
years: 2026 (3)
roles: background (1)
polarities: background (1)
representative citing papers
h-control augments hard-replacement guidance with block-conditional pseudo-Gibbs refinement on unobserved latent sites and adaptive 3D patch freezing to achieve superior FVD on RealEstate10K and DAVIS.
citing papers explorer
- Geo-Align: Video Generation Alignment via Metric Geometry Reward
Geo-Align applies RL with a perceptual reward derived from 3D camera trajectory estimation to improve controllability and fidelity in video generation without paired training data.
- $h$-control: Training-Free Camera Control via Block-Conditional Gibbs Refinement
h-control augments hard-replacement guidance with block-conditional pseudo-Gibbs refinement on unobserved latent sites and adaptive 3D patch freezing to achieve superior FVD on RealEstate10K and DAVIS.
- World-R1: Reinforcing 3D Constraints for Text-to-Video Generation