KVPO aligns streaming autoregressive video generators with human preferences via ODE-native GRPO, using KV cache for semantic exploration and TVE for velocity-based policy modeling, yielding gains in quality and alignment.
In: Proceedings of the AAAI Conference on Artificial Intelligence
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CV 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
KVPO: ODE-Native GRPO for Autoregressive Video Alignment via KV Semantic Exploration
KVPO aligns streaming autoregressive video generators with human preferences via ODE-native GRPO, using KV cache for semantic exploration and TVE for velocity-based policy modeling, yielding gains in quality and alignment.