arxiv: 2605.11723 · 2 revisions
CaC: Advancing Video Reward Models via Hierarchical Spatiotemporal Concentrating