FAV aligns few-step generative models by amortizing Stein variational gradient descent (SVGD) updates from reward-tilted sampling into the generator's parameters via fixed-point regression. The method requires only sample access, and the paper reports outperformance on robotics tasks and scaling to image generators.
arXiv preprint arXiv:2509.25170
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
background 1
citation-polarity summary
fields
cs.LG 3
years
2026 3
roles
background 1
polarities
background 1
representative citing papers
The choice of closeness measure in diffusion reward alignment determines the computational primitives and tractable reward classes, with linear exponential tilts sufficing for KL with convex rewards and proximal oracles for Wasserstein with concave or low-dimensional Lipschitz rewards.