cc/paper_files/paper/1996/file/ 68d13cf26c4b4f4f932e3eff990093ba-Paper

URL https://proceedings · 1996

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Recovering Hidden Reward in Diffusion-Based Policies

cs.RO · 2026-05-01 · unverdicted · novelty 6.0

EnergyFlow shows that denoising score matching on diffusion policies recovers the gradient of the expert's soft Q-function under maximum-entropy optimality, enabling non-adversarial reward extraction and improved policy generalization.

Recovering Hidden Reward in Diffusion-Based Policies cs.RO · 2026-05-01 · unverdicted · none · ref 10
