Dragan, Shankar Sastry, and Sanjit A
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.RO (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Beyond Monotonic Progress: Retry-Supervised Value Learning for Robot Imitation
  ReTVL uses retry events as sparse supervision to train mistake-sensitive value functions that reweight demonstration chunks for improved behavior cloning on real-robot manipulation tasks.
- Robots That Know What to Ask: Recovering Misaligned Rewards through Targeted Explanations
  Robots detect underspecified reward features via demonstration variation and query targeted natural language explanations to improve reward recovery from imperfect demos.