Coarse-to-Fine Imitation Learning: Robot Manipulation from a Single Demonstration

Edward Johns

arxiv: 2105.06411 · v2 · pith:O7VOAIM7new · submitted 2021-05-13 · 💻 cs.RO · cs.LG


keywords: demonstration, end-effector, state, imitation, interaction, learning, manipulation, method


We introduce a simple new method for visual imitation learning, which allows a novel robot manipulation task to be learned from a single human demonstration, without requiring any prior knowledge of the object being interacted with. Our method models imitation learning as a state estimation problem, with the state defined as the end-effector's pose at the point where object interaction begins, as observed from the demonstration. By then modelling a manipulation task as a coarse, approach trajectory followed by a fine, interaction trajectory, this state estimator can be trained in a self-supervised manner, by automatically moving the end-effector's camera around the object. At test time, the end-effector moves to the estimated state through a linear path, at which point the original demonstration's end-effector velocities are simply replayed. This enables convenient acquisition of a complex interaction trajectory, without actually needing to explicitly learn a policy. Real-world experiments on 8 everyday tasks show that our method can learn a diverse range of skills from a single human demonstration, whilst also yielding a stable and interpretable controller.
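The test-time behaviour described in the abstract (a linear move to the estimated state, then replay of the demonstration's end-effector velocities) can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes the estimated state is already available as a 3D position, ignores orientation, and integrates velocities with a fixed time step; the names `linear_path`, `replay_velocities`, and `coarse_to_fine` are hypothetical.

```python
from typing import List, Sequence, Tuple

# Simplification (assumption): a pose is a 3D position only; orientation is ignored.
Vec3 = Tuple[float, ...]


def linear_path(start: Vec3, goal: Vec3, steps: int) -> List[Vec3]:
    """Coarse phase: evenly spaced waypoints on the straight line from start to goal."""
    if steps < 1:
        raise ValueError("steps must be >= 1")
    return [
        tuple(s + (g - s) * k / steps for s, g in zip(start, goal))
        for k in range(1, steps + 1)
    ]


def replay_velocities(start: Vec3, velocities: Sequence[Vec3], dt: float) -> List[Vec3]:
    """Fine phase: integrate recorded demonstration velocities open-loop from start."""
    poses: List[Vec3] = []
    current = start
    for v in velocities:
        current = tuple(c + vi * dt for c, vi in zip(current, v))
        poses.append(current)
    return poses


def coarse_to_fine(
    start: Vec3,
    estimated_state: Vec3,  # assumed to come from a separately trained estimator
    demo_velocities: Sequence[Vec3],
    dt: float,
    approach_steps: int = 10,
) -> List[Vec3]:
    """Concatenate the linear approach with the replayed interaction trajectory."""
    approach = linear_path(start, estimated_state, approach_steps)
    interaction = replay_velocities(approach[-1], demo_velocities, dt)
    return approach + interaction


# Usage: approach (1, 2, 0) in 4 steps, then replay two downward velocity commands.
trajectory = coarse_to_fine(
    start=(0.0, 0.0, 0.0),
    estimated_state=(1.0, 2.0, 0.0),
    demo_velocities=[(0.0, 0.0, -0.5), (0.0, 0.0, -0.5)],
    dt=0.5,
    approach_steps=4,
)
```

In this sketch nothing is learned during the fine phase: the interaction is replayed as recorded, so only the estimate of where the interaction begins depends on the observed scene.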


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

One Demonstration Is Enough for Real-World Robotic Reinforcement Learning
cs.RO · 2026-07 · unverdicted · novelty 6.0

AutoSERL achieves strong performance on six real-world robot manipulation tasks using RL guided by a single demonstration via sliding-window intervention, safety recovery, and automatic termination.