pith. machine review for the scientific record.

arxiv: 1703.07326 · v3 · submitted 2017-03-21 · 💻 cs.AI · cs.LG · cs.NE · cs.RO

Recognition: unknown

One-Shot Imitation Learning

Bradly C. Stadie, Ilya Sutskever, Jonas Schneider, Jonathan Ho, Marcin Andrychowicz, Pieter Abbeel, Wojciech Zaremba, Yan Duan

Authors on Pith: no claims yet
classification 💻 cs.AI · cs.LG · cs.NE · cs.RO
keywords: task, tasks, demonstration, different, blocks, demonstrations, imitation, learning
Original abstract

Imitation learning has been commonly applied to solve different tasks in isolation. This usually requires either careful feature engineering, or a significant number of samples. This is far from what we desire: ideally, robots should be able to learn from very few demonstrations of any given task, and instantly generalize to new situations of the same task, without requiring task-specific engineering. In this paper, we propose a meta-learning framework for achieving such capability, which we call one-shot imitation learning. Specifically, we consider the setting where there is a very large set of tasks, and each task has many instantiations. For example, a task could be to stack all blocks on a table into a single tower, another task could be to place all blocks on a table into two-block towers, etc. In each case, different instances of the task would consist of different sets of blocks with different initial states. At training time, our algorithm is presented with pairs of demonstrations for a subset of all tasks. A neural net is trained that takes as input one demonstration and the current state (which initially is the initial state of the other demonstration of the pair), and outputs an action with the goal that the resulting sequence of states and actions matches as closely as possible with the second demonstration. At test time, a demonstration of a single instance of a new task is presented, and the neural net is expected to perform well on new instances of this new task. The use of soft attention allows the model to generalize to conditions and tasks unseen in the training data. We anticipate that by training this model on a much greater variety of tasks and settings, we will obtain a general system that can turn any demonstrations into robust policies that can accomplish an overwhelming variety of tasks. Videos available at https://bit.ly/nips2017-oneshot .
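The abstract describes a policy network that conditions on one full demonstration plus the current state and emits an action, with soft attention letting the model pick out the relevant part of the demonstration. A minimal toy sketch of that idea (not the paper's actual architecture — the function, dimensions, and use of raw dot-product attention over demonstration timesteps are all illustrative assumptions):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def one_shot_policy(demo_states, demo_actions, current_state, temperature=1.0):
    """Toy stand-in for the paper's learned attention network:
    use the current state as a query over demonstration states,
    then return the attention-weighted blend of demonstrated actions."""
    # similarity between the current state and each demonstration timestep
    scores = demo_states @ current_state / temperature
    weights = softmax(scores)        # soft attention over the demonstration
    return weights @ demo_actions    # attention-weighted action readout

# hypothetical demonstration: 4 timesteps, 3-dim states, 2-dim actions
demo_states = np.array([[1., 0., 0.],
                        [0., 1., 0.],
                        [0., 0., 1.],
                        [1., 1., 0.]])
demo_actions = np.array([[1., 0.],
                         [0., 1.],
                         [-1., 0.],
                         [0., -1.]])
state = np.array([1., 0., 0.])
action = one_shot_policy(demo_states, demo_actions, state, temperature=0.1)
```

In the paper this attention is learned end-to-end inside a neural network trained across many tasks; the sketch only shows the conditioning structure — demonstration in, current state in, action out.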

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read Pith papers without signing in.

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware

    cs.RO 2023-04 conditional novelty 7.0

    Low-cost imprecise robots achieve 80-90% success on six fine bimanual manipulation tasks using imitation learning with a new Action Chunking with Transformers algorithm trained on only 10 minutes of demonstrations.

  2. Graph Attention Networks

    stat.ML 2017-10 accept novelty 7.0

    Graph Attention Networks compute learnable attention coefficients over node neighborhoods to produce weighted feature aggregations, achieving state-of-the-art results on citation networks and inductive protein-protein...

  3. SID: Sliding into Distribution for Robust Few-Demonstration Manipulation

    cs.RO 2026-05 unverdicted novelty 6.0

    SID achieves approximately 90% success on six real-world manipulation tasks with only two demonstrations under out-of-distribution initializations, with less than 10% performance drop under distractors and disturbances.

  4. Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation

    cs.RO 2024-01 conditional novelty 6.0

    A low-cost whole-body teleoperation system enables effective imitation learning for complex bimanual mobile manipulation by co-training on mobile and static demonstration datasets.

  5. Bridge Data: Boosting Generalization of Robotic Skills with Cross-Domain Datasets

    cs.RO 2021-09 accept novelty 6.0

    A large multi-task multi-domain robot dataset combined with 50 new demonstrations yields 2x higher success rates on never-before-seen tasks in new domains.