End-to-end training of deep visuomotor policies
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
roles: method 1
citation-polarity summary
polarities: use method 1
representative citing papers
Potential-based reward shaping preserves optimality of stochastic policies and accelerates learning when added to soft Q-learning and advantage actor-critic algorithms.
Deep reinforcement learning learns robust policies for flexible robots but is sensitive to sensor choice.
citing papers explorer
- Bridge Data: Boosting Generalization of Robotic Skills with Cross-Domain Datasets
  A large multi-task multi-domain robot dataset combined with 50 new demonstrations yields 2x higher success rates on never-before-seen tasks in new domains.
- Potential-Based Advice for Stochastic Policy Learning
  Potential-based reward shaping preserves optimality of stochastic policies and accelerates learning when added to soft Q-learning and advantage actor-critic algorithms.
- On Training Flexible Robots using Deep Reinforcement Learning
  Deep reinforcement learning learns robust policies for flexible robots but is sensitive to sensor choice.