A large multi-task multi-domain robot dataset combined with 50 new demonstrations yields 2x higher success rates on never-before-seen tasks in new domains.
End-to-end training of deep visuomotor policies
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: method 1
citation-polarity summary: use method 1
representative citing papers
Potential-based reward shaping preserves optimality of stochastic policies and accelerates learning when added to soft Q-learning and advantage actor-critic algorithms.
Deep reinforcement learning learns robust policies for flexible robots but is sensitive to sensor choice.