arXiv preprint arXiv:2508.12252
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.RO (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
representative citing papers
FADA is a three-stage Planner-IDM method that achieves few-shot domain adaptation for humanoid control by distilling an oracle policy then finetuning only the IDM on short target-domain rollouts via supervised learning.
Contrastive learning bounds the Lipschitz constant of a trajectory dynamics encoder to support outcome-centric zero-shot adaptation in MuJoCo robotics tasks under severe dynamics shifts.
citing papers explorer
- One Demonstration Is Enough for Real-World Robotic Reinforcement Learning
  AutoSERL achieves strong performance on six real-world robot manipulation tasks using RL guided by a single demonstration via sliding-window intervention, safety recovery, and automatic termination.
- FADA: Few-Shot Domain Adaptation via Dynamics Alignment for Humanoid Control
  FADA is a three-stage Planner-IDM method that achieves few-shot domain adaptation for humanoid control by distilling an oracle policy then finetuning only the IDM on short target-domain rollouts via supervised learning.
- Dynamics Are Learned, Not Told: Semi-Supervised Discovery of Latent Dynamics Geometries For Zero-Shot Policy Adaptation
  Contrastive learning bounds the Lipschitz constant of a trajectory dynamics encoder to support outcome-centric zero-shot adaptation in MuJoCo robotics tasks under severe dynamics shifts.