SpeedAug: Policy Acceleration via Tempo-Enriched Policy and RL Fine-Tuning

Junmo Cho; Sung Ju Hwang; Taewook Nam; Youngsoo Jang

arxiv: 2512.00062 · v2 · pith:5HH5HIJEnew · submitted 2025-11-24 · cs.RO · cs.AI · cs.LG

Taewook Nam, Junmo Cho, Youngsoo Jang, Sung Ju Hwang

classification cs.RO · cs.AI · cs.LG

keywords execution · policy · speedaug · acceleration · task · demonstrations · learning · manipulation

Robotic policy learning for complex real-world manipulation tasks has seen rapid recent progress, enabled in large part by the ability to collect demonstrations through human operation. However, policies trained from such demonstrations often execute tasks far more slowly than the robot's physical capabilities, as demonstration data is collected under practical constraints that favor conservative, success-oriented trajectories over execution speed. Existing policy acceleration methods determine execution tempo through data preprocessing or heuristic rules, rather than learning execution speed optimized for the task. In this paper, we propose SpeedAug, a policy acceleration framework that enables policies to learn task-optimal execution tempo via reinforcement learning (RL). SpeedAug first learns a tempo-enriched prior policy from speed-augmented demonstrations that captures diverse execution tempos. Building on this tempo-enriched prior, RL fine-tuning guides exploration to refine action trajectories and optimize execution tempo efficiently. Experiments on robotic manipulation benchmarks demonstrate that SpeedAug substantially improves the sample efficiency of policy acceleration while maintaining high success rates, achieving fast and stable task execution. Applied to a real-world manipulation task, SpeedAug improves task throughput by 1.8x using only 16 minutes of online interactions without compromising the success rate.
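The abstract does not say how the speed-augmented demonstrations are built. As an illustration only, one plausible reading is that each demonstration is retimed to several tempos by resampling it along its time axis. The sketch below assumes a demonstration is a fixed-rate list of absolute position targets and uses plain linear interpolation. The function names `resample_trajectory` and `speed_augment` and the speed factors are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def resample_trajectory(
    waypoints: Sequence[Sequence[float]], speed: float
) -> List[List[float]]:
    """Retime a fixed-rate trajectory so it plays back `speed` times faster.

    Assumption: waypoints are absolute position targets sampled at a constant
    control rate, so linear interpolation between neighbours is meaningful.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    n = len(waypoints)
    if n < 2:
        return [list(w) for w in waypoints]

    # Number of control steps after retiming; first and last waypoints are kept.
    new_len = max(2, round((n - 1) / speed) + 1)

    out: List[List[float]] = []
    for k in range(new_len):
        # Location of the k-th new step on the original (index) time axis.
        t = k * (n - 1) / (new_len - 1)
        i = min(int(t), n - 2)
        frac = t - i
        a, b = waypoints[i], waypoints[i + 1]
        out.append([x + frac * (y - x) for x, y in zip(a, b)])
    return out


def speed_augment(
    demo: Sequence[Sequence[float]],
    speeds: Sequence[float] = (1.0, 1.5, 2.0),  # hypothetical tempo factors
) -> Dict[float, List[List[float]]]:
    """Return one retimed copy of `demo` per speed factor."""
    return {s: resample_trajectory(demo, s) for s in speeds}


if __name__ == "__main__":
    demo = [[0.0], [1.0], [2.0], [3.0], [4.0]]
    for s, traj in speed_augment(demo).items():
        print(s, len(traj))
```

Under these assumptions, a 2x copy of a five-step demonstration keeps the start and end targets and drops to three steps. A prior policy trained on such copies would see the same task at several tempos, which is the property the abstract attributes to the tempo-enriched prior. Delta-action or velocity-action datasets would need a different retiming rule.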

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

TempoVLA: Learning Speed-Controllable Vision-Language-Action Policies
cs.RO 2026-06 unverdicted novelty 6.0

TempoVLA learns a single VLA policy with controllable execution speed via variable-speed trajectory augmentation and explicit speed conditioning.

VOLT: Vision and Language Trajectory Segmentation for Faster-than-Demonstration Policies
cs.RO 2026-06 unverdicted novelty 6.0

VOLT is a vision-and-language trajectory segmentation method that selectively downsamples non-critical segments of demonstrations to train faster-than-demonstration imitation learning policies.