arxiv: 2512.00062 · v2 · pith:5HH5HIJEnew · submitted 2025-11-24 · 💻 cs.RO · cs.AI · cs.LG

SpeedAug: Policy Acceleration via Tempo-Enriched Policy and RL Fine-Tuning

classification 💻 cs.RO cs.AI cs.LG
keywords execution, policy, speedaug, acceleration, task, demonstrations, learning, manipulation

Robotic policy learning for complex real-world manipulation tasks has seen rapid recent progress, enabled in large part by the ability to collect demonstrations through human operation. However, policies trained from such demonstrations often execute tasks far more slowly than the robot's physical capabilities, as demonstration data is collected under practical constraints that favor conservative, success-oriented trajectories over execution speed. Existing policy acceleration methods determine execution tempo through data preprocessing or heuristic rules, rather than learning execution speed optimized for the task. In this paper, we propose SpeedAug, a policy acceleration framework that enables policies to learn task-optimal execution tempo via reinforcement learning (RL). SpeedAug first learns a tempo-enriched prior policy from speed-augmented demonstrations that captures diverse execution tempos. Building on this tempo-enriched prior, RL fine-tuning guides exploration to refine action trajectories and optimize execution tempo efficiently. Experiments on robotic manipulation benchmarks demonstrate that SpeedAug substantially improves the sample efficiency of policy acceleration while maintaining high success rates, achieving fast and stable task execution. Applied to a real-world manipulation task, SpeedAug improves task throughput by 1.8x using only 16 minutes of online interactions without compromising the success rate.
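
To make the idea of speed-augmented demonstrations concrete, here is a minimal sketch of one way a demonstration could be retimed. The abstract does not say how SpeedAug builds its augmented data, so everything below is an assumption for illustration: the function name `speed_augment`, the waypoint representation (a list of position vectors replayed at a fixed control rate), and the use of linear interpolation.

```python
from typing import List, Sequence


def speed_augment(traj: Sequence[Sequence[float]], speed: float) -> List[List[float]]:
    """Retime a demonstration so that it plays back `speed` times faster.

    Hypothetical helper, not the paper's method. `traj` is a list of
    waypoints sampled at a fixed control rate. Fewer output waypoints at
    the same rate means faster execution; more means slower.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    n = len(traj)
    if n < 2:
        raise ValueError("need at least two waypoints")

    # Number of waypoints after retiming; the endpoints are always kept.
    new_n = max(2, round((n - 1) / speed) + 1)

    out = []
    for k in range(new_n):
        # Fractional index into the original trajectory.
        pos = k * (n - 1) / (new_n - 1)
        i = min(int(pos), n - 2)
        frac = pos - i
        # Linear interpolation between neighbouring waypoints.
        out.append([(1 - frac) * a + frac * b for a, b in zip(traj[i], traj[i + 1])])
    return out


# A toy 11-step demonstration moving along a straight line in 2-D.
demo = [[float(i), 2.0 * i] for i in range(11)]

# Build a small tempo-diverse dataset from the single demonstration.
augmented = {s: speed_augment(demo, s) for s in (1.0, 1.5, 2.0)}
```

Under these assumptions, a prior policy trained on `augmented` would see the same motion at several tempos, which is the kind of tempo diversity the abstract attributes to the tempo-enriched prior. Linear interpolation only suits position-like quantities; orientations, gripper commands, and observations would need their own handling.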

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. TempoVLA: Learning Speed-Controllable Vision-Language-Action Policies

    cs.RO 2026-06 unverdicted novelty 6.0

    TempoVLA learns a single VLA policy with controllable execution speed via variable-speed trajectory augmentation and explicit speed conditioning.

  2. VOLT: Vision and Language Trajectory Segmentation for Faster-than-Demonstration Policies

    cs.RO 2026-06 unverdicted novelty 6.0

    VOLT is a vision-and-language trajectory segmentation method that selectively downsamples non-critical segments of demonstrations to train faster-than-demonstration imitation learning policies.