Failure-Aware RL: Reliable Offline-to-Online Reinforcement Learning with Self-Recovery for Real-World Manipulation

· 2026 · arXiv 2601.07821

10 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 1 · other: 1

citation-polarity summary

background: 1 · unclear: 1

representative citing papers

SafeManip: A Property-Driven Benchmark for Temporal Safety Evaluation in Robotic Manipulation

cs.RO · 2026-05-12 · unverdicted · novelty 7.0 · 2 refs

SafeManip is a benchmark that applies reusable LTLf templates across eight safety categories to evaluate temporal safety properties of VLA policies in robotic manipulation.

DreamAvoid: Critical-Phase Test-Time Dreaming to Avoid Failures in VLA Policies

cs.RO · 2026-05-12 · unverdicted · novelty 7.0

DreamAvoid uses a Dream Trigger, Action Proposer, and Dream Evaluator trained on success/failure/boundary data to let VLA policies avoid critical-phase failures via test-time future dreaming.

One Demonstration Is Enough for Real-World Robotic Reinforcement Learning

cs.RO · 2026-07-02 · unverdicted · novelty 6.0

AutoSERL achieves strong performance on six real-world robot manipulation tasks using RL guided by a single demonstration via sliding-window intervention, safety recovery, and automatic termination.

Robot Critics that Sweat the Small Stuff

cs.RO · 2026-06-19 · unverdicted · novelty 6.0

Fine-tuning VLMs with pairwise progress supervision from policy rollouts improves fine-grained failure detection and boosts robot manipulation success by 11% in the real world and 5.9% in simulation.

UniIntervene: Agentic Intervention for Efficient Real-World Reinforcement Learning

cs.RO · 2026-06-10 · unverdicted · novelty 6.0

UniIntervene uses future-conditioned action-value estimation and a temporal value-risk critic to trigger memory-based recovery interventions, reporting 8.6% higher success rates and 57% fewer human interventions than prior HiL-RL methods on real manipulation tasks.

AEGIS: A Backup Reflex for Physical AI

cs.AI · 2026-06-04 · unverdicted · novelty 6.0

AEGIS uses activation probes for early-warning detection of high-risk steps in weak policies and selectively escalates to stronger policies, recovering 10.1% of lost trajectories on LIBERO-Spatial while activating the strong policy on only 38% of steps.

Beyond Action Residuals: Real-World Robot Policy Steering via Bottleneck Latent Reinforcement Learning

cs.RO · 2026-05-19 · unverdicted · novelty 6.0

ZPRL adapts frozen flow-matching imitation policies via RL perturbations on a task-relevant bottleneck latent, yielding 33.7% higher average success on four real-world manipulation tasks than action-residual baselines.

Learning-augmented robotic automation for real-world manufacturing

cs.RO · 2026-04-24 · conditional · novelty 6.0

A learning-augmented robotic system automated deformable cable insertion and soldering on a live electric-motor production line for 5 hours 10 minutes, producing 108 motors at 99.4% pass rate with under 20 minutes of real-world data per task and no physical fencing.

FAR: Failure-Aware Retry for Test-Time Recovery and Continual Policy Improvement

cs.RO · 2026-07-01 · unverdicted · novelty 4.0

FAR combines failure-contrastive preference adaptation with action perturbations for test-time recovery and continual policy improvement, reporting 17.6% and 11.7% success gains over diffusion policies in simulation and real-world manipulation tasks.

Rule-based High-Level Coaching for Goal-Conditioned Reinforcement Learning in Search-and-Rescue UAV Missions Under Limited-Simulation Training

cs.RO · 2026-04-29 · unverdicted · novelty 4.0

Rule-based high-level guidance combined with goal-conditioned reinforcement learning enables safer and more efficient online adaptation for UAV search-and-rescue tasks under limited simulation training.
