R3m: A universal visual representation for robot manipulation

Suraj Nair, Aravind Rajeswaran, Vikash Kumar, Chelsea Finn, Abhinav Gupta · 2022

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

browse 3 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

$\pi_{0.5}$: a Vision-Language-Action Model with Open-World Generalization

cs.LG · 2025-04-22 · unverdicted · novelty 6.0

π_{0.5} is a VLA model that achieves long-horizon dexterous manipulation in entirely new homes through co-training on heterogeneous tasks and multi-source data including web and semantic predictions.

Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success

cs.RO · 2025-02-27 · accept · novelty 6.0

OpenVLA-OFT fine-tuning boosts LIBERO success rate from 76.5% to 97.1%, speeds action generation 26x, and outperforms baselines on real bimanual dexterous tasks.

Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation

cs.RO · 2024-12-19 · conditional · novelty 6.0

Seer, a transformer-based PIDM pre-trained on large robotic datasets like DROID, outperforms prior methods on simulation and real-world robotic manipulation benchmarks with gains up to 43%.

citing papers explorer

Showing 3 of 3 citing papers.

$\pi_{0.5}$: a Vision-Language-Action Model with Open-World Generalization cs.LG · 2025-04-22 · unverdicted · none · ref 58
π_{0.5} is a VLA model that achieves long-horizon dexterous manipulation in entirely new homes through co-training on heterogeneous tasks and multi-source data including web and semantic predictions.
Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success cs.RO · 2025-02-27 · accept · none · ref 33
OpenVLA-OFT fine-tuning boosts LIBERO success rate from 76.5% to 97.1%, speeds action generation 26x, and outperforms baselines on real bimanual dexterous tasks.
Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation cs.RO · 2024-12-19 · conditional · none · ref 30
Seer, a transformer-based PIDM pre-trained on large robotic datasets like DROID, outperforms prior methods on simulation and real-world robotic manipulation benchmarks with gains up to 43%.

R3m: A universal visual representation for robot manipulation

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer