GEAR-VLA learns geometry-aware action representations via coarse-to-fine pretraining, gradient-decoupled DiT action expert, semantic-aligned 3D integration, and embodiment canonicalization, reporting SOTA results on LIBERO benchmarks and over 80% success on unseen embodiments and 212 unseen objects.
Dexgraspvla: A vision-language-action framework towards general dexterous grasping
2 Pith papers cite this work, alongside 3 external citations. Polarity classification is still indexing.
fields
cs.RO 2years
2026 2verdicts
UNVERDICTED 2representative citing papers
HandITL enables seamless human intervention in VLA policies for bimanual dexterous manipulation, cutting jitter by 99.8% and improving refined policies by 19% over standard teleoperation.
citing papers explorer
-
GEAR-VLA: Learning Geometry-Aware Action Representations for Generalizable Robotic Manipulation
GEAR-VLA learns geometry-aware action representations via coarse-to-fine pretraining, gradient-decoupled DiT action expert, semantic-aligned 3D integration, and embodiment canonicalization, reporting SOTA results on LIBERO benchmarks and over 80% success on unseen embodiments and 212 unseen objects.
-
Hand-in-the-Loop: Improving VLA Policies for Dexterous Manipulation via Seamless Hand-Arm Intervention
HandITL enables seamless human intervention in VLA policies for bimanual dexterous manipulation, cutting jitter by 99.8% and improving refined policies by 19% over standard teleoperation.