LIBERO and CALVIN fail multiple proposed diagnostics for shortcut solvability, statistical significance, overfitting, and data dependence, while a tiny 0.09B probe reaches near-SOTA on LIBERO.
Simvla: A simple VLA baseline for robotic manipulation. arXiv preprint arXiv:2602.18224, 2026.
3 Pith papers cite this work.
years: 2026 (3)
representative citing papers
QuoVLA introduces a quotient-space framework that compresses VLM latents into action-sufficient representations via quantization and dual-branch design for better VLA generalization.
Evo-Depth is a compact VLA model using a lightweight implicit depth encoder from RGB views plus progressive alignment to boost manipulation performance without added hardware.
citing papers explorer
- QuoVLA: Quotient Space for Vision-Language-Action Models
- Evo-Depth: A Lightweight Depth-Enhanced Vision-Language-Action Model