Vision-language-action model with open-world embodied reasoning from pretrained knowledge
4 Pith papers cite this work.
Fields: cs.RO (4) · Years: 2026 (4) · Verdicts: UNVERDICTED (4)
citing papers explorer
- VLA-ATTC: Adaptive Test-Time Compute for VLA Models with Relative Action Critic Model
  VLA-ATTC equips VLA models with adaptive test-time compute via an uncertainty clutch and a relative action critic, cutting failure rates by over 50% on LIBERO-LONG.
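The summary names two mechanisms: an uncertainty "clutch" that decides when to spend extra inference compute, and a critic that scores candidate actions relative to one another. A minimal sketch of that pattern follows; the gate threshold, 7-DoF action shape, and both function bodies are illustrative assumptions, not VLA-ATTC's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

def policy_entropy(probs):
    """Shannon entropy of the policy's action distribution."""
    p = np.clip(probs, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())

def sample_candidates(n):
    # Stand-in for sampling n candidate actions from a VLA head
    # (here: random 7-DoF action vectors).
    return rng.normal(size=(n, 7))

def relative_critic(candidates):
    # Stand-in critic: score each candidate relative to the batch,
    # i.e. a relative rather than absolute value estimate.
    return -np.linalg.norm(candidates - candidates.mean(axis=0), axis=1)

def act(probs, entropy_gate=1.0, extra_candidates=8):
    """Uncertainty 'clutch': spend extra compute only when entropy is high."""
    if policy_entropy(probs) < entropy_gate:
        return sample_candidates(1)[0]           # cheap path: single sample
    cands = sample_candidates(extra_candidates)  # expensive path: re-rank
    return cands[np.argmax(relative_critic(cands))]
```

A near-uniform action distribution trips the gate and triggers candidate re-ranking; a peaked one takes the single-sample path.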
- Sentinel-VLA: A Metacognitive VLA Model with Active Status Monitoring for Dynamic Reasoning and Error Recovery
  Sentinel-VLA introduces a metacognitive VLA model with a sentinel module for real-time status monitoring, dynamic reasoning, and error recovery, plus a self-evolving continual learning method, raising real-world task success by over 30% versus the prior SOTA.
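The control pattern described here, interleaving acting with a monitor that can trigger recovery, can be sketched generically. The `Status` fields, the anomaly gate, and the recovery hook are hypothetical placeholders, not Sentinel-VLA's actual interfaces.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Status:
    progress: float  # estimated task progress in [0, 1]
    anomaly: float   # anomaly score reported by the monitor

def run_with_sentinel(step: Callable[[], Status],
                      recover: Callable[[], None],
                      anomaly_gate: float = 0.8,
                      max_steps: int = 100) -> bool:
    """Act in a loop; when the sentinel flags an anomaly, run recovery
    instead of continuing blindly. Returns True on task completion."""
    for _ in range(max_steps):
        status = step()
        if status.anomaly > anomaly_gate:
            recover()            # e.g. re-plan or reset the gripper
            continue
        if status.progress >= 1.0:
            return True          # task judged complete by the monitor
    return False                 # budget exhausted
```

The key design point is that monitoring runs every step, so errors are caught as they happen rather than at episode end.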
- Embodied Interpretability: Linking Causal Understanding to Generalization in Vision-Language-Action Models
  Interventional attribution via ISS and NMR diagnoses causal misalignment in VLA policies and predicts their generalization performance across manipulation tasks.
- GS-Playground: A High-Throughput Photorealistic Simulator for Vision-Informed Robot Learning
  GS-Playground delivers a high-throughput photorealistic simulator for vision-informed robot learning via parallel physics integrated with batch 3D Gaussian Splatting at 10^4 FPS and an automated Real2Sim workflow for consistent environments.
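Throughput numbers like 10^4 FPS come from stepping and rendering all environments as one batched array operation rather than looping per environment. A toy sketch of that vectorized pattern, with made-up dynamics and a placeholder in lieu of a real Gaussian-Splatting renderer:

```python
import numpy as np

class BatchEnv:
    """Toy vectorized environment: every instance advances in a single
    array-level step before states are handed to a batched renderer."""

    def __init__(self, n_envs: int, dt: float = 0.01):
        self.n = n_envs
        self.dt = dt
        self.pos = np.zeros((n_envs, 3))
        self.vel = np.zeros((n_envs, 3))

    def step(self, actions: np.ndarray) -> np.ndarray:
        # One vectorized integration step for all environments at once.
        self.vel += actions * self.dt
        self.pos += self.vel * self.dt
        return self.pos

    def render_batch(self) -> np.ndarray:
        # Placeholder for batch rendering: one flat "image" per environment;
        # a real system would rasterize 3D Gaussians here in one GPU pass.
        return np.tanh(self.pos).reshape(self.n, -1)
```

Because `step` and `render_batch` touch whole arrays, cost grows with array size rather than Python loop iterations, which is what makes batched simulators fast.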