DreamAvoid uses a Dream Trigger, Action Proposer, and Dream Evaluator trained on success/failure/boundary data to let VLA policies avoid critical-phase failures via test-time future dreaming.
Evolve-vla: Test-time training from environment feedback for vision-language-action models. arXiv preprint arXiv:2512.14666
9 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years: 2026 (9)
verdicts: UNVERDICTED (9)
roles: background (4)
polarities: background (4)
representative citing papers
T^2VLA is a test-time reinforcement learning framework for VLAs that uses internal confidence to define intrinsic rewards via similarity to high-confidence expert demonstrations and a dual-expert bootstrapping mechanism.
GRA extracts 2D waypoints from synthetic videos to supervise VLA vision while restricting action training to real data, outperforming pseudo-action baselines on real-robot tasks.
Agentic-VLA enables efficient online adaptation of VLA models, delivering +12.3% on long-horizon tasks, +28.5% in 1-shot learning, and 2.4x faster convergence on LIBERO through three new components.
Anchor-Centric Adaptation escapes the diversity trap by prioritizing repeated demonstrations at core anchors over broad coverage, yielding higher success rates under fixed data budgets in robotic manipulation.
PALM improves long-horizon robotic manipulation success by distilling affordance representations for object interaction and predicting within-subtask progress in a VLA model.
Action-state consistency in World Action Models distinguishes successful from failed imagined futures and supports value-free selection of better rollouts via consensus among predictions.
T³VF applies test-time training on natural future-prediction supervision pairs with adaptive filtering to mitigate OOD shifts in VF-VLA models at modest extra inference cost.
FAR combines failure-contrastive preference adaptation with action perturbations for test-time recovery and continual policy improvement, reporting 17.6% and 11.7% success gains over diffusion policies in simulation and real-world manipulation tasks.