FiLM: Visual Reasoning with a General Conditioning Layer
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
The CARM module boosts neural routing solvers by adaptively modulating embeddings with constraint variables, enabling better use of global observations and improved performance on constrained VRPs.
Citing papers explorer
- VEGA: Visual Encoder Grounding Alignment for Spatially-Aware Vision-Language-Action Models
  VEGA improves spatial reasoning in VLA models for robotics by aligning visual encoder features with 3D-supervised DINOv2 representations via a temporary projector and cosine similarity loss.
- Rethinking Constraint Awareness for Efficient State Embedding of Neural Routing Solver
  The CARM module boosts neural routing solvers by adaptively modulating embeddings with constraint variables, enabling better use of global observations and improved performance on constrained VRPs.