LINGO-Space: Language-conditioned incremental grounding for space
3 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
citing papers explorer
- Language as a Sensor: Calibrated Spatial Belief Estimation in 3D Scenes from Natural Language
  Introduces LSM that outputs calibrated multimodal spatial distributions from language plus scene graph, fused via VL-Map to improve 3D target localization on VLA-3D benchmark and real robot.
- Enabling Extensible Embodied Capabilities with Tools
  Introduces Embodied Tool Protocol and tool externalization to improve embodied AI performance on perception and cognition tasks, with measured gains but limits on execution capabilities.
- Graph-Enhanced Large Language Models for Spatial Search
  The paper identifies gaps in LLM spatial reasoning and advocates graph-enhanced approaches for future spatial search systems.