Relationship-aware hierarchical 3D scene graph for task reasoning
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.RO (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Vision-Language Model Reasoning for Contextual Semantic Mapping in Intralogistics
A zero-shot pipeline using SLAM, SAM segmentation, clustering and VLM multi-view reasoning produces semantic maps with object class and movability labels, reporting 98.93% mIoU and 89.17% mAcc on intralogistics data.
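For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how mIoU and mAcc are commonly computed from a confusion matrix. It assumes the usual convention (per-class intersection-over-union and per-class recall, each averaged over the classes present in the ground truth); the citing paper may define or average them differently. The function name `miou_and_macc` and the toy matrix are hypothetical and are not taken from the paper.

```python
def miou_and_macc(confusion):
    """Return (mIoU, mAcc) for a square confusion matrix.

    confusion[i][j] counts samples with ground-truth class i
    that were predicted as class j. Classes with no ground-truth
    samples are skipped (an assumed convention).
    """
    n = len(confusion)
    ious, accs = [], []
    for c in range(n):
        tp = confusion[c][c]                            # correctly labelled as c
        gt = sum(confusion[c])                          # all ground-truth c
        pred = sum(confusion[r][c] for r in range(n))   # all predicted c
        if gt == 0:
            continue
        ious.append(tp / (gt + pred - tp))              # intersection over union
        accs.append(tp / gt)                            # per-class accuracy (recall)
    return sum(ious) / len(ious), sum(accs) / len(accs)


# Toy two-class example (made-up counts, for illustration only).
toy_confusion = [[8, 2],
                 [1, 9]]
miou, macc = miou_and_macc(toy_confusion)
print(f"mIoU = {miou:.4f}, mAcc = {macc:.4f}")
```

Under this convention the two numbers need not be ordered in any particular way, since each is averaged over different per-class ratios.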