CLIP2Scene: Towards Label-efficient 3D Scene Understanding by CLIP
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
OpenTrack3D: Towards Accurate and Generalizable Open-Vocabulary 3D Instance Segmentation
OpenTrack3D achieves state-of-the-art open-vocabulary 3D instance segmentation by generating cross-view consistent proposals online with a visual-spatial tracker and replacing CLIP with an MLLM for improved compositional reasoning.