TASOT performs annotation-free surgical temporal segmentation by extending ASOT with temporally aligned textual captions generated by a vision-language model (VLM). The captions are fused with DINOv3 and CLIP features in an unbalanced Gromov-Wasserstein optimal transport objective, with reported F1 gains of +18.9 to +33.7 over zero-shot baselines on
International Journal of Computer Assisted Radiology and Surgery 19(11), 2249–2257 (2024)
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Multimodal Optimal Transport for Training-free Temporal Segmentation in Surgical Robotics