MeViS: A multi-modal dataset for referring motion expression video segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence
3 Pith papers cite this work.
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
citing papers explorer
- OAMVOS: 2nd Report for 5th PVUW MOSE Track
  An occlusion-aware extension to DAM4SAM adds a reliability state machine, branch-based recovery, delayed memory promotion, and selective native memory rules to improve robustness under long occlusions and reappearances without altering the backbone.
- APRVOS: 1st Place Winner of 5th PVUW MeViS-Audio Track
  A staged pipeline using ASR transcription, visual existence verification, Sa2VA coarse segmentation, and agent-guided SAM3 refinement won first place in the PVUW MeViS-Audio track by decomposing audio-conditioned Ref-VOS into sequential verification and refinement steps.
- AgentRVOS for MeViS-Text Track of 5th PVUW Challenge: 3rd Method
  An agent-augmented Sa2VA pipeline for referring video object segmentation placed third in the MeViS-Text track of the 5th PVUW Challenge by adding verification, search, and refinement stages.