Segment Anything
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
roles: background (1)
polarities: background (1)
citing papers explorer
- Robust Promptable Video Object Segmentation
The paper creates a real-world corruption benchmark for promptable video object segmentation and proposes MoGA, which uses object-specific memory to improve robustness and temporal consistency under adverse conditions.
- Bootstrapping Video Semantic Segmentation Model via Distillation-assisted Test-Time Adaptation
DiTTA distills SAM2 temporal segmentation knowledge into image models via efficient test-time adaptation and a lightweight fusion module to produce annotation-free video semantic segmentation that matches or exceeds fully supervised performance.
- Bridge: Basis-Driven Causal Inference Marries VFMs for Domain Generalization
Bridge learns low-rank bases for front-door causal adjustment to remove spurious correlations from domain shifts and integrates the approach with vision foundation models for improved object detection generalization.