International Journal of Computer Vision 127(3), 302–321 (2019)
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
citing papers explorer
- Diffusion Model as a Generalist Segmentation Learner
DiGSeg repurposes diffusion U-Nets as generalist segmentation learners by conditioning on image-mask latents and multi-scale CLIP text features, achieving strong cross-domain performance.
- AdaVFM: Adaptive Vision Foundation Models for Edge Intelligence via LLM-Guided Execution
AdaVFM integrates neural architecture search into vision foundation model backbones and uses a cloud multimodal LLM agent to enable runtime-adaptive lightweight subnet execution, delivering up to 7.9% higher accuracy and 77.9% lower FLOPs than fixed-size baselines on edge devices.