Learning transferable visual models from natural language supervision
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CV (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
GR4CIL introduces gap-compensated routing to enable reliable task-aware knowledge routing in CLIP-based class incremental learning while preserving zero-shot generalization.
citing papers explorer
- EvoGround: Self-Evolving Video Agents for Video Temporal Grounding
  A proposer-solver agent pair achieves supervised-level video temporal grounding and fine-grained captioning from 2.5K unlabeled videos via self-reinforcing evolution.
- GR4CIL: Gap-compensated Routing for CLIP-based Class Incremental Learning
  GR4CIL introduces gap-compensated routing to enable reliable task-aware knowledge routing in CLIP-based class incremental learning while preserving zero-shot generalization.