BabyCL learns word-referent mappings from egocentric video in a single chronological pass via streaming visual learning, dual replay, and three contrastive losses, outperforming streaming baselines on the SAYCam 4AFC benchmark.
In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9777–9786
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
EgoEverything is a new benchmark for long-context egocentric video understanding that uses human gaze-based attention signals to generate questions reflecting natural behavior.
citing papers explorer
- Continual Visual and Verbal Learning Through a Child's Egocentric Input