BabyCL learns word-referent mappings from egocentric video in a single chronological pass via streaming visual learning, dual replay, and three contrastive losses, outperforming streaming baselines on the SAYCam 4AFC benchmark.
In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9777–9786.
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
EgoEverything: A Benchmark for Human Behavior Inspired Long Context Egocentric Video Understanding in AR Environment
EgoEverything is a new benchmark for long-context egocentric video understanding that uses human gaze-based attention signals to generate questions reflecting natural behavior.