EgoTempo: A benchmark for egocentric video question answering requiring temporal reasoning
3 Pith papers cite this work. Polarity classification is still indexing.
citing papers explorer
- VideoKR: Towards Knowledge- and Reasoning-Intensive Video Understanding
  VideoKR supplies 315K knowledge-intensive video reasoning examples and a dedicated benchmark, with experiments indicating post-training gains on reasoning tasks that require both video content and external knowledge.
- EGOSTREAM: A Diagnostic Benchmark for Streaming Episodic Memory in Egocentric Vision
  Egostream introduces a diagnostic benchmark that expands 2,250 questions into 8,528 recall-conditioned evaluations to measure streaming episodic memory performance across detail, spatial, temporal, event, social, causal, and prospective dimensions in egocentric vision.
- Flattery in Motion: Benchmarking and Analyzing Sycophancy in Video-LLMs
  VISE is the first benchmark for sycophancy in Video-LLMs, with two training-free mitigation strategies based on key-frame selection and internal representation steering.