Video-MMLU: A massive multi-discipline lecture understanding benchmark

2025 · arXiv 2504.14693

3 Pith papers cite this work. Polarity classification is still being indexed.


representative citing papers

VideoKR: Towards Knowledge- and Reasoning-Intensive Video Understanding

cs.CV · 2026-06-03 · unverdicted · novelty 7.0

VideoKR supplies 315K knowledge-intensive video reasoning examples and a dedicated benchmark, with experiments indicating post-training gains on reasoning tasks that require both video content and external knowledge.

OneThinker: All-in-one Reasoning Model for Image and Video

cs.CV · 2025-12-02 · unverdicted · novelty 5.0

OneThinker unifies image and video reasoning in one model across 10 tasks via a 600k corpus, CoT-annotated SFT, and EMA-GRPO reinforcement learning, reporting strong results on 31 benchmarks plus some cross-task transfer.

Watch, Remember, Reason: Human-View Video Understanding with MLLMs

cs.CV · 2026-06-05 · unverdicted · novelty 4.0

This is a survey that frames video MLLM research via a human-view formulation of perceptual representations, memory states, reasoning traces, and predictions, then reviews methods, datasets, benchmarks, and open problems.

citing papers explorer

Showing 3 of 3 citing papers.

VideoKR: Towards Knowledge- and Reasoning-Intensive Video Understanding
cs.CV · 2026-06-03 · unverdicted · none · ref 13
OneThinker: All-in-one Reasoning Model for Image and Video
cs.CV · 2025-12-02 · unverdicted · none · ref 66
Watch, Remember, Reason: Human-View Video Understanding with MLLMs
cs.CV · 2026-06-05 · unverdicted · none · ref 252
