VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?

Yuanxin Liu, Kun Ouyang, Haoning Wu, Yi Liu, Lin Sui, Xinhao Li, Yan Zhong, Y Charles, Xinyu Zhou, Xu Sun · 2025 · arXiv 2505.23359

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

RefereeBench: Are Video MLLMs Ready to be Multi-Sport Referees

cs.CV · 2026-04-17 · unverdicted · novelty 8.0

RefereeBench shows that even the strongest video MLLMs reach only around 60% accuracy on multi-sport refereeing tasks and struggle with rule application and temporal grounding.

Video Understanding Reward Modeling: A Robust Benchmark and Performant Reward Models

cs.CV · 2026-05-08 · unverdicted · novelty 6.0

Introduces the VURB benchmark and the VUP-35K dataset, used to train discriminative and generative video reward models that achieve SOTA performance on VURB and VideoRewardBench.

Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding

cs.CV · 2026-04-06 · unverdicted · novelty 6.0

Video-MME-v2 is a new benchmark that applies progressive visual-to-reasoning levels and non-linear group scoring to expose gaps in video MLLM capabilities.

EasyVideoR1: Easier RL for Video Understanding

cs.CV · 2026-04-18 · unverdicted · novelty 4.0

EasyVideoR1 delivers an optimized RL pipeline for video understanding in large vision-language models, achieving 1.47x throughput gains and aligned results on 22 benchmarks.

citing papers explorer

Showing 4 of 4 citing papers.

RefereeBench: Are Video MLLMs Ready to be Multi-Sport Referees
cs.CV · 2026-04-17 · unverdicted · none · ref 30

Video Understanding Reward Modeling: A Robust Benchmark and Performant Reward Models
cs.CV · 2026-05-08 · unverdicted · none · ref 19

Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding
cs.CV · 2026-04-06 · unverdicted · none · ref 18

EasyVideoR1: Easier RL for Video Understanding
cs.CV · 2026-04-18 · unverdicted · none · ref 24
