MME-RealWorld is the largest manually annotated high-resolution benchmark for MLLMs, where even the best models achieve less than 60% accuracy on challenging real-world tasks.
A survey for founda- tion models in autonomous driving
3 Pith papers cite this work. Polarity classification is still indexing.
3
Pith papers citing it
citation-role summary
background 1
citation-polarity summary
fields
cs.CV 3roles
background 1polarities
background 1representative citing papers
SCP defines a new benchmark task for predicting spatial causal outcomes beyond direct observation and shows that 23 leading models lag far behind humans on it.
SoccerMaster is the first soccer-specific vision foundation model that unifies tasks from player detection to event classification via multi-task pretraining and outperforms task-specific models on downstream evaluations.
citing papers explorer
-
SCP: Spatial Causal Prediction in Video
SCP defines a new benchmark task for predicting spatial causal outcomes beyond direct observation and shows that 23 leading models lag far behind humans on it.