MIA-Bench: Towards better instruction following evaluation of multimodal LLMs. arXiv preprint arXiv:2407.01509, 2024

Yusu Qian, Hanrong Ye, Jean-Philippe Fauconnier, Peter Grasch, Yinfei Yang, Zhe Gan · 2024 · arXiv 2407.01509

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

EVE: Verifiable Self-Evolution of MLLMs via Executable Visual Transformations

cs.CV · 2026-04-20 · unverdicted · novelty 8.0

EVE enables verifiable self-evolution of MLLMs by using a Challenger-Solver architecture to generate dynamic executable visual transformations that produce VQA problems with absolute execution-verified ground truth.

MMCL-Bench: Multimodal Context Learning from Visual Rules, Procedures, and Evidence

cs.CV · 2026-05-12 · unverdicted · novelty 6.0

MMCL-Bench shows that even the strongest frontier multimodal models solve fewer than one-third of tasks requiring recovery and application of visual rules, procedures, and empirical patterns.

citing papers explorer

EVE: Verifiable Self-Evolution of MLLMs via Executable Visual Transformations · cs.CV · 2026-04-20 · unverdicted · none · ref 31
MMCL-Bench: Multimodal Context Learning from Visual Rules, Procedures, and Evidence · cs.CV · 2026-05-12 · unverdicted · none · ref 20
