A survey on multimodal benchmarks: In the era of large ai models

Li, L · 2024 · arXiv 2409.18142

5 Pith papers cite this work. Polarity classification is still indexing.

5 Pith papers citing it

read on arXiv browse 5 citing papers

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

EquiMem: Calibrating Shared Memory in Multi-Agent Debate via Game-Theoretic Equilibrium

cs.AI · 2026-05-10 · unverdicted · novelty 7.0

EquiMem calibrates shared memory in multi-agent debate by computing a game-theoretic equilibrium from agent queries and paths, outperforming heuristics and LLM validators across benchmarks while remaining robust to adversarial agents.

Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs

cs.CV · 2026-05-01 · unverdicted · novelty 6.0 · 2 refs

PVM adds a parallel branch to LVLMs that directly supplies visual embeddings to prevent attention decay over long generated sequences, yielding accuracy gains on reasoning tasks with minimal overhead.

LENS: Multi-level Evaluation of Multimodal Reasoning with Large Language Models

cs.CV · 2025-05-21 · unverdicted · novelty 5.0

LENS is a new multi-level benchmark dataset for evaluating MLLMs on perception-to-reasoning tasks using the same images across all levels with recent social media content.

Large Language Model-Brained GUI Agents: A Survey

cs.AI · 2024-11-27 · unverdicted · novelty 4.0

A survey consolidating frameworks, data practices, large action models, benchmarks, applications, and research gaps in LLM-brained GUI agents.

Position: Multimodal Large Language Models Can Significantly Advance Scientific Reasoning

cs.CL · 2025-02-05 · unverdicted · novelty 2.0

Position paper claims multimodal LLMs can significantly advance scientific reasoning and proposes a four-stage roadmap plus challenges and suggestions.

citing papers explorer

Showing 5 of 5 citing papers.

EquiMem: Calibrating Shared Memory in Multi-Agent Debate via Game-Theoretic Equilibrium cs.AI · 2026-05-10 · unverdicted · none · ref 34
EquiMem calibrates shared memory in multi-agent debate by computing a game-theoretic equilibrium from agent queries and paths, outperforming heuristics and LLM validators across benchmarks while remaining robust to adversarial agents.
Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs cs.CV · 2026-05-01 · unverdicted · none · ref 40 · 2 links
PVM adds a parallel branch to LVLMs that directly supplies visual embeddings to prevent attention decay over long generated sequences, yielding accuracy gains on reasoning tasks with minimal overhead.
LENS: Multi-level Evaluation of Multimodal Reasoning with Large Language Models cs.CV · 2025-05-21 · unverdicted · none · ref 10
LENS is a new multi-level benchmark dataset for evaluating MLLMs on perception-to-reasoning tasks using the same images across all levels with recent social media content.
Large Language Model-Brained GUI Agents: A Survey cs.AI · 2024-11-27 · unverdicted · none · ref 57
A survey consolidating frameworks, data practices, large action models, benchmarks, applications, and research gaps in LLM-brained GUI agents.
Position: Multimodal Large Language Models Can Significantly Advance Scientific Reasoning cs.CL · 2025-02-05 · unverdicted · none · ref 95
Position paper claims multimodal LLMs can significantly advance scientific reasoning and proposes a four-stage roadmap plus challenges and suggestions.

A survey on multimodal benchmarks: In the era of large ai models

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer