Cogagent: A visual language model for gui agents

Wenyi Hong, Weihan Wang, Qingsong Lv, Jiazheng Xu, Wenmeng Yu, Junhui Ji, Yan Wang, Zihan Wang, Yuxuan Zhang, Juanzi Li, Bin Xu, Yuxiao Dong, Ming Ding, Jie Tang · 2024

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

browse 2 citing papers

representative citing papers

CutVerse: A Compositional GUI Agents Benchmark for Media Post-Production Editing

cs.CV · 2026-05-19 · unverdicted · novelty 7.0

CutVerse benchmark evaluates GUI agents on 186 complex media post-production tasks in seven apps and reports 36% success rate for existing models.

General Agentic Planning Through Simulative Reasoning with World Models

cs.AI · 2025-07-31 · conditional · novelty 6.0

SiRA uses LLM world models for simulative reasoning to achieve up to 124% higher task completion and 32.2% navigation success versus reactive baselines in web environments.

citing papers explorer

Showing 2 of 2 citing papers.

CutVerse: A Compositional GUI Agents Benchmark for Media Post-Production Editing cs.CV · 2026-05-19 · unverdicted · none · ref 11
CutVerse benchmark evaluates GUI agents on 186 complex media post-production tasks in seven apps and reports 36% success rate for existing models.
General Agentic Planning Through Simulative Reasoning with World Models cs.AI · 2025-07-31 · conditional · none · ref 39
SiRA uses LLM world models for simulative reasoning to achieve up to 124% higher task completion and 32.2% navigation success versus reactive baselines in web environments.

Cogagent: A visual language model for gui agents

fields

years

verdicts

representative citing papers

citing papers explorer