VenusBench-Mobile is a new benchmark that exposes large performance gaps in mobile GUI agents on realistic user-centric tasks, with failures dominated by perception and memory deficiencies and near-zero success under environment changes.
There is a picture named fruit.png. Use the Draw app to open the image, circle the banana with a red pen, and stay on the screen after the task is completed.
1 Pith paper cites this work.
Fields: cs.HC (1). Years: 2026 (1). Verdicts: UNVERDICTED (1).
Representative citing papers:
VenusBench-Mobile: A Challenging and User-Centric Benchmark for Mobile GUI Agents with Capability Diagnostics