Scalable data synthesis for computer use agents with step-level filtering. arXiv preprint arXiv:2512.10962, 2025

Yifei He, Pranit Chawla, Yaser Souri, Subhojit Som, Xia Song · 2025 · arXiv 2512.10962

2 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents

cs.LG · 2026-06-01 · unverdicted · novelty 6.0

OpenWebRL trains a 4B visual web agent with online RL on live sites using 0.4K init trajectories and 2.2K RL tasks to reach 67% success on Online-Mind2Web and 64% on DeepShop, outperforming prior open agents.

Learn from Weaknesses: Automated Domain Specialization for Small Computer-Use Agents

cs.LG · 2026-05-27 · unverdicted · novelty 6.0

LearnWeak specializes small CUAs via weakness detection by a reference agent, targeted task synthesis, and error-aware training, delivering 11+ point gains on OSWorld.

citing papers explorer


OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents · cs.LG · 2026-06-01 · unverdicted · none · ref 20
Learn from Weaknesses: Automated Domain Specialization for Small Computer-Use Agents · cs.LG · 2026-05-27 · unverdicted · none · ref 11
