Os-symphony: A holistic framework for robust and generalist computer-using agent

Os-symphony: A holistic framework for robust and generalist computer-using agent · 2024 · arXiv 2601.07779

5 Pith papers cite this work. Polarity classification is still being indexed.

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

MacAgentBench: Benchmarking AI Agents on Real-World macOS Desktop

cs.AI · 2026-06-21 · unverdicted · novelty 7.0

MacAgentBench is a new benchmark for macOS AI agents with 676 tasks, deterministic multi-checkpoint evaluation, and tests across frameworks showing skill libraries drive performance more than framework design.

A History-Aware Visually Grounded Critic for Computer Use Agents

cs.AI · 2026-06-09 · unverdicted · novelty 7.0

HiViG is a test-time critic that combines macro-action history summarization with visual grounding of execution coordinates to reduce short-sighted and visually erroneous actions in long-horizon GUI agents.

ToolCUA: Towards Optimal GUI-Tool Path Orchestration for Computer Use Agents

cs.AI · 2026-05-12 · unverdicted · novelty 6.0

ToolCUA introduces a trajectory scaling pipeline and staged RL to optimize GUI-tool switching, reaching 46.85% accuracy on OSWorld-MCP for a 66% relative gain over baseline.

VLAA-GUI: Knowing When to Stop, Recover, and Search, A Modular Framework for GUI Automation

cs.CL · 2026-04-23 · conditional · novelty 6.0

VLAA-GUI adds mandatory visual verifiers, multi-tier loop breakers, and on-demand search to GUI agents, reaching 77.5% on OSWorld and 61.0% on WindowsAgentArena with some models exceeding human performance.

Retrieval, Reward, and Training Protocols: What Matters in Training Search Agents?

cs.CL · 2026-05-27 · unverdicted · novelty 5.0

Controlled empirical study shows correcting Wikipedia data coverage yields larger gains than algorithm differences in LLM search agent training, with outcome-based rewards competitive.

citing papers explorer

Showing 3 of 3 citing papers after filters.

MacAgentBench: Benchmarking AI Agents on Real-World macOS Desktop · cs.AI · 2026-06-21 · unverdicted · none · ref 5
A History-Aware Visually Grounded Critic for Computer Use Agents · cs.AI · 2026-06-09 · unverdicted · none · ref 13
ToolCUA: Towards Optimal GUI-Tool Path Orchestration for Computer Use Agents · cs.AI · 2026-05-12 · unverdicted · none · ref 51
