pith. sign in

hub Canonical reference

Reinforcement Learning on Web Interfaces Using Workflow-Guided Exploration

Canonical reference. 71% of citing Pith papers cite this work as background.

17 Pith papers citing it
Background 71% of classified citations
abstract

Reinforcement learning (RL) agents improve through trial-and-error, but when reward is sparse and the agent cannot discover successful action sequences, learning stagnates. This has been a notable problem in training deep RL agents to perform web-based tasks, such as booking flights or replying to emails, where a single mistake can ruin the entire sequence of actions. A common remedy is to "warm-start" the agent by pre-training it to mimic expert demonstrations, but this is prone to overfitting. Instead, we propose to constrain exploration using demonstrations. From each demonstration, we induce high-level "workflows" which constrain the allowable actions at each time step to be similar to those in the demonstration (e.g., "Step 1: click on a textbox; Step 2: enter some text"). Our exploration policy then learns to identify successful workflows and samples actions that satisfy these workflows. Workflows prune out bad exploration directions and accelerate the agent's ability to discover rewards. We use our approach to train a novel neural policy designed to handle the semi-structured nature of websites, and evaluate on a suite of web tasks, including the recent World of Bits benchmark. We achieve new state-of-the-art results, and show that workflow-guided exploration improves sample efficiency over behavioral cloning by more than 100x.

hub tools

citation-role summary

background 6 dataset 1

citation-polarity summary

representative citing papers

MolmoWeb: Open Visual Web Agent and Open Data for the Open Web

cs.CV · 2026-04-09 · unverdicted · novelty 7.0

Open 4B and 8B visual web agents achieve state-of-the-art results on browser benchmarks by predicting actions from screenshots and instructions, outperforming similar open models and some closed larger-model agents, with full release of data and code planned.

Real-Time Procedural Learning From Experience for AI Agents

cs.AI · 2025-11-27 · unverdicted · novelty 6.0

PRAXIS enables AI agents to acquire procedural knowledge in real time by indexing and retrieving state-action-result experiences, leading to better accuracy, reliability, and efficiency on web browsing benchmarks.

SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents

cs.HC · 2024-01-17 · unverdicted · novelty 6.0

SeeClick improves visual GUI agents via GUI grounding pre-training on automatically curated data and introduces the ScreenSpot benchmark, with results indicating that stronger grounding boosts downstream task performance.

Gym-Anything: Turn any Software into an Agent Environment

cs.LG · 2026-04-07 · unverdicted · novelty 6.0

Gym-Anything turns arbitrary software into agent environments via multi-agent setup and auditing, creating CUA-World with 10K+ long-horizon tasks and showing that trajectory distillation plus test-time auditing improves small VLMs.

citing papers explorer

Showing 17 of 17 citing papers.