WebGPT: Browser-assisted question-answering with human feedback

Canonical reference. 92% of citing Pith papers cite this work as background.

199 Pith papers citing it
Background 92% of classified citations
abstract

We fine-tune GPT-3 to answer long-form questions using a text-based web-browsing environment, which allows the model to search and navigate the web. By setting up the task so that it can be performed by humans, we are able to train models on the task using imitation learning, and then optimize answer quality with human feedback. To make human evaluation of factual accuracy easier, models must collect references while browsing in support of their answers. We train and evaluate our models on ELI5, a dataset of questions asked by Reddit users. Our best model is obtained by fine-tuning GPT-3 using behavior cloning, and then performing rejection sampling against a reward model trained to predict human preferences. This model's answers are preferred by humans 56% of the time to those of our human demonstrators, and 69% of the time to the highest-voted answer from Reddit.
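
The rejection-sampling step mentioned in the abstract can be illustrated with a minimal best-of-n sketch: draw several candidate answers from a policy, score each with a reward model, and keep the highest-scoring one. Everything named here is a hypothetical stand-in for illustration and not the authors' code: `best_of_n`, `toy_policy`, `toy_reward_model`, and `TOY_ANSWERS` are assumptions, the toy policy picks from a fixed list instead of browsing the web, and the toy reward simply counts reference markers rather than predicting human preferences.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical interfaces (assumptions, not from the paper):
# a policy maps (question, rng) to one sampled answer,
# a reward model maps (question, answer) to a scalar score.
Policy = Callable[[str, random.Random], str]
RewardModel = Callable[[str, str], float]


def best_of_n(
    question: str,
    policy: Policy,
    reward_model: RewardModel,
    n: int = 4,
    seed: int = 0,
) -> Tuple[str, List[str]]:
    """Sample n answers and return the top-scoring one plus all candidates."""
    if n < 1:
        raise ValueError("n must be at least 1")
    rng = random.Random(seed)
    candidates = [policy(question, rng) for _ in range(n)]
    best = max(candidates, key=lambda answer: reward_model(question, answer))
    return best, candidates


# Toy stand-ins so the sketch runs without any model (illustrative only).
TOY_ANSWERS = [
    "Because it does.",
    "Air scatters blue light more than red light [1].",
    "Air scatters short wavelengths more strongly [1][2].",
]


def toy_policy(question: str, rng: random.Random) -> str:
    # Stand-in for sampling from a behavior-cloned model.
    return rng.choice(TOY_ANSWERS)


def toy_reward_model(question: str, answer: str) -> float:
    # Stand-in for a learned preference model: counts "[" reference markers.
    return float(answer.count("["))


if __name__ == "__main__":
    best, candidates = best_of_n(
        "Why is the sky blue?", toy_policy, toy_reward_model, n=8
    )
    print(f"sampled {len(candidates)} candidates; selected: {best}")
```

In this sketch the policy is never updated; selection happens only at inference time, so the cost grows linearly with `n`.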

citation-role summary

background 49 · method 1

representative citing papers

Revisable by Design: A Theory of Streaming LLM Agent Execution

cs.LG · 2026-04-25 · unverdicted · novelty 8.0

LLM agents achieve greater flexibility during execution by classifying actions via a reversibility taxonomy and using an Earliest-Conflict Rollback algorithm that matches full-restart quality while wasting far less completed work.

Discovering Latent Knowledge in Language Models Without Supervision

cs.CL · 2022-12-07 · conditional · novelty 8.0

An unsupervised technique extracts latent yes-no knowledge from language model activations by locating a direction that satisfies logical consistency properties, outperforming zero-shot accuracy by 4% on average across models and datasets.

Innovation: An Almost Characterization of Hallucination

cs.LG · 2026-05-26 · unverdicted · novelty 7.0

Introduces the 'innovation' property of LLMs and proves it is an almost characterization of hallucination while deriving new lower bounds on hallucination rates via missing mass.

The Behavioral Credibility Trilemma: When Calibrated Autonomy Becomes Impossible

cs.LG · 2026-05-25 · unverdicted · novelty 7.0

No confidence-gated RL policy can achieve maximum helpfulness, optimal calibration, and full autonomy under rational oversight when tasks exceed the agent's competence, because non-affine autonomy incentives destroy strict properness of scoring rules and cause confidence inflation.

Towards Camera-Robust 3D Localization: Equation-Anchored Tool-Use for MLLMs

cs.CV · 2026-05-19 · unverdicted · novelty 7.0

Proposes an equation-anchored tool-use method for MLLMs that writes the pinhole back-projection equation in Chain-of-Thought and substitutes retrieved camera intrinsics and depths to achieve robustness in 3D object detection and visual grounding under rescaled intrinsics.

ClawForge: Generating Executable Interactive Benchmarks for Command-Line Agents

cs.AI · 2026-05-13 · unverdicted · novelty 7.0 · 2 refs

ClawForge is a generator framework that creates reproducible executable benchmarks for command-line agents under state conflict, with ClawForge-Bench showing frontier models reach at most 45.3% strict accuracy and that state inspection drives most performance gaps.

PolitNuggets: Benchmarking Agentic Discovery of Long-Tail Political Facts

cs.AI · 2026-05-13 · unverdicted · novelty 7.0

PolitNuggets is a multilingual benchmark showing that AI agents struggle with fine-grained accuracy and efficiency when discovering long-tail political facts for elite biographies, linking performance to short-context extraction, multilingual robustness, and tool use.

Identifying AI Web Scrapers Using Canary Tokens

cs.CR · 2026-05-13 · conditional · novelty 7.0

Unique canary tokens served to visiting scrapers can be recovered from LLM outputs to identify which scrapers feed data to which of 22 tested production LLMs.
