Better than your teacher: LLM agents that learn from privileged AI feedback

2 Pith papers cite this work. Polarity classification is still indexing.

fields: cs.AI (1), cs.LG (1)

years: 2026 (2)

verdicts: unverdicted (2)

representative citing papers

What Drives Interactive Improvement from Feedback?

cs.AI · 2026-06-29 · unverdicted · novelty 7.0

Controlled student-teacher experiments across four benchmarks show interactive gains are driven more by the student's ability to use feedback than by teacher quality, with self-feedback adding little beyond unguided retries.

citing papers explorer

  • What Drives Interactive Improvement from Feedback? cs.AI · 2026-06-29 · unverdicted · none · ref 2

    Controlled student-teacher experiments across four benchmarks show interactive gains are driven more by the student's ability to use feedback than by teacher quality, with self-feedback adding little beyond unguided retries.

  • Autoregressive Diffusion World Models for Off-Policy Evaluation of LLM Agents cs.LG · 2026-06-04 · unverdicted · none · ref 3

    ADWM learns a latent diffusion world model with per-transition independent denoising and policy-conditioned guidance to enable accurate offline evaluation of LLM agent policies.