Title resolution pending

Imitating language via scalable inverse reinforcement learning · arXiv 2409.01369

1 Pith paper cites this work. Polarity classification is still indexing.


Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

SG-OPD: Sign-Gated On-Policy Distillation via Sign-Consistency Gating and Phased Teacher Sampling

cs.CL · 2026-06-08 · unverdicted · novelty 4.0

SG-OPD adds sign-consistency gating and phased teacher sampling to on-policy distillation, reporting average gains of 1.98 per sample and 7.50 per question over standard OPD on math benchmarks.

citing papers explorer

Showing 1 of 1 citing paper.

SG-OPD: Sign-Gated On-Policy Distillation via Sign-Consistency Gating and Phased Teacher Sampling

cs.CL · 2026-06-08 · unverdicted · none · ref 9
