Learning to reason with curriculum I: Provable benefits of autocurriculum. arXiv preprint arXiv:2603.18325.

Nived Rajaraman, Audrey Huang, Miro Dudik, Robert Schapire, Dylan J Foster, Akshay Krishnamurthy · arXiv 2603.18325

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

cs.LG · 2026-05-23 · unverdicted · novelty 7.0

CurveRL derives a quantile-coordinate reweighting rule from a utility functional on pass rates and shows it outperforms GRPO on reasoning benchmarks.


CurveRL: Principled Distribution-Aware Context Reweighting for LLM Reasoning · cs.LG · 2026-05-23 · unverdicted · none · ref 24