
hub Canonical reference

Laminar: A scalable asynchronous RL post-training framework

Canonical reference. 100% of citing Pith papers cite this work as background.

15 Pith papers citing it
Background 100% of classified citations


citation-role summary

background 5

citation-polarity summary

years

2026: 14 · 2025: 1

roles

background 5

polarities

background 5

representative citing papers

AsyncOPD: How Stale Can On-Policy Distillation Be?

cs.LG · 2026-06-23 · conditional · novelty 6.0

AsyncOPD shows asynchronous OPD training reaches 1.6-3.8x higher throughput than synchronous baselines with comparable accuracy by using forward-KL estimators and multi-sample Monte Carlo correction for finite teacher caches.

Libra: Efficient Resource Management for Agentic RL Post-Training

cs.LG · 2026-06-02 · unverdicted · novelty 4.0

Libra optimizes GPU allocation across rollout and training in agentic RL via an elastic hybrid pool and C-MLFQ scheduler based on tool-return causal signals, claiming up to 3.0x throughput and 2.5x faster reward convergence on 48 A800 GPUs.
