pith. machine review for the scientific record.

Review history

arxiv: 2604.13010 · 2 revisions

Lightning OPD: Efficient Post-Training for Large Reasoning Models with Offline On-Policy Distillation

  1. 2026-05-11 UNVERDICTED LOW v0.9.0 novelty 6.0
    49025 ms 5613 in 1476 out 2026-05-11T01:00:37.728795+00:00
  2. 2026-05-10 UNVERDICTED LOW v0.9.0 novelty 8.0
    106323 ms 5623 in 1350 out 2026-05-10T15:07:29.492147+00:00