Review history

arxiv: 2604.04894 · 2 revisions

Asymmetric Advantage Modulation Calibrates Entropy Dynamics in RLVR

  1. 2026-05-13 · CONDITIONAL · LOW · v0.9.0 · novelty 6.0
    28840 ms · 5567 in · 1249 out · 2026-05-13T07:56:16.131085+00:00
  2. 2026-05-10 · UNVERDICTED · LOW · v0.9.0 · novelty 7.0
    47962 ms · 5557 in · 1180 out · 2026-05-10T19:56:43.309907+00:00