Review history

arxiv: 2604.26360 · 2 revisions

Uncertainty-Aware Reward Discounting for Mitigating Reward Hacking

  1. 2026-07-01 UNVERDICTED LOW v0.9.1-grok novelty 5.0
    41662 ms 5811 in 1269 out 2026-07-01T08:23:02.331199+00:00
  2. 2026-05-07 UNVERDICTED LOW v0.9.0 novelty 6.0
    45050 ms 5543 in 1318 out 2026-05-07T13:40:17.797121+00:00