P1: Mastering physics olympiads with reinforcement learning

2 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: baseline (1)

citation-polarity summary

fields: cs.CL (1), cs.LG (1)

years: 2026 (2)

verdicts: unverdicted (2)

roles: baseline (1)

polarities: baseline (1)

representative citing papers

TEMPO: Scaling Test-time Training for Large Reasoning Models

cs.LG · 2026-04-21 · unverdicted · novelty 6.0

TEMPO scales test-time training for large reasoning models by interleaving policy refinement on unlabeled data with critic recalibration on labeled data via an EM formulation, yielding large gains on AIME tasks.
