arxiv: 2604.26326 · 2 revisions
Addressing Performance Saturation for LLM RL via Precise Entropy Curve Control