pith. machine review for the scientific record.

hub

The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning

10 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

method 1

citation-polarity summary

years

2026: 9 · 2025: 1

roles

method 1

polarities

use method 1

representative citing papers

Can LLMs Learn to Reason Robustly under Noisy Supervision?

cs.LG · 2026-04-05 · conditional · novelty 6.0

Online Label Refinement lets LLMs learn robust reasoning from noisy supervision by correcting labels when majority answers show rising rollout success and stable history, delivering 3-4% gains on math and reasoning benchmarks even at high noise levels.
