pith. machine review for the scientific record.

The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning

11 Pith papers cite this work. Polarity classification is still indexing.

hub tools

citation-role summary

method 1

citation-polarity summary

years

2026: 10 · 2025: 1

roles

method 1

polarities

use method 1

representative citing papers

Can LLMs Learn to Reason Robustly under Noisy Supervision?

cs.LG · 2026-04-05 · conditional · novelty 6.0

Online Label Refinement lets LLMs learn robust reasoning from noisy supervision by correcting labels when majority answers show rising rollout success and stable history, delivering 3-4% gains on math and reasoning benchmarks even at high noise levels.
