pith. the verified trust layer for science.
Pith Number

pith:ZLDIFVCE

pith:2025:ZLDIFVCE5L7K3T5DLSWSPHGKPY
not attested · not anchored · not stored · refs pending

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Fan Yang, Li Lyna Zhang, Mao Yang, Ning Shang, Xinyu Guan, Yifei Liu, Yi Zhu, Youran Sun

Small language models reach expert math reasoning by evolving their own search and evaluation processes over repeated rounds.

arxiv:2501.04519 v1 · 2025-01-08 · cs.CL

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

Through 4 rounds of self-evolution with millions of synthesized solutions for 747k math problems, rStar-Math boosts SLMs' math reasoning to state-of-the-art levels. On the MATH benchmark, it improves Qwen2.5-Math-7B from 58.8% to 90.0% and Phi3-mini-3.8B from 41.4% to 86.4%, surpassing o1-preview by +4.5% and +0.9%, respectively.

C2 weakest assumption

The process preference model trained on self-generated trajectories provides unbiased, accurate step-level guidance during MCTS search and does not overfit to patterns in the synthesized data or the specific benchmarks used for evaluation.

C3 one-line summary

Small LLMs reach 90% on the MATH benchmark and solve 53% of AIME problems by self-evolving through MCTS with a process preference model, surpassing o1-preview without distillation from larger models.

Formal links

2 machine-checked theorem links

Cited by

22 papers in Pith

Receipt and verification
First computed: 2026-05-17T23:38:15.094996Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05) · public key
Schema: pith-number/v1.0

Canonical hash

cac682d444eafeadcfa35cad279cca7e0fd9d97d688f4665766cae59e4018d90

Aliases

arxiv: 2501.04519 · arxiv_version: 2501.04519v1 · doi: 10.48550/arxiv.2501.04519 · pith_short_12: ZLDIFVCE5L7K · pith_short_16: ZLDIFVCE5L7K3T5D · pith_short_8: ZLDIFVCE
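The aliases listed here appear to be mechanically related to the canonical hash: each `pith_short_N` value is the first N characters of the full identifier, and the full identifier matches the first 26 characters of the RFC 4648 Base32 encoding of the hash bytes. The sketch below checks that relationship for this record. It is an observation drawn from the values on this page, not a documented derivation rule, and the function names `pith_number_from_hash` and `short_aliases` are hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "cac682d444eafeadcfa35cad279cca7e0fd9d97d688f4665766cae59e4018d90"
PITH_NUMBER = "ZLDIFVCE5L7K3T5DLSWSPHGKPY"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Assumed rule: Base32-encode the raw digest bytes and keep a prefix."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_aliases(pith_number: str, lengths=(8, 12, 16)) -> dict:
    """Assumed rule: each short alias is a plain prefix of the full identifier."""
    return {f"pith_short_{n}": pith_number[:n] for n in lengths}


derived = pith_number_from_hash(CANONICAL_HASH)
print(derived == PITH_NUMBER)   # compares the derived prefix with the listed identifier
print(short_aliases(derived))
```

If the assumed rule holds for other records, a client could recover the identifier from the hash alone, but that would need to be confirmed against the schema rather than inferred from a single record.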
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/ZLDIFVCE5L7K3T5DLSWSPHGKPY \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: cac682d444eafeadcfa35cad279cca7e0fd9d97d688f4665766cae59e4018d90
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "82ab060f96a357f36dba7338071dab1efb58065f8b576e613279dabaa82e228f",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by-nc-sa/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2025-01-08T14:12:57Z",
    "title_canon_sha256": "2fbcdee4b0e850ff9f844008d3b960df9e2f2ad5a5c20fa8623f24b32d173b53"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2501.04519",
    "kind": "arxiv",
    "version": 1
  }
}
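The verification command above canonicalizes the record by serializing it with sorted keys, compact separators, and unescaped non-ASCII characters, then hashing the UTF-8 bytes with SHA-256. A minimal offline sketch of the same steps, using the record shown above instead of fetching it, might look like this. The helper names `canonical_bytes` and `canonical_sha256` are illustrative, and the sketch assumes the JSON printed on this page is exactly what the API returns in `canonical_record`.

```python
import hashlib
import json

# Canonical record as printed on this page (assumed identical to the API response).
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "82ab060f96a357f36dba7338071dab1efb58065f8b576e613279dabaa82e228f",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by-nc-sa/4.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2025-01-08T14:12:57Z",
        "title_canon_sha256": "2fbcdee4b0e850ff9f844008d3b960df9e2f2ad5a5c20fa8623f24b32d173b53",
    },
    "schema_version": "1.0",
    "source": {"id": "2501.04519", "kind": "arxiv", "version": 1},
}

# Digest the page says to expect.
EXPECTED = "cac682d444eafeadcfa35cad279cca7e0fd9d97d688f4665766cae59e4018d90"


def canonical_bytes(record: dict) -> bytes:
    """Serialize with the same json.dumps options as the one-liner above."""
    text = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def canonical_sha256(record: dict) -> str:
    """Hex SHA-256 of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


digest = canonical_sha256(RECORD)
print(digest)
print("matches expected:", digest == EXPECTED)
```

Because keys are sorted before hashing, the digest does not depend on the order in which a mirror stores or emits the fields, so independent hosts that serialize the same record this way should compute the same value.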