Measuring Mathematical Problem Solving With the MATH Dataset

11 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

dataset: 3 · background: 1

representative citing papers

Emergent Slow Thinking in LLMs as Inverse Tree Freezing

cs.AI · 2025-09-28 · unverdicted · novelty 6.0

RLVR drives a concept network in LLMs through nucleation and freezing into inverse trees that support slow thinking, and intervening with brief SFT at peak frustration outperforms standard RLVR while post-freeze SFT causes forgetting.

RewardBench 2: Advancing Reward Model Evaluation

cs.CL · 2025-06-02 · unverdicted · novelty 6.0

RewardBench 2 is a new benchmark that supplies challenging fresh human prompts for reward model evaluation, yielding lower average scores but higher correlation with downstream best-of-N sampling and RLHF training performance.

ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving

cs.CL · 2023-09-29 · conditional · novelty 6.0

ToRA trains language models on interactive tool-use trajectories with imitation learning and output shaping to integrate reasoning and external tools, yielding 13-19% gains on math datasets and new highs like 44.6% on MATH for a 7B model.

citing papers explorer

Showing 2 of 2 citing papers after filters.

  • PathCal: State-Aware Reflection-Marker Calibration for Efficient Reasoning · cs.AI · 2026-05-21 · unverdicted · none · ref 23

    PathCal calibrates reasoning paths by type-aware soft rebalancing of reflection-marker logits at uncertain states, yielding better efficiency-performance trade-offs on six benchmarks.

  • Emergent Slow Thinking in LLMs as Inverse Tree Freezing · cs.AI · 2025-09-28 · unverdicted · none · ref 18

    RLVR drives a concept network in LLMs through nucleation and freezing into inverse trees that support slow thinking, and intervening with brief SFT at peak frustration outperforms standard RLVR while post-freeze SFT causes forgetting.