Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2)

Measuring Mathematical Problem Solving With the MATH Dataset

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

MathNet: a Global Multimodal Benchmark for Mathematical Reasoning and Retrieval

cs.AI · 2026-04-20 · accept · novelty 8.0

MathNet delivers the largest multilingual Olympiad math dataset and benchmarks where models like Gemini-3.1-Pro reach 78% on solving but embedding models struggle on equivalent problem retrieval, with retrieval augmentation yielding up to 12% gains.

Validity-Calibrated Reasoning Distillation

cs.LG · 2026-04-14 · unverdicted · novelty 7.0

Validity-calibrated reasoning distillation improves transfer of reasoning skills by modulating updates based on relative local validity of next steps instead of enforcing full trajectory imitation.

Data Difficulty and the Generalization–Extrapolation Tradeoff in LLM Fine-Tuning

cs.LG · 2026-05-13 · unverdicted · novelty 5.0

Optimal data difficulty for LLM supervised fine-tuning shifts toward harder examples as data budget increases due to the generalization-extrapolation tradeoff.

Dual-Cluster Memory Agent: Resolving Multi-Paradigm Ambiguity in Optimization Problem Solving

cs.CL · 2026-04-22

citing papers explorer

MathNet: a Global Multimodal Benchmark for Mathematical Reasoning and Retrieval cs.AI · 2026-04-20 · accept · none · ref 34
Validity-Calibrated Reasoning Distillation cs.LG · 2026-04-14 · unverdicted · none · ref 14
Data Difficulty and the Generalization–Extrapolation Tradeoff in LLM Fine-Tuning cs.LG · 2026-05-13 · unverdicted · none · ref 32
Dual-Cluster Memory Agent: Resolving Multi-Paradigm Ambiguity in Optimization Problem Solving cs.CL · 2026-04-22 · unreviewed · ref 60
