Ape210K: A large-scale and template-rich dataset of math word problems
2 Pith papers cite this work.
representative citing papers
An integrated survey organizing AI mathematical reasoning into informal, formal, discovery, and technique axes while cataloging benchmarks and assessing failure modes.
citing papers explorer
Training Verifiers to Solve Math Word Problems
Introduces GSM8K dataset and demonstrates that verifier-based selection of solutions from multiple candidates outperforms fine-tuning baselines on math word problems.