pith. sign in

Example GSM8K problems and solutions We include several examples to provide a qualitative sense of the task and learned model behavior

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

fields

cs.LG 1

years

2022 1

verdicts

UNVERDICTED 1

representative citing papers

Solving math word problems with process- and outcome-based feedback

cs.LG · 2022-11-25 · unverdicted · novelty 6.0

On GSM8K, outcome-based supervision achieves similar final-answer error rates to process-based with less labeling, but process-based or learned reward models are needed to reach 3.4% reasoning error among correct solutions.

citing papers explorer

Showing 1 of 1 citing paper.

  • Solving math word problems with process- and outcome-based feedback cs.LG · 2022-11-25 · unverdicted · none · ref 43

    On GSM8K, outcome-based supervision achieves similar final-answer error rates to process-based with less labeling, but process-based or learned reward models are needed to reach 3.4% reasoning error among correct solutions.