CriticBench: Benchmarking LLMs for critique-correct reasoning

Lin, Z · 2024

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

GR-Ben: A General Reasoning Benchmark for Evaluating Process Reward Models

cs.AI · 2026-05-02 · unverdicted · novelty 6.0

GR-Ben is a new process-level benchmark that evaluates error detection by PRMs and LLMs in science and logic reasoning, showing weaker performance outside mathematics.

citing papers explorer

Showing 1 of 1 citing paper.

GR-Ben: A General Reasoning Benchmark for Evaluating Process Reward Models

cs.AI · 2026-05-02 · unverdicted · none · ref 8

GR-Ben is a new process-level benchmark that evaluates error detection by PRMs and LLMs in science and logic reasoning, showing weaker performance outside mathematics.
