Do not reference the HLE dataset, the specific academic subfield, or any specialized terminology that would require domain expertise to understand

Abstract, Generalize: Step back from the specific technical framing of the HLE question

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

SCRuB: Social Concept Reasoning under Rubric-Based Evaluation

cs.AI · 2026-05-07 · unverdicted · novelty 6.0

SCRuB shows frontier LLMs outperform human experts across a five-dimensional critical thinking rubric for social concept reasoning, winning 80.8% of 1,170 pairwise expert judgments and indicating single-turn evaluation saturation.

citing papers explorer

SCRuB: Social Concept Reasoning under Rubric-Based Evaluation cs.AI · 2026-05-07 · unverdicted · none · ref 11
