RLHF should decompose annotations into dimensions each matched to one of three models—extension, evidence, or authority—instead of applying a single unified pipeline.
Title resolution pending
3 Pith papers cite this work. Polarity classification is still indexing.
3
Pith papers citing it
years
2026 3verdicts
UNVERDICTED 3representative citing papers
MultEval supports collaborative creation of LLM-as-a-judge criteria by surfacing disagreements via consensus-building methods, allowing iterative revisions with examples and history, and keeping transparent how human judgments become automated rules.
A participatory design effort at FAccT used in-person sessions and Polis polling to co-create governance input and demonstrate scalable co-design for critical AI communities.
citing papers explorer
-
Three Models of RLHF Annotation: Extension, Evidence, and Authority
RLHF should decompose annotations into dimensions each matched to one of three models—extension, evidence, or authority—instead of applying a single unified pipeline.