RLHF from Heterogeneous Feedback via Personalization and Preference Aggregation

6 Pith papers cite this work. Polarity classification is still indexing.

Years: 2026 (6)

Verdicts: unverdicted (6)

representative citing papers

Hidden Consensus: Preference-Validity Compression in Human Feedback

cs.CL · 2026-06-09 · unverdicted · novelty 6.0

An empirical study of Malaysian preference judgments finds that, for 79% of prompts, single-winner aggregation discards multiple majority-supported responses, indicating that it measures the argmax rather than plural alignment.

PAFO: Pareto Fairness Optimization for Personalized Reward Modeling

cs.AI · 2026-06-06 · unverdicted · novelty 5.0

PAFO applies Pareto fairness optimization and group-specialized distillation to produce a single personalized reward model that improves accuracy for both majority and minority preference groups without requiring group labels at inference.

citing papers explorer
