RLHF from Heterogeneous Feedback via Personalization and Preference Aggregation

Chanwoo Park, Mingyang Liu, Dingwen Kong, Kaiqing Zhang, Asuman Ozdaglar · 2024 · arXiv 2405.00254

6 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

CoPersona: Collaborative Persona Graphs for Robust LLM Personalization

cs.IR · 2026-07-01 · unverdicted · novelty 6.0

CoPersona introduces a multiplex persona graph for facet-level peer alignment and a dual-branch retrieval-plus-reasoning architecture to improve LLM personalization under sparse and biased user interaction data.

Hidden Consensus: Preference-Validity Compression in Human Feedback

cs.CL · 2026-06-09 · unverdicted · novelty 6.0

Empirical study of Malaysian preference judgments finds that 79% of prompts have multiple majority-supported responses discarded by single-winner aggregation, indicating measurement of argmax rather than plural alignment.

Inverse Reinforcement Learning without an Optimal Demonstrator: A Feasible Reward Set Approach

cs.LG · 2026-05-29 · unverdicted · novelty 6.0

Introduces a feasible-reward-set approach to IRL with multiple heterogeneous suboptimal demonstrators, proving monotonic shrinkage of the joint set and two recovery guarantees for the ground-truth optimal reward.

Curated Synthetic Data Doesn't Have to Collapse: A Theoretical Study of Generative Retraining with Pluralistic Preferences

cs.LG · 2026-05-08 · unverdicted · novelty 6.0 · 2 refs

Recursive generative retraining with heterogeneous rewards converges to a stable distribution satisfying a weighted Nash bargaining solution, preserving diversity under stated conditions.

PAFO: Pareto Fairness Optimization for Personalized Reward Modeling

cs.AI · 2026-06-06 · unverdicted · novelty 5.0

PAFO applies Pareto fairness optimization and group-specialized distillation to produce a single personalized reward model that improves accuracy for both majority and minority preference groups without requiring group labels at inference.

In-Context Reward Adaptation for Robust Preference Modeling

cs.LG · 2026-05-28 · unverdicted · novelty 5.0

Transformer model with response-time auxiliary input adapts reward models to unseen human preference domains via in-context learning from demonstrations.

citing papers explorer

Showing 6 of 6 citing papers.

CoPersona: Collaborative Persona Graphs for Robust LLM Personalization · cs.IR · 2026-07-01 · unverdicted · none · ref 42
Hidden Consensus: Preference-Validity Compression in Human Feedback · cs.CL · 2026-06-09 · unverdicted · none · ref 2
Inverse Reinforcement Learning without an Optimal Demonstrator: A Feasible Reward Set Approach · cs.LG · 2026-05-29 · unverdicted · none · ref 4
Curated Synthetic Data Doesn't Have to Collapse: A Theoretical Study of Generative Retraining with Pluralistic Preferences · cs.LG · 2026-05-08 · unverdicted · none · ref 1 · 2 links
PAFO: Pareto Fairness Optimization for Personalized Reward Modeling · cs.AI · 2026-06-06 · unverdicted · none · ref 38
In-Context Reward Adaptation for Robust Preference Modeling · cs.LG · 2026-05-28 · unverdicted · none · ref 10
