RLHF from Heterogeneous Feedback via Personalization and Preference Aggregation
6 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (6)
Verdicts: UNVERDICTED (6)
Citing papers
- CoPersona: Collaborative Persona Graphs for Robust LLM Personalization
  CoPersona introduces a multiplex persona graph for facet-level peer alignment and a dual-branch retrieval-plus-reasoning architecture to improve LLM personalization under sparse and biased user interaction data.
- Hidden Consensus: Preference-Validity Compression in Human Feedback
  Empirical study of Malaysian preference judgments finds that 79% of prompts have multiple majority-supported responses discarded by single-winner aggregation, indicating measurement of argmax rather than plural alignment.
- Inverse Reinforcement Learning without an Optimal Demonstrator: A Feasible Reward Set Approach
  Introduces a feasible-reward-set approach to IRL with multiple heterogeneous suboptimal demonstrators, proving monotonic shrinkage of the joint set and two recovery guarantees for the ground-truth optimal reward.
- Curated Synthetic Data Doesn't Have to Collapse: A Theoretical Study of Generative Retraining with Pluralistic Preferences
  Recursive generative retraining with heterogeneous rewards converges to a stable distribution satisfying a weighted Nash bargaining solution, preserving diversity under stated conditions.
- PAFO: Pareto Fairness Optimization for Personalized Reward Modeling
  PAFO applies Pareto fairness optimization and group-specialized distillation to produce a single personalized reward model that improves accuracy for both majority and minority preference groups without requiring group labels at inference.
- In-Context Reward Adaptation for Robust Preference Modeling
  Transformer model with response-time auxiliary input adapts reward models to unseen human preference domains via in-context learning from demonstrations.