The Journal of Machine Learning Research
2 Pith papers cite this work.
Representative citing papers (2, both from 2026):
- Personalizing LLMs with Binary Feedback: A Preference-Corrected Optimization Framework
  C-BPO personalizes LLMs using preference-calibrated binary signals and positive-unlabeled (PU) learning theory to isolate inter-user differences from shared task knowledge.
- Towards a holistic understanding of Selection Bias for Causal Effect Identification