CAGE uses common-agency games and an EPEC algorithm to compute equilibrium policies that balance multiple conflicting objectives for test-time LLM alignment.
Transactions on Machine Learning Research
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
MI-EPO maximizes joint conditional mutual information among responses, feedback, and preference vectors, using probabilistic routing to improve alignment and controllability in multi-objective LLM optimization.
citing papers explorer
- Common-agency Games for Multi-Objective Test-Time Alignment
  CAGE uses common-agency games and an EPEC algorithm to compute equilibrium policies that balance multiple conflicting objectives for test-time LLM alignment.
- Multi-Objective Exploration and Preference Optimization via Mutual Information
  MI-EPO maximizes joint conditional mutual information among responses, feedback, and preference vectors, using probabilistic routing to improve alignment and controllability in multi-objective LLM optimization.