Efficient Decentralized Learning of Generalized Quantal Response Equilibrium
We study a solution concept for boundedly rational agents in finite normal-form general-sum games, called Generalized Quantal Response Equilibrium (GQRE), which generalizes Quantal Response Equilibrium~\citep{mckelvey1995quantal}. In our setup, each player individually maximizes a smooth, regularized expected utility of the mixed profiles used, reflecting both bounded rationality that subsumes stochastic choice and individual choice of behaviors. After establishing existence under mild conditions, we present a computationally efficient, no-regret, decentralized learning algorithm based on a smoothed version of the Frank--Wolfe algorithm. Our algorithm uses noisy gradient estimates obtained via bandit feedback from a simulation oracle that reports on repeated plays of the game. We analyze the finite-time convergence properties of our algorithm under assumptions that ensure uniqueness of the equilibrium, using a novel class of gap functions that generalize the Nash gap. We end by demonstrating the effectiveness of our method on a set of complex general-sum games, including high-rank two-player games, large-action two-player games, and known examples of difficult multi-player games.
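To make the idea of a regularized response concrete, the following is a minimal sketch, not the algorithm of the paper. It assumes the classical logit special case (entropy regularization with temperature `tau`), exact expected payoffs instead of bandit feedback, and a Frank--Wolfe-style averaging step with step size 2/(t+2). All function and parameter names are hypothetical.

```python
import math

def softmax(values, tau):
    """Logit response: maximizer of expected payoff plus tau * entropy."""
    peak = max(values)  # subtract the max for numerical stability
    weights = [math.exp((v - peak) / tau) for v in values]
    total = sum(weights)
    return [w / total for w in weights]

def row_payoffs(A, y):
    """Expected payoff of each row action against column mixture y."""
    return [sum(a_ij * y_j for a_ij, y_j in zip(row, y)) for row in A]

def col_payoffs(B, x):
    """Expected payoff of each column action against row mixture x."""
    return [sum(B[i][j] * x[i] for i in range(len(x))) for j in range(len(B[0]))]

def logit_qre(A, B, x0, y0, tau=1.0, iters=2000):
    """Averaged smooth-response iteration (assumed illustration only).

    Each player moves a fraction gamma_t = 2 / (t + 2) of the way from the
    current mixed strategy toward its logit response to the opponent.
    """
    x, y = list(x0), list(y0)
    for t in range(1, iters + 1):
        gamma = 2.0 / (t + 2)
        # Both responses are computed from the current profile (simultaneous update).
        bx = softmax(row_payoffs(A, y), tau)
        by = softmax(col_payoffs(B, x), tau)
        x = [(1 - gamma) * xi + gamma * bi for xi, bi in zip(x, bx)]
        y = [(1 - gamma) * yi + gamma * bi for yi, bi in zip(y, by)]
    return x, y

# Matching pennies: a zero-sum game whose logit equilibrium is uniform play.
A = [[1.0, -1.0], [-1.0, 1.0]]
B = [[-1.0, 1.0], [1.0, -1.0]]
x, y = logit_qre(A, B, x0=[0.9, 0.1], y0=[0.2, 0.8])
```

In this sketch an equilibrium is a profile where each strategy equals its own logit response to the other; the decentralized, bandit-feedback setting described in the abstract would replace the exact payoff vectors with noisy estimates.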