arxiv: 1408.1484 · v1 · pith:DG3IK3BEnew · submitted 2014-08-07 · 💻 cs.AI

Learning to Cooperate via Policy Search

classification 💻 cs.AI
keywords cooperative, games, observable, agents, learning, method, methods, partially

Cooperative games are those in which both agents share the same payoff structure. Value-based reinforcement-learning algorithms, such as variants of Q-learning, have been applied to learning cooperative games, but they only apply when the game state is completely observable to both agents. Policy search methods are a reasonable alternative to value-based methods for partially observable environments. In this paper, we provide a gradient-based distributed policy-search method for cooperative games and compare the notion of local optimum to that of Nash equilibrium. We demonstrate the effectiveness of this method experimentally in a small, partially observable simulated soccer domain.
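To make the idea of gradient-based distributed policy search concrete, here is a minimal sketch. It is not the paper's algorithm or its soccer domain. It assumes a hypothetical one-shot 2x2 game with a shared payoff table (`PAYOFF`), independent softmax policies, and a plain REINFORCE-style update. All names and parameter values are illustrative. Each agent adjusts only its own parameters, using only its own action and the common reward.

```python
import math
import random

# Hypothetical shared payoff table: both agents receive PAYOFF[a1][a2].
# Both (0, 0) and (1, 1) are Nash equilibria of this toy game.
PAYOFF = [[1.0, 0.0],
          [0.0, 0.5]]


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Sample one of two actions from a two-element distribution."""
    return 0 if rng.random() < probs[0] else 1


def train(episodes=5000, lr=0.1, seed=0):
    """Run independent REINFORCE updates for two agents with a shared reward."""
    rng = random.Random(seed)
    theta = [[0.0, 0.0], [0.0, 0.0]]  # one preference vector per agent
    for _ in range(episodes):
        probs = [softmax(t) for t in theta]
        actions = [sample(p, rng) for p in probs]
        reward = PAYOFF[actions[0]][actions[1]]  # common payoff
        for i in range(2):
            # Agent i uses only its own policy, its own action, and the reward.
            for a in range(2):
                # Gradient of log softmax probability of the chosen action.
                grad = (1.0 if a == actions[i] else 0.0) - probs[i][a]
                theta[i][a] += lr * reward * grad
    return [softmax(t) for t in theta]


def expected_payoff(policies):
    """Expected shared payoff when both agents follow the given policies."""
    p1, p2 = policies
    return sum(p1[a] * p2[b] * PAYOFF[a][b]
               for a in range(2) for b in range(2))
```

Calling `expected_payoff(train())` gives the value of the joint policy the two learners settle on. Because each agent climbs the gradient of the same payoff with respect to its own parameters only, the pair can settle on either equilibrium of the table, including the lower-paying one, which is one way to see why a local optimum of the search and a Nash equilibrium are worth comparing.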
