Massively Parallel Methods for Deep Reinforcement Learning
We present the first massively distributed architecture for deep reinforcement learning. This architecture uses four main components: parallel actors that generate new behaviour; parallel learners that are trained from stored experience; a distributed neural network to represent the value function or behaviour policy; and a distributed store of experience. We used our architecture to implement the Deep Q-Network (DQN) algorithm. Our distributed algorithm was applied to 49 Atari 2600 games from the Arcade Learning Environment, using identical hyperparameters. Our performance surpassed non-distributed DQN in 41 of the 49 games and also reduced the wall-time required to achieve these results by an order of magnitude on most games.
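The actor/learner split described in the abstract can be sketched as follows. This is a single-process, illustrative stand-in, not the paper's implementation: the `ReplayStore`, `actor`, and `learner` names, the toy environment, and all sizes are assumptions for demonstration, whereas in the paper each component is distributed across many machines.

```python
import random
from collections import deque

class ReplayStore:
    """Shared experience store (a local stand-in for the paper's distributed store)."""
    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        return random.sample(self.buffer, batch_size)

def actor(env_step, policy, store, n_steps):
    """Generate behaviour and write (state, action, reward, next_state) transitions."""
    state = 0
    for _ in range(n_steps):
        action = policy(state)
        reward, next_state = env_step(state, action)
        store.add((state, action, reward, next_state))
        state = next_state

def learner(store, batch_size):
    """Sample a mini-batch of stored experience for a gradient update."""
    return store.sample(batch_size)

# Toy environment dynamics and a random policy, purely for demonstration.
random.seed(0)
store = ReplayStore(capacity=1000)
actor(lambda s, a: (random.random(), (s + a) % 10),
      lambda s: random.randrange(2), store, n_steps=100)
batch = learner(store, batch_size=32)
print(len(store.buffer), len(batch))
```

In the full architecture, many such actors and learners run concurrently, and the learners' gradients update a distributed copy of the Q-network whose parameters are periodically synced back to the actors.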
Forward citations
Cited by 1 Pith paper
- Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model
  MuZero matches or exceeds AlphaZero-level performance in Go, Chess and Shogi, and sets a new state of the art on 57 Atari games by learning a model that directly supports planning rather than reconstructing full environme...