The Potential of the Return Distribution for Exploration in RL

Catholijn M. Jonker; Joost Broekens; Thomas M. Moerland

arXiv: 1806.04242 · v2 · pith:HIWAPEFWnew · submitted 2018-06-11 · cs.LG · cs.AI · stat.ML

keywords: distribution, exploration, return, gaussian, learning, potential, been, before

This paper studies the potential of the return distribution for exploration in deterministic reinforcement learning (RL) environments. We study network losses and propagation mechanisms for Gaussian, Categorical and Gaussian mixture distributions. Combined with exploration policies that leverage this return distribution, we solve, for example, a randomized Chain task of length 100, which has not been reported before when learning with neural networks.
