A Laplacian Framework for Option Discovery in Reinforcement Learning

Marc G. Bellemare; Marlos C. Machado; Michael Bowling

arxiv: 1703.00956 · v2 · pith:52JR544Rnew · submitted 2017-03-02 · 💻 cs.LG · cs.AI


Marlos C. Machado, Marc G. Bellemare, Michael Bowling


keywords: learning, discovery, eigenpurposes, option, options, different, discovered, functions


Representation learning and option discovery are two of the biggest challenges in reinforcement learning (RL). Proto-value functions (PVFs) are a well-known approach for representation learning in MDPs. In this paper we address the option discovery problem by showing how PVFs implicitly define options. We do it by introducing eigenpurposes, intrinsic reward functions derived from the learned representations. The options discovered from eigenpurposes traverse the principal directions of the state space. They are useful for multiple tasks because they are discovered without taking the environment's rewards into consideration. Moreover, different options act at different time scales, making them helpful for exploration. We demonstrate features of eigenpurposes in traditional tabular domains as well as in Atari 2600 games.
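As a rough illustration of the idea in the abstract, the sketch below builds proto-value functions for a small tabular grid world and turns one of them into an intrinsic reward. It is a minimal sketch under stated assumptions, not the authors' implementation: the helper names (`grid_adjacency`, `proto_value_functions`, `eigenpurpose_reward`) are hypothetical, the grid has no walls, and the combinatorial Laplacian L = D - A is used for simplicity. With one-hot (tabular) state features, the intrinsic reward for a transition from s to s' reduces to the difference of the eigenvector's entries at those two states.

```python
import numpy as np

def grid_adjacency(rows, cols):
    """Adjacency matrix of a 4-connected grid with no walls (illustrative assumption)."""
    n = rows * cols
    A = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            s = r * cols + c
            if r + 1 < rows:  # edge to the cell below
                t = (r + 1) * cols + c
                A[s, t] = A[t, s] = 1.0
            if c + 1 < cols:  # edge to the cell on the right
                t = r * cols + (c + 1)
                A[s, t] = A[t, s] = 1.0
    return A

def proto_value_functions(A):
    """Eigen-decomposition of the combinatorial Laplacian L = D - A.

    Returns eigenvalues in ascending order and the matching eigenvectors
    as columns; each column is one proto-value function over the states.
    """
    D = np.diag(A.sum(axis=1))
    L = D - A
    return np.linalg.eigh(L)  # L is symmetric, so eigh applies

def eigenpurpose_reward(e, s, s_next):
    """Intrinsic reward for the transition s -> s_next under eigenvector e.

    With one-hot state features this is e[s_next] - e[s]. The sign of an
    eigenvector is arbitrary, so -e defines an equally valid reward.
    """
    return e[s_next] - e[s]

# Example: a 3x5 grid; use the first non-constant eigenvector.
A = grid_adjacency(3, 5)
eigvals, eigvecs = proto_value_functions(A)
e = eigvecs[:, 1]
r = eigenpurpose_reward(e, s=0, s_next=1)
```

Because the reward is a difference of values, the rewards along any path sum to the change in `e` between its endpoints, so a policy maximizing this reward moves the agent toward states where `e` is largest. Eigenvectors with small eigenvalues vary slowly across the state space, which is one way to read the abstract's remark that different options act at different time scales.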



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives
cs.LG 2019-06 unverdicted novelty 6.0

RL policies are decomposed into information-regularized primitives that compete on how much state information each requests; the primitive requesting the most acts, which yields better generalization than flat or hierarchical baselines.