State Space Decomposition and Subgoal Creation for Transfer in Deep Reinforcement Learning
Typical reinforcement learning (RL) agents learn to complete tasks specified by reward functions tailored to their domain. As a result, the policies they learn do not generalize even to similar domains. To address this issue, we develop a framework through which a deep RL agent learns to generalize policies from smaller, simpler domains to more complex ones using a recurrent attention mechanism. The task is presented to the agent as an image and an instruction specifying the goal. A meta-controller guides the agent towards its goal by designing a sequence of smaller subtasks on the part of the state space that lies within the attention, effectively decomposing the state space. As a baseline, we also consider a setup without attention. Our experiments show that the meta-controller learns to create subgoals within the attention.
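To make the idea of restricting subgoals to an attended part of the state space concrete, here is a minimal sketch. It is not the paper's method: the paper's meta-controller is learned, whereas this example uses a hand-written rule, and it assumes a 2D grid state with a square attention window. The names `attention_window` and `pick_subgoal`, the window size, and the clamping heuristic are all hypothetical and chosen only for illustration.

```python
def attention_window(grid_shape, center, size):
    """Bounds (r0, c0, r1, c1) of a size x size window around `center`,
    shifted as needed so it stays inside the grid (assumed square window)."""
    height, width = grid_shape
    half = size // 2
    r0 = min(max(center[0] - half, 0), max(height - size, 0))
    c0 = min(max(center[1] - half, 0), max(width - size, 0))
    return r0, c0, min(r0 + size, height), min(c0 + size, width)


def pick_subgoal(bounds, goal):
    """Hypothetical stand-in for a learned meta-controller: choose the cell
    inside the window that is closest to the final goal by clamping the
    goal's coordinates to the window bounds."""
    r0, c0, r1, c1 = bounds
    return (min(max(goal[0], r0), r1 - 1), min(max(goal[1], c0), c1 - 1))


# A 10 x 10 grid, agent attending around (2, 2), final goal at (9, 9).
bounds = attention_window((10, 10), center=(2, 2), size=5)
subgoal = pick_subgoal(bounds, goal=(9, 9))
```

Repeating this with the window re-centred on each reached subgoal would produce a sequence of small subtasks, each confined to the attended region, which is the kind of decomposition the abstract describes.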
Forward citations
Cited by 1 Pith paper
- Modular Reinforcement Learning For Cooperative Swarms
Modular decomposition of interaction states allows distributed RL for cooperative robot swarms to scale without combinatorial memory explosion in foraging simulations.