arxiv: 1907.00664 · v1 · submitted 2019-07-01 · 💻 cs.LG · stat.ML

Learning World Graphs to Accelerate Hierarchical Reinforcement Learning

keywords graph · pivotal · tasks · learning · states · world · accelerate · approach

In many real-world scenarios, an autonomous agent encounters various tasks within a single complex environment. We propose to build a graph abstraction over the environment structure to accelerate the learning of these tasks. Here, nodes are important points of interest (pivotal states) and edges represent feasible traversals between them. Our approach has two stages. First, we jointly train a latent pivotal state model and a curiosity-driven goal-conditioned policy in a task-agnostic manner. Second, provided with the information from the world graph, a high-level Manager quickly finds solutions to new tasks and expresses subgoals in reference to pivotal states to a low-level Worker. The Worker can then also leverage the graph to easily traverse to the pivotal states of interest, even across long distances, and explore non-locally. We perform a thorough ablation study to evaluate our approach on a suite of challenging maze tasks, demonstrating significant advantages from the proposed framework over baselines that lack world graph knowledge in terms of performance and efficiency.
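
As a rough illustration of the graph abstraction described in the abstract, the sketch below stores pivotal states as nodes and feasible traversals as edges, then uses breadth-first search to find a route between two pivotal states. It is not the authors' implementation: the `WorldGraph` class, its method names, the maze coordinates, and the assumption that traversals are bidirectional are all hypothetical choices made for this example. In the paper the pivotal states are learned by a latent model, whereas here they are hard-coded.

```python
from collections import deque


class WorldGraph:
    """Toy world graph: nodes are pivotal states, edges are feasible traversals."""

    def __init__(self):
        # Maps each pivotal state to the set of states reachable from it.
        self.edges = {}

    def add_edge(self, a, b):
        # Assumption for this sketch: every traversal works in both directions.
        self.edges.setdefault(a, set()).add(b)
        self.edges.setdefault(b, set()).add(a)

    def shortest_path(self, start, goal):
        """Breadth-first search over pivotal states; returns a list or None."""
        if start not in self.edges or goal not in self.edges:
            return None
        parent = {start: None}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if node == goal:
                # Walk the parent links back to the start, then reverse.
                path = []
                while node is not None:
                    path.append(node)
                    node = parent[node]
                return path[::-1]
            for nxt in sorted(self.edges[node]):
                if nxt not in parent:
                    parent[nxt] = node
                    queue.append(nxt)
        return None


# Hypothetical pivotal states given as (row, col) cells of a small maze.
graph = WorldGraph()
graph.add_edge((0, 0), (0, 3))
graph.add_edge((0, 3), (4, 3))
graph.add_edge((4, 3), (4, 7))
graph.add_edge((0, 0), (2, 0))

route = graph.shortest_path((0, 0), (4, 7))
print(route)
```

In a setup like the one the abstract outlines, a Worker could follow such a route one pivotal state at a time, with a goal-conditioned policy handling each hop; that policy is outside the scope of this sketch.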

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Graph World Models: Concepts, Taxonomy, and Future Directions

    cs.AI 2026-04 unverdicted novelty 7.0

    The paper unifies emerging graph-based world models under a new paradigm and proposes a taxonomy organized by spatial, physical, and logical relational inductive biases.