Learning to reach goals via iterated supervised learning
6 Pith papers cite this work.
6 representative citing papers
-
Multi-scale Predictive Representations for Goal-conditioned Reinforcement Learning
Ms.PR applies multi-scale predictive supervision to enforce goal-directed alignment in latent spaces for offline GCRL, yielding improved representation quality and performance on vision and state-based tasks.
-
Predictive but Not Plannable: RC-aux for Latent World Models
RC-aux corrects spatiotemporal mismatch in reconstruction-free latent world models by adding multi-horizon prediction and reachability supervision, improving planning performance on goal-conditioned pixel-control tasks.
-
Refining Compositional Diffusion for Reliable Long-Horizon Planning
RCD steers compositional diffusion sampling toward high-density coherent plans by combining reconstruction-error guidance with overlap consistency, outperforming prior methods on locomotion, manipulation, and pixel-based long-horizon tasks.
-
GCImOpt: Learning efficient goal-conditioned policies by imitating optimal trajectories
GCImOpt trains compact goal-conditioned neural policies by imitating efficiently generated optimal trajectories, achieving high success rates and near-optimal performance on cart-pole, quadcopter, and robot arm tasks while running thousands of times faster than optimization solvers.
-
From Answers to Arguments: Toward Trustworthy Clinical Diagnostic Reasoning with Toulmin-Guided Curriculum Goal-Conditioned Learning
CGCL progressively trains LLMs to generate Toulmin-structured clinical diagnostic arguments across three curriculum stages, achieving accuracy and reasoning quality comparable to RL methods with improved stability and efficiency.
-
Efficient Hierarchical Implicit Flow Q-learning for Offline Goal-conditioned Reinforcement Learning
Proposes mean flow policies and a LeJEPA loss to overcome Gaussian-policy limitations and weak subgoal generation in hierarchical offline GCRL, reporting strong results on OGBench state- and pixel-based tasks.