Stable World Model: A data-driven benchmark and model for offline goal-conditioned reinforcement learning

Lucas Maes, Quentin Le Lidec, Dan Haramati, Nassim Massaudi, Damien Scieur, Yann LeCun, Randall Balestriero · 2026 · arXiv 2602.08968

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Unifying Object-Centric World Models and Diffusion Policy: A Hierarchical Framework for Multi-Stage Robotic Tasks

cs.RO · 2026-06-07 · unverdicted · novelty 5.0

WorldDP combines a high-level object-centric world model for subgoal planning with a low-level diffusion policy for execution, claiming better performance than baselines on multi-stage robotic manipulation benchmarks.

Beyond Euclidean Proximity: Repairing Latent World Models with Horizon-Matched Trajectory Reachability Metrics

cs.LG · 2026-05-21 · unverdicted · novelty 5.0

TRM trains a small horizon-matched pairwise head on trajectory data to improve terminal-state ranking in latent MPC, raising success from 7% to 97% on TwoRoom and 32.7% to 84% on PLDM without changing the encoder or dynamics.

Hierarchical Planning with Latent World Models

cs.LG · 2026-04-03

LeWorldModel: Stable End-to-End Joint-Embedding Predictive Architecture from Pixels

cs.LG · 2026-03-13

citing papers explorer

Showing 4 of 4 citing papers.

Unifying Object-Centric World Models and Diffusion Policy: A Hierarchical Framework for Multi-Stage Robotic Tasks · cs.RO · 2026-06-07 · unverdicted · none · ref 19
Beyond Euclidean Proximity: Repairing Latent World Models with Horizon-Matched Trajectory Reachability Metrics · cs.LG · 2026-05-21 · unverdicted · none · ref 10
Hierarchical Planning with Latent World Models · cs.LG · 2026-04-03 · unreviewed · ref 31
LeWorldModel: Stable End-to-End Joint-Embedding Predictive Architecture from Pixels · cs.LG · 2026-03-13 · unreviewed · ref 52
