Stable World Model: A data-driven benchmark and model for offline goal-conditioned reinforcement learning
4 Pith papers cite this work.
Years: 2026 (4 citing papers)
Citing papers
- Unifying Object-Centric World Models and Diffusion Policy: A Hierarchical Framework for Multi-Stage Robotic Tasks
  WorldDP combines a high-level object-centric world model for subgoal planning with a low-level diffusion policy for execution, claiming better performance than baselines on multi-stage robotic manipulation benchmarks.
- Beyond Euclidean Proximity: Repairing Latent World Models with Horizon-Matched Trajectory Reachability Metrics
  TRM trains a small horizon-matched pairwise head on trajectory data to improve terminal-state ranking in latent MPC, raising success from 7% to 97% on TwoRoom and from 32.7% to 84% on PLDM without changing the encoder or dynamics.
- Hierarchical Planning with Latent World Models
- LeWorldModel: Stable End-to-End Joint-Embedding Predictive Architecture from Pixels