Temporal difference learning for model predictive control.arXiv preprint arXiv:2203.04955

Nicklas Hansen, Xiaolong Wang, Hao Su · 2022 · arXiv 2203.04955

6 Pith papers cite this work. Polarity classification is still indexing.

6 Pith papers citing it

representative citing papers

JEDI: Joint Embedding Diffusion World Model for Online Model-Based Reinforcement Learning

cs.LG · 2026-05-13 · unverdicted · novelty 7.0

JEDI is the first online end-to-end latent diffusion world model that trains latents from denoising loss rather than reconstruction, achieving competitive Atari100k results with 43% less VRAM and over 3x faster sampling than pixel diffusion baselines.

Learning Visual Feature-Based World Models via Residual Latent Action

cs.CV · 2026-05-08 · unverdicted · novelty 7.0

RLA-WM predicts residual latent actions via flow matching to create visual feature world models that outperform prior feature-based and diffusion approaches while enabling offline video-based robot RL.

TRAP: Tail-aware Ranking Attack for World-Model Planning

cs.LG · 2026-05-03 · unverdicted · novelty 6.0

TRAP is a tail-aware ranking attack that plants a backdoor in world models so that a trigger causes the model to reorder a few critical imagined trajectories and redirect planning while preserving normal behavior on clean inputs.

RAY-TOLD: Ray-Based Latent Dynamics for Dense Dynamic Obstacle Avoidance with TDMPC

cs.RO · 2026-04-30 · unverdicted · novelty 6.0

RAY-TOLD combines ray-based latent dynamics from LiDAR with MPPI control and a learned policy prior via mixture sampling to lower collision rates in high-density dynamic obstacle environments compared to standard MPPI.

Cosmos Policy: Fine-Tuning Video Models for Visuomotor Control and Planning

cs.AI · 2026-01-22 · conditional · novelty 6.0

Single-stage fine-tuning of a video model to generate actions as latent frames plus future states and values yields state-of-the-art robot policy performance on LIBERO, RoboCasa, and bimanual tasks.

V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning

cs.AI · 2025-06-11 · unverdicted · novelty 6.0

V-JEPA 2 pre-trained on massive unlabeled video achieves strong results on motion understanding and action anticipation, SOTA video QA at 8B scale, and enables zero-shot robotic planning on Franka arms using only 62 hours of unlabeled robot video.

citing papers explorer

Showing 1 of 1 citing paper after filters.

Learning Visual Feature-Based World Models via Residual Latent Action cs.CV · 2026-05-08 · unverdicted · none · ref 60
RLA-WM predicts residual latent actions via flow matching to create visual feature world models that outperform prior feature-based and diffusion approaches while enabling offline video-based robot RL.

Temporal difference learning for model predictive control.arXiv preprint arXiv:2203.04955

fields

years

verdicts

representative citing papers

citing papers explorer