Semantic world models

Jacob Berg, Chuning Zhu, Yanda Bao, Ishan Durugkar, Abhishek Gupta · 2025 · arXiv 2510.19818

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

read on arXiv browse 4 citing papers

citation-role summary

background 1 other 1

citation-polarity summary

background 1 unclear 1

representative citing papers

Learning Visual Feature-Based World Models via Residual Latent Action

cs.CV · 2026-05-08 · unverdicted · novelty 7.0

RLA-WM predicts residual latent actions via flow matching to create visual feature world models that outperform prior feature-based and diffusion approaches while enabling offline video-based robot RL.

Mask World Model: Predicting What Matters for Robust Robot Policy Learning

cs.RO · 2026-04-21 · unverdicted · novelty 7.0

Mask World Model predicts semantic mask dynamics with video diffusion and integrates it with a diffusion policy head, outperforming RGB world models on LIBERO and RLBench while showing better real-world generalization and texture robustness.

LASAR: Towards Spatio-temporal Reasoning with Latent Cognitive Map

cs.CV · 2026-05-16 · unverdicted · novelty 4.0

LASAR pairs a dual-memory system with spatio-temporal contrastive learning to induce latent cognitive maps, reporting 2-3.5% zero-shot gains on VLN-CE and VSI-Bench plus high map self-consistency.

World Models: A Comprehensive Survey of Architectures, Methodologies, Reasoning Paradigms, and Applications

cs.LG · 2026-05-28 · unverdicted · novelty 3.0

The paper delivers a multi-axis taxonomy for world models that maps architectures, training families, reasoning strategies, and domains from early cognitive foundations through systems such as Dreamer, MuZero, and Sora while noting evaluation gaps.

citing papers explorer

Showing 4 of 4 citing papers.

Learning Visual Feature-Based World Models via Residual Latent Action cs.CV · 2026-05-08 · unverdicted · none · ref 59
RLA-WM predicts residual latent actions via flow matching to create visual feature world models that outperform prior feature-based and diffusion approaches while enabling offline video-based robot RL.
Mask World Model: Predicting What Matters for Robust Robot Policy Learning cs.RO · 2026-04-21 · unverdicted · none · ref 2
Mask World Model predicts semantic mask dynamics with video diffusion and integrates it with a diffusion policy head, outperforming RGB world models on LIBERO and RLBench while showing better real-world generalization and texture robustness.
LASAR: Towards Spatio-temporal Reasoning with Latent Cognitive Map cs.CV · 2026-05-16 · unverdicted · none · ref 3
LASAR pairs a dual-memory system with spatio-temporal contrastive learning to induce latent cognitive maps, reporting 2-3.5% zero-shot gains on VLN-CE and VSI-Bench plus high map self-consistency.
World Models: A Comprehensive Survey of Architectures, Methodologies, Reasoning Paradigms, and Applications cs.LG · 2026-05-28 · unverdicted · none · ref 229
The paper delivers a multi-axis taxonomy for world models that maps architectures, training families, reasoning strategies, and domains from early cognitive foundations through systems such as Dreamer, MuZero, and Sora while noting evaluation gaps.

Semantic world models

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer