arXiv preprint arXiv:2602.08025 (2026)

Ye, Y · 2026 · arXiv 2602.08025

9 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Dream.exe: Can Video Generation Models Dream Executable Robot Manipulation?

cs.CV · 2026-06-03 · unverdicted · novelty 7.0

Dream.exe evaluates 8 video generation models on 101 manipulation tasks by converting generated videos into executable robot trajectories in a simulator, finding measurable success rates that visual metrics do not predict.

WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation

cs.CV · 2026-05-25 · unverdicted · novelty 7.0

WBench is a benchmark with 289 test cases and 1,058 turns for evaluating interactive world models using 22 automated metrics validated against human judgments.

World Models as Group Actions

cs.CV · 2026-05-23 · unverdicted · novelty 7.0

Formalizes video world models as group actions on states and uses latent regularization with synthesized supervision to enforce consistency, introducing GAC and GAR metrics that improve structural correctness in SOTA models.

WorldMark: A Unified Benchmark Suite for Interactive Video World Models

cs.CV · 2026-04-23 · unverdicted · novelty 7.0

WorldMark is the first public benchmark that standardizes scenes, trajectories, and control interfaces across heterogeneous interactive image-to-video world models.

WorldOdysseyBench: An Open-World Benchmark for Long-Horizon Stability of Interactive World Models

cs.CV · 2026-06-30 · unverdicted · novelty 6.0 · 2 refs

WorldOdysseyBench introduces four new evaluation dimensions and metrics for interactive world models and shows that none of 10+ tested models reliably pass all of them.

Current World Models Lack a Persistent State Core

cs.CV · 2026-06-18 · unverdicted · novelty 6.0

Current world models fail to evolve internal state when unobserved and instead resume scenes at the last observed state, as diagnosed by the new WRBench benchmark across 23 models and 9,600 videos.

Geometry-Aware Implicit Memory for Video World Models

cs.CV · 2026-06-01 · unverdicted · novelty 6.0

GIM-World adds a camera-queryable geometry distillation head and pruning rule to implicit memory in video world models, claiming better long-horizon geometric consistency on the MIND benchmark than explicit and implicit baselines.

PhysRAG: Enhancing Physics-Awareness in Video Generation via Retrieval-Augmented Generation

cs.CV · 2026-06-25 · unverdicted · novelty 5.0

PhysRAG curates 7K videos from WISA-80K, builds a physical video database, and injects knowledge via learnable queries into a diffusion model to reach SOTA visual quality and physical compliance on PhyGenBench and VBench.

WorldOlympiad: Can Your World Model Survive a Triathlon?

cs.CV · 2026-06-09 · unverdicted · novelty 5.0

WorldOlympiad is a new benchmark decomposing world-model evaluation into physical, geometry, and interaction tracks using segmentation, MLLM judges, Gaussian splatting, and action prompts on diverse scenarios.

citing papers explorer

Dream.exe: Can Video Generation Models Dream Executable Robot Manipulation? · cs.CV · 2026-06-03 · unverdicted · none · ref 30
WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation · cs.CV · 2026-05-25 · unverdicted · none · ref 24
World Models as Group Actions · cs.CV · 2026-05-23 · unverdicted · none · ref 67
WorldMark: A Unified Benchmark Suite for Interactive Video World Models · cs.CV · 2026-04-23 · unverdicted · none · ref 41
WorldOdysseyBench: An Open-World Benchmark for Long-Horizon Stability of Interactive World Models · cs.CV · 2026-06-30 · unverdicted · none · ref 14 · 2 links
Current World Models Lack a Persistent State Core · cs.CV · 2026-06-18 · unverdicted · none · ref 27
Geometry-Aware Implicit Memory for Video World Models · cs.CV · 2026-06-01 · unverdicted · none · ref 63
PhysRAG: Enhancing Physics-Awareness in Video Generation via Retrieval-Augmented Generation · cs.CV · 2026-06-25 · unverdicted · none · ref 83
WorldOlympiad: Can Your World Model Survive a Triathlon? · cs.CV · 2026-06-09 · unverdicted · none · ref 47
