
Canonical reference

OmniNWM: Omniscient Driving Navigation World Models

Canonical reference. 100% of citing Pith papers cite this work as background.

9 Pith papers citing it
Background 100% of classified citations
abstract

Autonomous driving world models are expected to work effectively across three core dimensions: state, action, and reward. However, existing methods are typically restricted to fragmented modality modeling, short-horizon drift, and imprecise action control, while lacking intrinsic mechanisms for policy evaluation. In this paper, we introduce OmniNWM, an Omniscient panoramic Navigation World Model that addresses all three dimensions within a consistent probabilistic framework. For State, OmniNWM generates panoramic videos of RGB, semantics, metric depth, and 3D occupancy, ensuring pixel-level alignment across modalities with joint distribution modeling. To mitigate autoregressive exposure bias, we propose a structured panoramic forcing strategy to stabilize long-horizon generation via stochastic manifold thickening. For Action, we introduce canonical geometric action encoding with normalized panoramic Plücker ray-maps. This representation decouples motion dynamics from sensor intrinsics, enabling precise, zero-shot trajectory control across heterogeneous datasets and camera configurations. For Reward, we derive intrinsic occupancy-grounded dense rewards directly from generated 3D volumes, establishing a reliable closed-loop simulation cycle for evaluating diverse planning agents. Extensive experiments demonstrate that OmniNWM achieves SOTA performance in generation fidelity and control precision, with remarkable zero-shot robustness to novel scenes on NuPlan and in-house datasets with distinct camera rigs. Project page is available at https://arlo0o.github.io/OmniNWM/.
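For readers unfamiliar with the Plücker ray-maps named in the abstract, the following is a minimal sketch of the generic construction for a single pinhole camera: each pixel is mapped to a six-value ray (unit direction d, moment m = o × d). It is not OmniNWM's implementation. The function name `plucker_ray_map`, its parameters, the pixel-centre convention, and the (d, m) ordering are assumptions made for illustration, and the paper's normalization and panoramic multi-camera layout are not reproduced here.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def mat_vec(R, v):
    """Multiply a 3x3 matrix (nested tuples) by a 3-vector."""
    return tuple(sum(R[i][j] * v[j] for j in range(3)) for i in range(3))


def plucker_ray_map(width, height, fx, fy, cx, cy, R, origin):
    """Return a height x width grid of 6-tuples (dx, dy, dz, mx, my, mz).

    Assumptions (hypothetical, not from the paper): R is the
    camera-to-world rotation, origin is the camera centre in world
    coordinates, and rays pass through pixel centres (u + 0.5, v + 0.5).
    """
    rows = []
    for v in range(height):
        row = []
        for u in range(width):
            # Back-project the pixel centre through the pinhole intrinsics.
            d_cam = ((u + 0.5 - cx) / fx, (v + 0.5 - cy) / fy, 1.0)
            # Rotate into the world frame and normalize to unit length.
            d = mat_vec(R, d_cam)
            norm = math.sqrt(sum(c * c for c in d))
            d = tuple(c / norm for c in d)
            # The moment encodes the ray's offset from the world origin.
            m = cross(origin, d)
            row.append(d + m)
        rows.append(row)
    return rows


IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
ray_map = plucker_ray_map(4, 2, fx=2.0, fy=2.0, cx=2.0, cy=1.0,
                          R=IDENTITY, origin=(1.0, 0.0, 0.0))
```

Because the direction is computed from the intrinsics and then expressed in a shared world frame, two cameras with different focal lengths that look along the same ray yield the same six values, which is the general sense in which a ray-based encoding is independent of sensor intrinsics.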

citation-role summary: background 5

citation-polarity summary: background 5

fields: cs.CV 9

years: 2026: 8, 2025: 1

representative citing papers

PanoWorld: Geometry-Consistent Panoramic Video World Modeling

cs.CV · 2026-05-14 · unverdicted · novelty 6.0

PanoWorld adds depth-consistency and trajectory-consistency losses and spherical adaptations to a pre-trained video model, and introduces a new PanoGeo dataset, to produce geometry-consistent 360° video.

ReWorld: Learning Better Representations for World Action Models

cs.CV · 2026-06-25 · unverdicted · novelty 5.0

ReWorld applies future-predictive, cross-modal, and hard-negative supervision directly to intermediate representations in the Video and Action DiTs of World Action Models (WAMs), reporting a 23.9% FVD improvement and a PDMS rise from 89.1 to 90.4 on nuScenes and NAVSIM.
