CLAP reduces planning error on challenging driving scenarios by 24% on NAVSIM using contrastive latent-space prompt optimization on frozen VLA models with no regression on normal frames.
Unidrivevla: Unifying understanding, perception, and action planning for autonomous driving
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.CV 3years
2026 3verdicts
UNVERDICTED 3roles
background 1polarities
background 1representative citing papers
DAWN couples a world predictor with a world-conditioned action denoiser in latent space so that each refines the other recursively, yielding strong planning and safety results on autonomous driving benchmarks.
CoPhy is a new RL framework that distills VLM cognition into BEV encoders, adds an auto-regressive BEV world model for action-conditioned future prediction, and optimizes policies via GRPO with dual physical-cognitive rewards, claiming SOTA on NAVSIM v1/v2.
citing papers explorer
-
CLAP: Contrastive Latent-space Prompt Optimization for End-to-end Autonomous Driving
CLAP reduces planning error on challenging driving scenarios by 24% on NAVSIM using contrastive latent-space prompt optimization on frozen VLA models with no regression on normal frames.
-
The DAWN of World-Action Interactive Models
DAWN couples a world predictor with a world-conditioned action denoiser in latent space so that each refines the other recursively, yielding strong planning and safety results on autonomous driving benchmarks.
-
Distill to Think, Foresee to Act: Cognitive-Physical Reinforcement Learning for Autonomous Driving
CoPhy is a new RL framework that distills VLM cognition into BEV encoders, adds an auto-regressive BEV world model for action-conditioned future prediction, and optimizes policies via GRPO with dual physical-cognitive rewards, claiming SOTA on NAVSIM v1/v2.