{"work":{"id":"eed9adfb-3fa1-4c0c-abb9-e31b79bba031","openalex_id":null,"doi":null,"arxiv_id":"2511.00088","raw_key":null,"title":"Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail","authors":null,"authors_text":"NVIDIA: Yan Wang, Wenjie Luo, Junjie Bai, Yulong Cao, Tong Che, Ke Chen","year":2025,"venue":"cs.RO","abstract":"End-to-end architectures trained via imitation learning have advanced autonomous driving by scaling model size and data, yet performance remains brittle in safety-critical long-tail scenarios where supervision is sparse and causal understanding is limited. We introduce Alpamayo-R1 (AR1), a vision-language-action model (VLA) that integrates Chain of Causation reasoning with trajectory planning for complex driving scenarios. Our approach features three key innovations: (1) the Chain of Causation (CoC) dataset, built through a hybrid auto-labeling and human-in-the-loop pipeline producing decision-grounded, causally linked reasoning traces aligned with driving behaviors; (2) a modular VLA architecture combining Cosmos-Reason, a vision-language model pre-trained for Physical AI, with a diffusion-based trajectory decoder that generates dynamically feasible trajectories in real time; (3) a multi-stage training strategy using supervised fine-tuning to elicit reasoning and reinforcement learning (RL) to enforce reasoning-action consistency and optimize reasoning quality. AR1 achieves up to a 12% improvement in planning accuracy on challenging cases compared to a trajectory-only baseline, with a 35% reduction in close encounter rate in closed-loop simulation. RL post-training improves reasoning quality by 45% and reasoning-action consistency by 37%. Model scaling from 0.5B to 7B parameters shows consistent improvements. On-vehicle road tests confirm real-time performance (99 ms latency) and successful urban deployment. By bridging interpretable reasoning with precise control, AR1 demonstrates a practical path towards Level 4 autonomous driving. Model weights are available at https://huggingface.co/nvidia/Alpamayo-R1-10B with inference code at https://github.com/NVlabs/alpamayo.","external_url":"https://arxiv.org/abs/2511.00088","cited_by_count":null,"metadata_source":"pith","metadata_fetched_at":"2026-05-25T05:55:24.785600+00:00","pith_arxiv_id":"2511.00088","created_at":"2026-05-10T06:11:20.708338+00:00","updated_at":"2026-06-05T21:23:00.469572+00:00","title_quality_ok":true,"display_title":"Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail","render_title":"Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail"},"hub":{"state":{"work_id":"eed9adfb-3fa1-4c0c-abb9-e31b79bba031","tier":"hub","tier_reason":"10+ Pith inbound or 1,000+ external citations","pith_inbound_count":26,"external_cited_by_count":null,"distinct_field_count":5,"first_pith_cited_at":"2025-05-21T18:24:36+00:00","last_pith_cited_at":"2026-05-21T13:54:29+00:00","author_build_status":"not_needed","summary_status":"needed","contexts_status":"needed","graph_status":"needed","ask_index_status":"not_needed","reader_status":"not_needed","recognition_status":"not_needed","updated_at":"2026-06-10T23:17:16.443734+00:00","tier_text":"hub"},"tier":"hub","role_counts":[{"context_role":"background","n":7}],"polarity_counts":[{"context_polarity":"background","n":6},{"context_polarity":"unclear","n":1}],"runs":{},"summary":{},"graph":{},"authors":[]}}