Decision-Making with Lightweight Confidence-Aware Language Model for Autonomous Driving

Jun Ma; Mingxing Peng; Pei Liu; Ruiguo Zhong; Rui Yang; Ruoyu Yao

arxiv: 2605.25393 · v1 · pith:TP4OLOTQnew · submitted 2026-05-25 · 💻 cs.RO

Ruoyu Yao, Ruiguo Zhong, Pei Liu, Mingxing Peng, Rui Yang, Jun Ma

Pith reviewed 2026-06-29 21:58 UTC · model grok-4.3

classification 💻 cs.RO

keywords autonomous driving · language model · decision making · model distillation · chain of thought · confidence aware · nuplan benchmark · lightweight model

The pith

A lightweight dual-head language model distilled from multi-agent CoT demonstrations achieves state-of-the-art closed-loop success rates on the nuPlan benchmark in both regular and long-tail scenarios while keeping inference latency low.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper seeks to make language-model reasoning practical for autonomous driving by replacing slow, heavy models with a compact version that still produces reliable decisions and explanations. It first runs a team of agents that vote on actions, score their own confidence, and summarize step-by-step reasoning to create clean training examples. Those examples are then used to fine-tune a small dual-head model that outputs both an action probability and a short rationale, with retrieval augmentation added for better data use. If the distillation works as described, autonomous systems could gain open-world reasoning without the compute cost that currently blocks deployment.

Core claim

The central claim is that a multi-agent workflow of action voting, confidence assessment, and summarization agents can generate high-quality, confidence-annotated Chain-of-Thought decision demonstrations that, when distilled via confidence-aware fine-tuning and Retrieval Augmented Generation into a lightweight dual-head language model, enable joint prediction of decision probabilities and textual rationales, producing state-of-the-art success rates on the nuPlan benchmark in both regular and long-tail scenarios at low inference latency.

What carries the argument

The multi-agent collaborative workflow (action voting, confidence assessment, summarization) that produces confidence-annotated CoT demonstrations for distillation into the dual-head lightweight model.
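The workflow can be pictured with a toy sketch. The abstract does not say how votes are aggregated or how confidence is scored, so everything here is an assumption made for illustration: the `Demonstration` record, the majority vote, the use of the agreement ratio as a stand-in confidence score, and the concatenation that stands in for the summarization agent.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Demonstration:
    action: str        # winning action after voting
    confidence: float  # hypothetical score: share of agents that agreed
    rationale: str     # stand-in for the summarization agent's output


def vote(proposals):
    """Majority vote over (action, reasoning) pairs; ties go to the earliest action."""
    counts = Counter(action for action, _ in proposals)
    action, votes = counts.most_common(1)[0]
    return action, votes / len(proposals)


def build_demonstration(proposals):
    action, agreement = vote(proposals)
    # Keep only the reasoning of agents that backed the winning action.
    rationale = " ".join(reason for act, reason in proposals if act == action)
    return Demonstration(action, agreement, rationale)


proposals = [
    ("yield", "Pedestrian is entering the crosswalk."),
    ("yield", "Oncoming gap is too short."),
    ("proceed", "Lane ahead looks clear."),
]
demo = build_demonstration(proposals)
print(demo)
```

In the paper the three roles are played by language-model agents; the sketch only shows the shape of the record that would be handed to distillation.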

If this is right

The approach reaches state-of-the-art success rates in regular driving scenarios on nuPlan.
The approach reaches state-of-the-art success rates in long-tail driving scenarios on nuPlan.
Inference latency remains low enough for resource-constrained autonomous driving systems.
Retrieval Augmented Generation improves the model's adaptability and data efficiency during fine-tuning.
The dual-head design allows simultaneous output of action probabilities and textual rationales.
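A dual-head layout of the kind described in the last point might look like the following minimal sketch. The backbone, head shapes, and vocabulary are not given in the abstract, so the class name `DualHeadSketch`, the hand-set weights, and the two-dimensional feature vector are all hypothetical. The only property illustrated is that both heads read the same shared features.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, features):
    """One dense layer without bias: one output per weight row."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


class DualHeadSketch:
    """Two linear heads on top of one shared feature vector (toy example)."""

    def __init__(self, actions, action_weights, vocab, vocab_weights):
        self.actions = actions
        self.action_weights = action_weights
        self.vocab = vocab
        self.vocab_weights = vocab_weights

    def forward(self, features):
        # Decision head: probability for each candidate action.
        action_probs = softmax(linear(self.action_weights, features))
        # Rationale head: next-token distribution for the textual explanation.
        token_probs = softmax(linear(self.vocab_weights, features))
        return (
            dict(zip(self.actions, action_probs)),
            dict(zip(self.vocab, token_probs)),
        )


model = DualHeadSketch(
    actions=["yield", "proceed"],
    action_weights=[[2.0, 0.0], [0.0, 2.0]],
    vocab=["pedestrian", "clear"],
    vocab_weights=[[1.0, 0.0], [0.0, 1.0]],
)
action_probs, next_token_probs = model.forward([1.0, 0.0])
print(action_probs, next_token_probs)
```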

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same distillation pattern could be tested in other robotics domains that need both an action and a human-readable justification from a small model.
Running the system on physical vehicles instead of simulation would test whether simulation-only demonstrations transfer without additional degradation.
The per-decision confidence scores produced by the model could be monitored in real time to trigger fallback controllers when uncertainty rises.
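The monitoring idea in the last point could be prototyped along these lines. This is an editorial sketch, not something the paper describes: the function name `choose_action`, the 0.6 threshold, and the `fallback_controller` label are assumptions chosen only to make the example concrete.

```python
def choose_action(action_probs, threshold=0.6, fallback="fallback_controller"):
    """Return the top action, or hand over to a fallback when confidence is low."""
    action, confidence = max(action_probs.items(), key=lambda item: item[1])
    if confidence < threshold:
        # The model is not confident enough: defer to a conservative controller.
        return fallback, confidence
    return action, confidence


confident = choose_action({"yield": 0.9, "proceed": 0.1})
uncertain = choose_action({"yield": 0.55, "proceed": 0.45})
print(confident, uncertain)
```

A deployed monitor would probably also need hysteresis or smoothing so that the controller does not flip on every frame; that is left out here.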

Load-bearing premise

The demonstrations created by the multi-agent workflow are high enough in quality and accurately annotated with confidence that distillation into the smaller model preserves performance without major loss.
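One way to see why the confidence annotations matter is a confidence-weighted loss, sketched below. The abstract does not state the form of the confidence-aware fine-tuning objective, so this weighting scheme is an assumption: each sample's negative log-likelihood is scaled by the teacher's confidence, and the function name `confidence_weighted_nll` is hypothetical.

```python
import math


def confidence_weighted_nll(samples):
    """Average negative log-likelihood, weighted by teacher confidence.

    samples: list of (student_prob_of_teacher_action, teacher_confidence).
    """
    total_weight = sum(conf for _, conf in samples)
    if total_weight == 0:
        raise ValueError("at least one sample needs non-zero confidence")
    weighted = sum(-conf * math.log(prob) for prob, conf in samples)
    return weighted / total_weight


# The student disagrees with the second label (probability 0.1).
doubted = confidence_weighted_nll([(0.9, 1.0), (0.1, 0.1)])  # teacher unsure of it
trusted = confidence_weighted_nll([(0.9, 1.0), (0.1, 1.0)])  # teacher sure of it
print(doubted, trusted)
```

Under this scheme a label the teacher doubts pulls the student much less than one it trusts, so a miscalibrated confidence score would misdirect training in exactly the way the premise worries about.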

What would settle it

Closed-loop nuPlan experiments in which the distilled lightweight model either falls short of prior SOTA success rates in long-tail scenarios or shows inference latency too high for real-time control.

Figures

Figures reproduced from arXiv: 2605.25393 by Jun Ma, Mingxing Peng, Pei Liu, Ruiguo Zhong, Rui Yang, Ruoyu Yao.

**Figure 1.** An illustration of our framework, which mainly consists of a multi-agent workflow for collecting memories featuring confidence-aware multimodal ...

**Figure 2.** Qualitative demonstrations of our approach in two representative scenarios. For better visualization, the scores of different planning trajectories ...

**Figure 3.** Quantitative effects of the number of few-shot examples on accuracy, ...

Original abstract

Large Language Models (LLMs) and Multimodal LLMs (MLLMs) have demonstrated immense potential in autonomous driving (AD) by offering human-like reasoning and open-world generalization. However, the excessive computational overhead and high inference latency of these massive models severely hinder their deployment in resource-constrained AD systems. To address this challenge, we propose a novel decision-making framework utilizing a lightweight confidence-aware language model, which bridges the gap between complex multimodal intention reasoning and efficient inference. Specifically, we design a multi-agent collaborative workflow, comprising action voting, confidence assessment, and summarization agents, to generate high-quality, confidence-annotated decision demonstrations via explicit Chain-of-Thought (CoT) reasoning. These demonstrations are then distilled into a lightweight language model featuring a dual-head architecture, enabling the joint prediction of decision probabilities and the generation of textual rationales. The distillation is realized via a confidence-aware fine-tuning strategy coupled with Retrieval Augmented Generation (RAG) to enhance the model's adaptability and data efficiency. Comprehensive closed-loop experiments on the nuPlan benchmark demonstrate that our approach achieves state-of-the-art (SOTA) success rates in both regular and long-tail scenarios while maintaining low inference latency.
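The retrieval step mentioned in the abstract can be illustrated with a small sketch. The paper's embedding model, memory format, and prompt template are not given here, so the two-dimensional embeddings, the `memory` records, and the `build_prompt` template are assumptions. The sketch shows only the general pattern of ranking stored demonstrations by cosine similarity and prepending the closest ones as few-shot examples.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query, memory, k=1):
    """Return the k stored demonstrations most similar to the query embedding."""
    ranked = sorted(memory, key=lambda d: cosine(query, d["embedding"]), reverse=True)
    return ranked[:k]


def build_prompt(scene, examples):
    """Prepend retrieved demonstrations as few-shot examples."""
    shots = "\n".join(
        f"Scene: {d['scene']}\nDecision: {d['action']} (confidence {d['confidence']:.2f})"
        for d in examples
    )
    return f"{shots}\nScene: {scene}\nDecision:"


memory = [
    {"scene": "pedestrian at crosswalk", "embedding": [1.0, 0.0],
     "action": "yield", "confidence": 0.9},
    {"scene": "empty highway", "embedding": [0.0, 1.0],
     "action": "proceed", "confidence": 0.8},
]
examples = retrieve([0.9, 0.1], memory, k=1)
prompt = build_prompt("cyclist near crosswalk", examples)
print(prompt)
```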

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The abstract describes distilling multi-agent LLM reasoning into a lightweight dual-head model for AD decisions, but provides no metrics to support the SOTA claim.

The punchline is that the work applies known techniques like multi-agent collaboration and distillation to create a small model for autonomous driving, but the lack of any reported metrics in the abstract leaves the performance claims untestable.

The paper does introduce a specific setup with agents handling action voting, confidence assessment, and summarization to produce CoT-annotated data. This data then trains a dual-head language model that predicts both decisions and explanations. Adding RAG and confidence-aware fine-tuning is a sensible step to improve the small model's output.

This approach targets a genuine bottleneck in using LLMs for AD: their size makes them unsuitable for onboard use. By focusing on closed-loop nuPlan tests for both normal and rare cases, the authors pick the right evaluation setting.

The main weakness is the complete absence of quantitative results. The abstract says it achieves SOTA success rates with low latency, yet there are no numbers, no baseline comparisons, and no mention of model size or exact latency. Without those, it's not possible to know whether the distillation succeeds or falls short. The assumption that the multi-agent data transfers well to the small model is key but unsupported here.

The abstract also lacks any citations, which makes it hard to see how this differs from prior distillation or RAG work in the field.

This is aimed at the autonomous driving community interested in efficient AI methods. Someone building systems that need both reasoning and speed could take ideas from the pipeline.

Given that the idea engages honestly with deployment realities, it deserves peer review. The full paper probably contains the experiments that would allow proper evaluation.

I would recommend sending it out for review, but flag that the experimental validation needs to be scrutinized closely.

Referee Report

1 major / 0 minor

Summary. The paper proposes a decision-making framework for autonomous driving that employs a multi-agent collaborative workflow (action voting, confidence assessment, summarization) to generate high-quality, confidence-annotated decision demonstrations via explicit Chain-of-Thought reasoning. These demonstrations are distilled into a lightweight dual-head language model that jointly predicts decision probabilities and generates textual rationales, using a confidence-aware fine-tuning strategy combined with Retrieval Augmented Generation (RAG). Closed-loop experiments on the nuPlan benchmark are claimed to achieve state-of-the-art success rates in both regular and long-tail scenarios while maintaining low inference latency.

Significance. If the empirical claims hold with proper validation, the work could meaningfully advance efficient deployment of reasoning-capable models in resource-constrained autonomous driving systems by reducing inference overhead while preserving open-world generalization and safety-critical performance.

major comments (1)

[Abstract] The central claim of achieving SOTA success rates on nuPlan is asserted without any quantitative metrics, baselines, error bars, ablation results, or experimental details, rendering the primary performance contribution impossible to evaluate or verify from the provided manuscript text.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their feedback. We address the concern about the abstract below and will revise the manuscript accordingly.

Referee: [Abstract] The central claim of achieving SOTA success rates on nuPlan is asserted without any quantitative metrics, baselines, error bars, ablation results, or experimental details, rendering the primary performance contribution impossible to evaluate or verify from the provided manuscript text.

Authors: We agree that the abstract, as currently written, asserts the SOTA claim without supporting numerical values or experimental details, which reduces its standalone verifiability. The full manuscript contains the requested information in the experiments section (closed-loop nuPlan results, baseline comparisons, regular vs. long-tail scenarios, latency measurements, and ablations). To address the referee's point directly, we will revise the abstract to incorporate key quantitative metrics, baseline references, and a brief mention of the evaluation protocol.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity

The paper describes an empirical framework for distilling multi-agent CoT demonstrations into a lightweight dual-head LM, evaluated via closed-loop nuPlan experiments. No equations, parameter fittings, derivations, or self-citation chains appear in the abstract or described methods. Claims rest on external benchmark performance rather than any reduction of outputs to inputs by construction. This is the common case of a self-contained empirical contribution with no load-bearing circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no information on free parameters, axioms, or invented entities; assessment limited to high-level description.

pith-pipeline@v0.9.1-grok · 5748 in / 1200 out tokens · 52967 ms · 2026-06-29T21:58:13.283366+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

LUNA-AD: Lightweight Uncertainty-Aware Language Model with Lifelong Learning for Autonomous Driving
cs.RO · 2026-06 · unverdicted · novelty 4.0

LUNA-AD introduces a tri-system model with multi-agent hypothesis exploration, distilled lightweight inference, and reflection-driven lifelong learning that claims state-of-the-art success rates on nuPlan benchmarks w...

Reference graph

Works this paper leans on

42 extracted references · 16 canonical work pages · cited by 1 Pith paper · 8 internal anchors

[1] B. Paden, M. Čáp, S. Z. Yong, D. Yershov, and E. Frazzoli, “A survey of motion planning and control techniques for self-driving urban vehicles,” IEEE Transactions on Intelligent Vehicles, vol. 1, no. 1, pp. 33–55, 2016.
[2] Y. Zhang, Y. Zhu, L. Xiong, and C. Tang, “Active interaction in driving: An intention-aware decision-making for autonomous vehicles,” in the Proceedings of IEEE International Conference on Intelligent Transportation Systems, 2024, pp. 2266–2271.
[3] J. Li, L. Sun, J. Chen, M. Tomizuka, and W. Zhan, “A safe hierarchical planning framework for complex driving scenarios based on reinforcement learning,” in the Proceedings of IEEE International Conference on Robotics and Automation, 2021, pp. 2660–2666.
[4] K. Yang, S. Li, M. Wang, and X. Tang, “Interactive decision-making integrating graph neural networks and model predictive control for autonomous driving,” IEEE Transactions on Intelligent Transportation Systems, vol. 26, no. 5, pp. 6991–7005, 2025.
[5] L. Wen, D. Fu, X. Li, X. Cai, M. Tao, P. Cai, M. Dou, B. Shi, L. He, and Y. Qiao, “DiLu: A knowledge-driven approach to autonomous driving with large language models,” in the Proceedings of International Conference on Learning Representations, 2024.
[6] J. Mei, Y. Ma, X. Yang, L. Wen, X. Cai, X. Li, D. Fu, B. Zhang, P. Cai, M. Dou et al., “Continuously learning, adapting, and improving: A dual-process approach to autonomous driving,” Advances in Neural Information Processing Systems, vol. 37, pp. 123 261–123 290, 2024.
[7] S. Fang, J. Liu, M. Ding, Y. Cui, C. Lv, P. Hang, and J. Sun, “Towards interactive and learnable cooperative driving automation: A large language model-driven decision-making framework,” IEEE Transactions on Vehicular Technology, vol. 74, no. 8, pp. 11 894–11 905, 2025.
[8] C. Cui, Y. Ma, X. Cao, W. Ye, Y. Zhou, K. Liang, J. Chen, J. Lu, Z. Yang, K.-D. Liao et al., “A survey on multimodal large language models for autonomous driving,” in the Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024, pp. 958–979.
[9] H. Sha, Y. Mu, Y. Jiang, L. Chen, C. Xu, P. Luo, S. E. Li, M. Tomizuka, W. Zhan, and M. Ding, “LanguageMPC: Large language models as decision makers for autonomous driving,” arXiv preprint arXiv:2310.03026, 2023.
[10] H. Shao, Y. Hu, L. Wang, G. Song, S. L. Waslander, Y. Liu, and H. Li, “LMDrive: Closed-loop end-to-end driving with large language models,” in the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 15 120–15 130.
[11] B. Jiang, S. Chen, B. Liao, X. Zhang, W. Yin, Q. Zhang, C. Huang, W. Liu, and X. Wang, “Senna: Bridging large vision-language models and end-to-end autonomous driving,” arXiv preprint arXiv:2410.22313, 2024. (internal anchor: Pith review)
[12] W. Liu, P. Liu, and J. Ma, “DSDrive: Distilling large language model for lightweight end-to-end autonomous driving with unified reasoning and planning,” arXiv preprint arXiv:2505.05360, 2025.
[13] P. Cai, Y. Sun, H. Wang, and M. Liu, “Vtgnet: A vision-based trajectory generation network for autonomous vehicles in urban environments,” IEEE Transactions on Intelligent Vehicles, vol. 6, no. 3, pp. 419–429, 2020.
[14] N. Lee, W. Choi, P. Vernaza, C. B. Choy, P. H. Torr, and M. Chandraker, “Desire: Distant future prediction in dynamic scenes with interacting agents,” in the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2017, pp. 336–345.
[15] R. Yao, Y. Wang, H. Liu, R. Yang, Z. Peng, L. Zhu, and J. Ma, “CALMM-Drive: Confidence-aware autonomous driving with large multimodal model,” arXiv preprint arXiv:2412.04209, 2024.
[16] H. Fu, D. Zhang, Z. Zhao, J. Cui, D. Liang, C. Zhang, D. Zhang, H. Xie, B. Wang, and X. Bai, “Orion: A holistic end-to-end autonomous driving framework by vision-language instructed action generation,” arXiv preprint arXiv:2503.19755, 2025. (internal anchor: Pith review)
[17] D. Kahneman, “Fast and slow thinking,” Allen Lane and Penguin Books, New York, 2011.
[18] J. Mao, Y. Qian, J. Ye, H. Zhao, and Y. Wang, “GPT-Driver: Learning to drive with GPT,” arXiv preprint arXiv:2310.01415, 2023. (internal anchor: Pith review)
[19] D. Fu, X. Li, L. Wen, M. Dou, P. Cai, B. Shi, and Y. Qiao, “Drive like a human: Rethinking autonomous driving with large language models,” in the Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024, pp. 910–919.
[20] X. Xu, M. Li, C. Tao, T. Shen, R. Cheng, J. Li, C. Xu, D. Tao, and T. Zhou, “A survey on knowledge distillation of large language models,” arXiv preprint arXiv:2402.13116, 2024. (internal anchor: Pith review)
[21] Y. Ma, T. Wei, N. Zhong, J. Mei, T. Hu, L. Wen, X. Yang, B. Shi, and Y. Liu, “LeapVAD: A leap in autonomous driving via cognitive perception and dual-process thinking,” arXiv preprint arXiv:2501.08168, 2025.
[22] W. Zheng, R. Song, X. Guo, C. Zhang, and L. Chen, “GenAD: Generative end-to-end autonomous driving,” in the Proceedings of European Conference on Computer Vision, 2024, pp. 87–104.
[23] Y. Zheng, R. Liang, K. Zheng, J. Zheng, L. Mao, J. Li, W. Gu, R. Ai, S. E. Li, X. Zhan et al., “Diffusion-based planning for autonomous driving with flexible guidance,” in the Proceedings of International Conference on Learning Representations, 2025.
[24] R. Zhong, R. Yao, P. Liu, X. Chen, R. Yang, and J. Ma, “Coplanner: An interactive motion planner with contingency-aware diffusion for autonomous driving,” arXiv preprint arXiv:2509.17080, 2025.
[25] J. Wang, X. Zhang, Z. Xing, S. Gu, X. Guo, Y. Hu, Z. Song, Q. Zhang, X. Long, and W. Yin, “HE-Drive: Human-like end-to-end driving with vision language models,” arXiv preprint arXiv:2410.05051, 2024.
[26] G. Hinton, O. Vinyals, and J. Dean, “Distilling the knowledge in a neural network,” arXiv preprint arXiv:1503.02531, 2015. (internal anchor: Pith review)
[27] J. Gou, B. Yu, S. J. Maybank, and D. Tao, “Knowledge distillation: A survey,” International journal of computer vision, vol. 129, no. 6, pp. 1789–1819, 2021.
[28] C.-Y. Hsieh, C.-L. Li, C.-K. Yeh, H. Nakhost, Y. Fujii, A. Ratner, R. Krishna, C.-Y. Lee, and T. Pfister, “Distilling step-by-step! outperforming larger language models with less training data and smaller model sizes,” arXiv preprint arXiv:2305.02301, 2023. (internal anchor: Pith review)
[29] Z. Li, K. Li, S. Wang, S. Lan, Z. Yu, Y. Ji, Z. Li, Z. Zhu, J. Kautz, Z. Wu et al., “Hydra-MDP: End-to-end multimodal planning with multi-target hydra-distillation,” arXiv preprint arXiv:2406.06978, 2024. (internal anchor: Pith review)
[30] R. Yu, X. Zhang, R. Zhao, H. Yan, and M. Wang, “DistillDrive: End-to-end multi-mode autonomous driving distillation by isomorphic hetero-source planning model,” in the Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025, pp. 26 188–26 197.
[31] Y. Zhou, P. Xu, X. Wang, B. An, Y. Niu, and X. Liu, “Enhancing trust in large language models with uncertainty-aware fine-tuning,” in the Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 17, 2024, pp. 19 648–19 656.
[32] M. Xiong, Z. Hu, X. Lu, Y. Li, J. Fu, J. He, and B. Hooi, “Can LLMs express their uncertainty? An empirical evaluation of confidence elicitation in LLMs,” in the Proceedings of International Conference on Learning Representations, 2024.
[33] X. Zhu, C. Zhang, T. Stafford, N. Collier, and A. Vlachos, “Conformity in large language models,” in Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025, pp. 3854–3872.
[34] B. Yang, H. Su, N. Gkanatsios, T.-W. Ke, A. Jain, J. Schneider, and K. Fragkiadaki, “Diffusion-ES: Gradient-free planning with diffusion for autonomous and instruction-guided driving,” in the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 15 342–15 353.
[35] D. Helbing and B. Tilch, “Generalized force model of traffic dynamics,” Physical review E, vol. 58, no. 1, p. 133, 1998.
[36] D. Dauner, M. Hallgarten, A. Geiger, and K. Chitta, “Parting with misconceptions about learning-based vehicle motion planning,” in the Proceedings of Conference on Robot Learning, 2023, pp. 1268–1281.
[37] O. Scheel, L. Bergamini, M. Wolczyk, B. Osiński, and P. Ondruska, “Urban Driver: Learning to drive from real-world demonstrations using policy gradients,” in the Proceedings of Conference on Robot Learning, 2022, pp. 718–728.
[38] H. Caesar, J. Kabzan, K. S. Tan, W. K. Fong, E. Wolff, A. Lang, L. Fletcher, O. Beijbom, and S. Omari, “nuPlan: A closed-loop ML-based planning benchmark for autonomous vehicles,” arXiv preprint arXiv:2106.11810, 2021. (internal anchor: Pith review)
[39] J. Cheng, Y. Chen, X. Mei, B. Yang, B. Li, and M. Liu, “Rethinking imitation-based planners for autonomous driving,” in the Proceedings of IEEE International Conference on Robotics and Automation, 2024, pp. 14 123–14 130.
[40] Z. Huang, H. Liu, and C. Lv, “GameFormer: Game-theoretic modeling and learning of transformer-based interactive prediction and planning for autonomous driving,” in the Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 3903–3913.
[41] J. Cheng, Y. Chen, and Q. Chen, “PLUTO: Pushing the limit of imitation learning-based planning for autonomous driving,” arXiv preprint arXiv:2404.14327, 2024.
[42] Y. Zheng, Z. Xing, Q. Zhang, B. Jin, P. Li, Y. Zheng, Z. Xia, K. Zhan, X. Lang, Y. Chen et al., “PlanAgent: A multi-modal large language agent for closed-loop vehicle motion planning,” arXiv preprint arXiv:2406.01587, 2024.

[1] [1]

A survey of motion planning and control techniques for self-driving urban vehicles,

B. Paden, M. ˇC´ap, S. Z. Yong, D. Yershov, and E. Frazzoli, “A survey of motion planning and control techniques for self-driving urban vehicles,”IEEE Transactions on Intelligent Vehicles, vol. 1, no. 1, pp. 33–55, 2016

2016

[2] [2]

Active interaction in driv- ing: An intention-aware decision-making for autonomous vehicles,

Y . Zhang, Y . Zhu, L. Xiong, and C. Tang, “Active interaction in driv- ing: An intention-aware decision-making for autonomous vehicles,” inthe Proceedings of IEEE International Conference on Intelligent Transportation Systems, 2024, pp. 2266–2271

2024

[3] [3]

A safe hierarchical planning framework for complex driving scenarios based on reinforce- ment learning,

J. Li, L. Sun, J. Chen, M. Tomizuka, and W. Zhan, “A safe hierarchical planning framework for complex driving scenarios based on reinforce- ment learning,” inthe Proceedings of IEEE International Conference on Robotics and Automation, 2021, pp. 2660–2666

2021

[4] [4]

Interactive decision-making integrating graph neural networks and model predictive control for autonomous driving,

K. Yang, S. Li, M. Wang, and X. Tang, “Interactive decision-making integrating graph neural networks and model predictive control for autonomous driving,”IEEE Transactions on Intelligent Transportation Systems, vol. 26, no. 5, pp. 6991 – 7005, 2025

2025

[5] [5]

DiLu: A knowledge-driven approach to autonomous driv- ing with large language models,

L. Wen, D. Fu, X. Li, X. Cai, M. Tao, P. Cai, M. Dou, B. Shi, L. He, and Y . Qiao, “DiLu: A knowledge-driven approach to autonomous driv- ing with large language models,” inthe Proceedings of International Conference on Learning Representations, 2024

2024

[6] [6]

Continuously learning, adapting, and improving: A dual-process approach to autonomous driving,

J. Mei, Y . Ma, X. Yang, L. Wen, X. Cai, X. Li, D. Fu, B. Zhang, P. Cai, M. Douet al., “Continuously learning, adapting, and improving: A dual-process approach to autonomous driving,”Advances in Neural Information Processing Systems, vol. 37, pp. 123 261–123 290, 2024

2024

[7] [7]

Towards interactive and learnable cooperative driving automation: A large language model-driven decision-making framework,

S. Fang, J. Liu, M. Ding, Y . Cui, C. Lv, P. Hang, and J. Sun, “Towards interactive and learnable cooperative driving automation: A large language model-driven decision-making framework,”IEEE Transactions on Vehicular Technology, vol. 74, no. 8, pp. 11 894– 11 905, 2025

2025

[8] [8]

A survey on multimodal large language models for autonomous driving,

C. Cui, Y . Ma, X. Cao, W. Ye, Y . Zhou, K. Liang, J. Chen, J. Lu, Z. Yang, K.-D. Liaoet al., “A survey on multimodal large language models for autonomous driving,” inthe Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024, pp. 958– 979

2024

[9] [9]

LanguageMPC: Large language models as decision makers for autonomous driving,

H. Sha, Y . Mu, Y . Jiang, L. Chen, C. Xu, P. Luo, S. E. Li, M. Tomizuka, W. Zhan, and M. Ding, “LanguageMPC: Large language models as decision makers for autonomous driving,”arXiv preprint arXiv:2310.03026, 2023

work page arXiv 2023

[10] [10]

LMDrive: Closed-loop end-to-end driving with large language models,

H. Shao, Y . Hu, L. Wang, G. Song, S. L. Waslander, Y . Liu, and H. Li, “LMDrive: Closed-loop end-to-end driving with large language models,” inthe Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 15 120–15 130

2024

[11] [11]

Senna: Bridging Large Vision-Language Models and End-to-End Autonomous Driving

B. Jiang, S. Chen, B. Liao, X. Zhang, W. Yin, Q. Zhang, C. Huang, W. Liu, and X. Wang, “Senna: Bridging large vision-language models and end-to-end autonomous driving,”arXiv preprint arXiv:2410.22313, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024

[12] [12]

DSDrive: Distilling large language model for lightweight end-to-end autonomous driving with unified reasoning and planning,

W. Liu, P. Liu, and J. Ma, “DSDrive: Distilling large language model for lightweight end-to-end autonomous driving with unified reasoning and planning,”arXiv preprint arXiv:2505.05360, 2025

work page arXiv 2025

[13] [13]

Vtgnet: A vision-based trajectory generation network for autonomous vehicles in urban environments,

P. Cai, Y . Sun, H. Wang, and M. Liu, “Vtgnet: A vision-based trajectory generation network for autonomous vehicles in urban environments,” IEEE Transactions on Intelligent Vehicles, vol. 6, no. 3, pp. 419–429, 2020

2020

[14] [14]

Desire: Distant future prediction in dynamic scenes with interacting agents,

N. Lee, W. Choi, P. Vernaza, C. B. Choy, P. H. Torr, and M. Chan- draker, “Desire: Distant future prediction in dynamic scenes with interacting agents,” inthe Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2017, pp. 336–345

2017

[15] [15]

CALMM-Drive: Confidence-aware autonomous driving with large multimodal model,

R. Yao, Y . Wang, H. Liu, R. Yang, Z. Peng, L. Zhu, and J. Ma, “CALMM-Drive: Confidence-aware autonomous driving with large multimodal model,”arXiv preprint arXiv:2412.04209, 2024

work page arXiv 2024

[16] [16]

ORION: A Holistic End-to-End Autonomous Driving Framework by Vision-Language Instructed Action Generation

H. Fu, D. Zhang, Z. Zhao, J. Cui, D. Liang, C. Zhang, D. Zhang, H. Xie, B. Wang, and X. Bai, “Orion: A holistic end-to-end au- tonomous driving framework by vision-language instructed action generation,”arXiv preprint arXiv:2503.19755, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025

[17] D. Kahneman, “Fast and slow thinking,” Allen Lane and Penguin Books, New York, 2011.

[18] J. Mao, Y. Qian, J. Ye, H. Zhao, and Y. Wang, “GPT-Driver: Learning to drive with GPT,” arXiv preprint arXiv:2310.01415, 2023.

[19] D. Fu, X. Li, L. Wen, M. Dou, P. Cai, B. Shi, and Y. Qiao, “Drive like a human: Rethinking autonomous driving with large language models,” in the Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024, pp. 910–919.

[20] X. Xu, M. Li, C. Tao, T. Shen, R. Cheng, J. Li, C. Xu, D. Tao, and T. Zhou, “A survey on knowledge distillation of large language models,” arXiv preprint arXiv:2402.13116, 2024.

[21] Y. Ma, T. Wei, N. Zhong, J. Mei, T. Hu, L. Wen, X. Yang, B. Shi, and Y. Liu, “LeapVAD: A leap in autonomous driving via cognitive perception and dual-process thinking,” arXiv preprint arXiv:2501.08168, 2025.

[22] W. Zheng, R. Song, X. Guo, C. Zhang, and L. Chen, “GenAD: Generative end-to-end autonomous driving,” in the Proceedings of European Conference on Computer Vision, 2024, pp. 87–104.

[23] Y. Zheng, R. Liang, K. Zheng, J. Zheng, L. Mao, J. Li, W. Gu, R. Ai, S. E. Li, X. Zhan et al., “Diffusion-based planning for autonomous driving with flexible guidance,” in the Proceedings of International Conference on Learning Representations, 2025.

[24] R. Zhong, R. Yao, P. Liu, X. Chen, R. Yang, and J. Ma, “Coplanner: An interactive motion planner with contingency-aware diffusion for autonomous driving,” arXiv preprint arXiv:2509.17080, 2025.

[25] J. Wang, X. Zhang, Z. Xing, S. Gu, X. Guo, Y. Hu, Z. Song, Q. Zhang, X. Long, and W. Yin, “HE-Drive: Human-like end-to-end driving with vision language models,” arXiv preprint arXiv:2410.05051, 2024.

[26] G. Hinton, O. Vinyals, and J. Dean, “Distilling the knowledge in a neural network,” arXiv preprint arXiv:1503.02531, 2015.

[27] J. Gou, B. Yu, S. J. Maybank, and D. Tao, “Knowledge distillation: A survey,” International Journal of Computer Vision, vol. 129, no. 6, pp. 1789–1819, 2021.

[28] C.-Y. Hsieh, C.-L. Li, C.-K. Yeh, H. Nakhost, Y. Fujii, A. Ratner, R. Krishna, C.-Y. Lee, and T. Pfister, “Distilling step-by-step! Outperforming larger language models with less training data and smaller model sizes,” arXiv preprint arXiv:2305.02301, 2023.

[29] Z. Li, K. Li, S. Wang, S. Lan, Z. Yu, Y. Ji, Z. Li, Z. Zhu, J. Kautz, Z. Wu et al., “Hydra-MDP: End-to-end multimodal planning with multi-target hydra-distillation,” arXiv preprint arXiv:2406.06978, 2024.

[30] R. Yu, X. Zhang, R. Zhao, H. Yan, and M. Wang, “DistillDrive: End-to-end multi-mode autonomous driving distillation by isomorphic hetero-source planning model,” in the Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025, pp. 26188–26197.

[31] Y. Zhou, P. Xu, X. Wang, B. An, Y. Niu, and X. Liu, “Enhancing trust in large language models with uncertainty-aware fine-tuning,” in the Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 17, 2024, pp. 19648–19656.

[32] M. Xiong, Z. Hu, X. Lu, Y. Li, J. Fu, J. He, and B. Hooi, “Can LLMs express their uncertainty? An empirical evaluation of confidence elicitation in LLMs,” in the Proceedings of International Conference on Learning Representations, 2024.

[33] X. Zhu, C. Zhang, T. Stafford, N. Collier, and A. Vlachos, “Conformity in large language models,” in Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025, pp. 3854–3872.

[34] B. Yang, H. Su, N. Gkanatsios, T.-W. Ke, A. Jain, J. Schneider, and K. Fragkiadaki, “Diffusion-ES: Gradient-free planning with diffusion for autonomous and instruction-guided driving,” in the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 15342–15353.

[35] D. Helbing and B. Tilch, “Generalized force model of traffic dynamics,” Physical Review E, vol. 58, no. 1, p. 133, 1998.

[36] D. Dauner, M. Hallgarten, A. Geiger, and K. Chitta, “Parting with misconceptions about learning-based vehicle motion planning,” in the Proceedings of Conference on Robot Learning, 2023, pp. 1268–1281.

[37] O. Scheel, L. Bergamini, M. Wolczyk, B. Osiński, and P. Ondruska, “Urban Driver: Learning to drive from real-world demonstrations using policy gradients,” in the Proceedings of Conference on Robot Learning, 2022, pp. 718–728.

[38] H. Caesar, J. Kabzan, K. S. Tan, W. K. Fong, E. Wolff, A. Lang, L. Fletcher, O. Beijbom, and S. Omari, “nuPlan: A closed-loop ML-based planning benchmark for autonomous vehicles,” arXiv preprint arXiv:2106.11810, 2021.

[39] J. Cheng, Y. Chen, X. Mei, B. Yang, B. Li, and M. Liu, “Rethinking imitation-based planners for autonomous driving,” in the Proceedings of IEEE International Conference on Robotics and Automation, 2024, pp. 14123–14130.

[40] Z. Huang, H. Liu, and C. Lv, “GameFormer: Game-theoretic modeling and learning of transformer-based interactive prediction and planning for autonomous driving,” in the Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 3903–3913.

[41] J. Cheng, Y. Chen, and Q. Chen, “PLUTO: Pushing the limit of imitation learning-based planning for autonomous driving,” arXiv preprint arXiv:2404.14327, 2024.

[42] Y. Zheng, Z. Xing, Q. Zhang, B. Jin, P. Li, Y. Zheng, Z. Xia, K. Zhan, X. Lang, Y. Chen et al., “PlanAgent: A multi-modal large language agent for closed-loop vehicle motion planning,” arXiv preprint arXiv:2406.01587, 2024.