arxiv: 2604.24562 · v1 · submitted 2026-04-27 · 💻 cs.AI · cs.CL · cs.CY


Towards Lawful Autonomous Driving: Deriving Scenario-Aware Driving Requirements from Traffic Laws and Regulations

Bowen Jian, Hong Wang, Liqiang Wang, Rongjie Yu, Zihang Zou


Pith reviewed 2026-05-08 03:24 UTC · model grok-4.3

classification 💻 cs.AI · cs.CL · cs.CY

keywords autonomous vehicles · traffic laws · large language models · scenario taxonomy · legal compliance · driving requirements · AV navigation · scenario anchors


The pith

A traffic scenario taxonomy with node-wise anchors grounds LLMs to derive accurate mandatory and prohibitive driving requirements from traffic laws for autonomous vehicles.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Autonomous vehicles need to follow traffic laws in many different situations, yet manually writing formal rules for every case is slow and difficult to update. The paper shows that structuring scenarios into a taxonomy and using node-wise anchors to guide large language models helps the models pick the right legal provisions and turn them into clear requirements. This avoids the common problem where models pull in irrelevant rules or overlook applicable ones. When the approach works, it gives a practical way to build law-compliant behavior into AV systems at scale. The authors demonstrate this by creating a compliance layer for navigation and a real-time onboard monitor.

Core claim

The paper establishes that a pipeline grounding LLM reasoning in a traffic scenario taxonomy through node-wise anchors encoding hierarchical semantics improves law-scenario matching by 29.1 percent and raises accuracy of derived mandatory and prohibitive requirements by 36.9 percent and 38.2 percent respectively when tested on Chinese traffic laws and the OnSite dataset of 5,897 scenarios. It further shows real-world use by building a law-compliance layer for AV navigation and an onboard real-time compliance monitor for in-field testing.

What carries the argument

Traffic scenario taxonomy with node-wise anchors that encode hierarchical semantics to guide LLM reasoning over legal provisions
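The idea of node-wise anchors can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class `ScenarioNode`, the anchor strings, the prompt wording, and the placeholder provisions are hypothetical and are not taken from the paper. The sketch only shows how anchors collected along a root-to-node path could give an LLM the hierarchical context of a scenario before it is asked to select provisions.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ScenarioNode:
    """One node in a hypothetical traffic scenario taxonomy."""
    name: str
    anchor: str  # short text describing this node's semantics (assumed format)
    parent: Optional[ScenarioNode] = field(default=None, repr=False, compare=False)
    children: list[ScenarioNode] = field(default_factory=list)

    def add_child(self, name: str, anchor: str) -> ScenarioNode:
        """Create a child node and link it to this node."""
        child = ScenarioNode(name=name, anchor=anchor, parent=self)
        self.children.append(child)
        return child

    def anchor_path(self) -> list[str]:
        """Return the anchors from the root down to this node."""
        path: list[str] = []
        node: Optional[ScenarioNode] = self
        while node is not None:
            path.append(node.anchor)
            node = node.parent
        return list(reversed(path))


def build_prompt(node: ScenarioNode, provisions: list[str]) -> str:
    """Compose a grounding prompt from the node's anchor path (illustrative wording)."""
    context = " > ".join(node.anchor_path())
    numbered = "\n".join(f"{i}. {text}" for i, text in enumerate(provisions, start=1))
    return (
        f"Scenario context: {context}\n"
        f"Candidate provisions:\n{numbered}\n"
        "List the provisions that apply to this scenario."
    )


# Toy taxonomy; node names and anchors are invented for this example.
root = ScenarioNode("road", "driving on public roads")
intersection = root.add_child("intersection", "at a signalized intersection")
left_turn = intersection.add_child("left_turn", "turning left across oncoming traffic")

# Placeholder provision texts, not real legal text.
prompt = build_prompt(left_turn, ["(placeholder provision A)", "(placeholder provision B)"])
```

The prompt string would then be sent to whichever LLM is in use; that call is left out because the page does not describe the paper's actual prompting interface.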

If this is right

A law-compliance layer can be added to existing AV navigation planners
An onboard real-time monitor can check compliance during actual driving
The approach supplies a scalable base for AV development and regulatory review
Mandatory and prohibitive requirements become more reliably tied to specific scenarios
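As a rough illustration of how derived mandatory and prohibitive requirements might feed a compliance check, the sketch below treats each requirement as a predicate over a vehicle state. The types `VehicleState` and `Requirement`, the state fields, and both example rules are hypothetical; they are not the paper's monitor and do not quote any real traffic law.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class VehicleState:
    """Hypothetical snapshot of the quantities a monitor might observe."""
    speed_kmh: float
    in_crosswalk: bool
    signal: str  # "red", "yellow", or "green"


@dataclass(frozen=True)
class Requirement:
    """A requirement expressed as a condition over the vehicle state."""
    kind: str  # "mandatory" or "prohibitive"
    description: str
    condition: Callable[[VehicleState], bool]

    def violated(self, state: VehicleState) -> bool:
        holds = self.condition(state)
        # Mandatory: the condition must hold. Prohibitive: it must not hold.
        return (not holds) if self.kind == "mandatory" else holds


def check(state: VehicleState, requirements: list[Requirement]) -> list[str]:
    """Return descriptions of all requirements violated in this state."""
    return [r.description for r in requirements if r.violated(state)]


# Invented example rules, used only to exercise the two requirement kinds.
requirements = [
    Requirement("mandatory", "keep speed at or below 30 km/h",
                lambda s: s.speed_kmh <= 30.0),
    Requirement("prohibitive", "occupy the crosswalk on a red signal",
                lambda s: s.in_crosswalk and s.signal == "red"),
]

violations = check(VehicleState(speed_kmh=45.0, in_crosswalk=True, signal="red"), requirements)
no_violations = check(VehicleState(speed_kmh=20.0, in_crosswalk=False, signal="green"), requirements)
```

Keeping the two kinds in one type with opposite polarity means a single loop can evaluate both, which is one plausible design for a per-frame onboard check.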

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same taxonomy structure could be adapted to traffic regulations in other countries to test broader use
Linking the derived requirements directly to real-time perception data would let AVs check compliance while driving
Expanding the taxonomy to include rare edge scenarios would reveal whether the accuracy gains persist

Load-bearing premise

The traffic scenario taxonomy and node-wise anchors capture the full range of real-world driving variability while keeping LLM outputs accurate and complete when applied outside the tested Chinese laws and OnSite dataset.

What would settle it

Running the same pipeline on traffic laws from a different jurisdiction or on a new dataset with substantially different scenarios and finding no gain in matching rate or requirement accuracy over plain LLM prompting would show the central claim does not hold.

Original abstract

Driving in compliance with traffic laws and regulations is a basic requirement for human drivers, yet autonomous vehicles (AVs) can violate these requirements in diverse real-world scenarios. To encode law compliance into AV systems, conventional approaches use formal logic languages to explicitly specify behavioral constraints, but this process is labor-intensive, hard to scale, and costly to maintain. With recent advances in artificial intelligence, it is promising to leverage large language models (LLMs) to derive legal requirements from traffic laws and regulations. However, without explicitly grounding and reasoning in structured traffic scenarios, LLMs often retrieve irrelevant provisions or miss applicable ones, yielding imprecise requirements. To address this, we propose a novel pipeline that grounds LLM reasoning in a traffic scenario taxonomy through node-wise anchors that encode hierarchical semantics. On Chinese traffic laws and OnSite dataset (5,897 scenarios), our method improves law-scenario matching by 29.1% and increases the accuracy of derived mandatory and prohibitive requirements by 36.9% and 38.2%, respectively. We further demonstrate real-world applicability by constructing a law-compliance layer for AV navigation and developing an onboard, real-time compliance monitor for in-field testing, providing a solid foundation for future AV development, deployment, and regulatory oversight.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper gives a concrete LLM pipeline for pulling scenario-specific rules out of traffic laws for AVs, with reported gains on Chinese data and a working prototype, but the taxonomy's fit outside that single jurisdiction is untested.


This paper gives a concrete way to ground LLMs in a traffic scenario taxonomy so they pull out the right legal requirements for autonomous driving instead of missing applicable rules or grabbing irrelevant ones. On Chinese laws paired with the OnSite set of 5,897 scenarios, the method lifts law-scenario matching by 29 percent and improves accuracy on mandatory and prohibitive requirements by 37 and 38 percent. The authors also built a compliance layer for navigation and an onboard real-time monitor for in-field testing, which moves the work past pure extraction into something closer to deployment use.

Referee Report

2 major / 1 minor

Summary. The paper proposes a pipeline to derive scenario-aware driving requirements from traffic laws by grounding LLM reasoning in a hierarchical traffic scenario taxonomy using node-wise anchors. On Chinese traffic laws and the OnSite dataset of 5,897 scenarios, it reports a 29.1% improvement in law-scenario matching along with 36.9% and 38.2% gains in accuracy for mandatory and prohibitive requirements; it further constructs a law-compliance layer for AV navigation and an onboard real-time monitor demonstrated via in-field testing.

Significance. If the grounding mechanism proves robust, the approach could offer a more scalable alternative to manual formal-logic encoding of traffic rules for autonomous vehicles, with direct implications for compliance monitoring and regulatory integration. The inclusion of a deployed real-time monitor provides a concrete path from derivation to operational use.

major comments (2)

[Experiments] Experiments section: the reported gains (29.1% matching, 36.9%/38.2% accuracy) are presented without specifying the baseline methods, the precise definition or measurement protocol for 'accuracy' of derived requirements, error bars, or statistical significance tests, rendering the central empirical claims difficult to verify or reproduce.
[Method] Method section: the scenario taxonomy and node-wise anchors are presented as the key grounding innovation, yet no ablation studies, derivation details independent of the test laws, or cross-jurisdiction validation are provided; this makes it impossible to determine whether the observed improvements stem from the proposed structure or from dataset-specific tuning.

minor comments (1)

[Abstract] The abstract would benefit from a brief sentence outlining the main pipeline stages before stating the quantitative results.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their constructive and detailed feedback on our manuscript. We address each major comment point by point below, indicating where revisions will be made to improve clarity, reproducibility, and rigor.


Referee: [Experiments] Experiments section: the reported gains (29.1% matching, 36.9%/38.2% accuracy) are presented without specifying the baseline methods, the precise definition or measurement protocol for 'accuracy' of derived requirements, error bars, or statistical significance tests, rendering the central empirical claims difficult to verify or reproduce.

Authors: We agree that the experimental reporting requires additional detail for full reproducibility and verifiability. In the revised manuscript, we will specify the baseline methods (direct LLM prompting without grounding and a keyword-based retrieval baseline), define accuracy as the fraction of derived requirements that match expert-verified ground truth (where experts assess whether mandatory and prohibitive requirements correctly capture applicable legal obligations for each scenario), report standard deviations across multiple runs with varied random seeds, and include statistical significance testing (paired t-tests) for the observed improvements. revision: yes
Referee: [Method] Method section: the scenario taxonomy and node-wise anchors are presented as the key grounding innovation, yet no ablation studies, derivation details independent of the test laws, or cross-jurisdiction validation are provided; this makes it impossible to determine whether the observed improvements stem from the proposed structure or from dataset-specific tuning.

Authors: The scenario taxonomy is derived from standard hierarchical traffic engineering classifications (e.g., based on road type, maneuver, and environmental factors) and is described in Section 3.1 as independent of the specific test laws. Node-wise anchors encode level-specific semantics to ground LLM reasoning. We will add ablation studies in the revision, comparing the full pipeline against variants without the hierarchy and without node-wise anchors. We will also clarify the taxonomy derivation process. However, cross-jurisdiction validation is not feasible in the current work due to the absence of equivalent annotated datasets. revision: partial
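
The evaluation protocol promised in the first response (accuracy as the fraction of derived requirements matching expert-verified ground truth, plus a paired t-test across runs) can be sketched as follows. This is a simplified illustration under stated assumptions: matching is reduced to exact string membership, whereas the response describes expert assessment, and the function names and sample scores are hypothetical. Only the t statistic is computed; a p-value would additionally need the t distribution with n - 1 degrees of freedom.

```python
import math
import statistics


def accuracy(derived: list[str], ground_truth: set[str]) -> float:
    """Fraction of derived requirements found in the expert-verified set."""
    if not derived:
        return 0.0
    return sum(req in ground_truth for req in derived) / len(derived)


def paired_t_statistic(method_scores: list[float], baseline_scores: list[float]) -> float:
    """Paired t statistic for per-run score differences (method minus baseline)."""
    if len(method_scores) != len(baseline_scores):
        raise ValueError("score lists must have the same length")
    diffs = [m - b for m, b in zip(method_scores, baseline_scores)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two paired runs")
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance; t statistic is undefined")
    return statistics.mean(diffs) / (sd / math.sqrt(n))


# Made-up values, only to show the call pattern.
example_accuracy = accuracy(["r1", "r2", "r3", "r4"], {"r1", "r2", "r3"})
example_t = paired_t_statistic([0.8, 0.9, 0.7], [0.6, 0.8, 0.5])
```

Pairing the runs by seed, as the response implies, removes between-seed variance from the comparison, which is why the statistic is computed on the differences rather than on the two score lists separately.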

standing simulated objections not resolved

Cross-jurisdiction validation, as no equivalent annotated scenario datasets from other legal systems are currently available to the authors.

Circularity Check

0 steps flagged

No circularity: empirical gains are direct comparisons of a proposed pipeline against baselines


The paper introduces a novel grounding pipeline (scenario taxonomy + node-wise anchors) for LLM-based requirement derivation and evaluates it via direct accuracy and matching metrics on the OnSite dataset against baseline methods. No equations, fitted parameters renamed as predictions, self-citations that bear the central claim, or ansatzes imported from prior author work appear in the derivation. The reported 29.1%/36.9%/38.2% improvements are presented as empirical outcomes of the new method, not reductions to the inputs by construction. The derivation chain is therefore self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the untested assumption that the chosen taxonomy plus anchors sufficiently represent real traffic variability and that LLMs produce faithful requirements once grounded; no explicit free parameters or new physical entities are introduced.

axioms (1)

domain assumption: LLMs can retrieve and apply relevant legal provisions accurately when provided with structured scenario anchors encoding hierarchical semantics
Invoked to justify the pipeline's improvement over ungrounded LLM use.

invented entities (1)

node-wise anchors · no independent evidence
purpose: Encode hierarchical semantics of traffic scenarios to guide LLM reasoning toward applicable laws
New construct introduced in the proposed pipeline to address irrelevant or missed provisions.

pith-pipeline@v0.9.0 · 5535 in / 1320 out tokens · 32344 ms · 2026-05-08T03:24:00.912899+00:00 · methodology

