TUMTraffic-VideoQA: A benchmark for unified spatio-temporal video understanding in traffic scenes. arXiv preprint arXiv:2502.02449, 2025b
4 Pith papers cite this work.
citing papers explorer
- CrashSight: A Phase-Aware, Infrastructure-Centric Video Benchmark for Traffic Crash Scene Understanding and Reasoning
  CrashSight is a new infrastructure-focused benchmark showing that state-of-the-art vision-language models can describe crash scenes but fail at temporal and causal reasoning.
- NuRisk: A Visual Question Answering Dataset for Agent-Level Risk Assessment in Autonomous Driving
  NuRisk is a new VQA dataset for agent-level risk assessment in autonomous driving that benchmarks VLMs at 33% peak accuracy and shows a fine-tuned 7B model reaching 41% with 75% lower latency.
- Towards Safe Mobility: A Unified Transportation Foundation Model enabled by Open-Ended Vision-Language Dataset
  The paper creates the LTD dataset for open-ended traffic VQA and trains the UniVLT model, which achieves SOTA on unified microscopic AD and macroscopic traffic reasoning tasks.
- Descriptor: Distance-Annotated Traffic Perception Question Answering (DTPQA)
  DTPQA is a new VQA benchmark consisting of synthetic and real-world traffic images with distance annotations to isolate and measure VLM perception capabilities for driving decisions.