The Road Ahead in Autonomous Driving: The KITScenes Multimodal Dataset
Pith reviewed 2026-06-28 14:36 UTC · model grok-4.3
The pith
A new multimodal dataset supplies the most complete HD maps of any public autonomous driving collection, with traffic elements placed in accurate 3D and fully connected.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Our HD maps are, to our knowledge, the most complete of any sensor dataset, validated through autonomous driving trials on open-source software. For the first time in a public dataset, all driving-relevant traffic elements, such as traffic lights, are mapped in 3D to a reprojection-accurate level with full topological connectivity.
What carries the argument
The HD maps that locate every driving-relevant traffic element in 3D at reprojection-accurate positions and maintain full topological connectivity among them.
If this is right
- Algorithms for end-to-end driving can be evaluated against maps that contain every traffic element in connected 3D form.
- Online HD map construction and long-range depth estimation benchmarks become available on data from irregular city layouts.
- Novel view synthesis methods can be tested on synchronized multimodal recordings that include the new map layer.
- Training sets gain coverage of mixed-traffic European streets that differ from grid-based collections.
Where Pith is reading between the lines
- If the maps hold up under wider testing, planners could rely on them to simulate complex intersections that current datasets omit.
- The sensor synchronization details might be reused to reduce timing errors in other multimodal collections.
- Adding more cities with similar map density would allow direct measurement of how map completeness affects model generalization.
Load-bearing premise
The maps reach reprojection accuracy and complete connectivity only if the sensor synchronization and mapping steps contain no systematic errors. That assumption is supported solely by the driving-trial validation, whose details are not shown.
What would settle it
A set of camera images in which the projected positions of mapped traffic lights deviate from their visible locations or a driving test that reveals missing road connections in the supplied topology.
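The image-side check can be sketched in a few lines. The example below is a minimal illustration, not the authors' pipeline: it assumes an ideal pinhole camera without distortion, points already expressed in the camera frame (x right, y down, z forward), and hand-labelled pixel positions. The function names, intrinsics, and sample values are all hypothetical.

```python
import math


def project_point(point_cam, fx, fy, cx, cy):
    """Project a 3D point in camera coordinates to pixels (ideal pinhole, assumed)."""
    x, y, z = point_cam
    if z <= 0:
        return None  # point lies behind the camera and cannot be projected
    return (fx * x / z + cx, fy * y / z + cy)


def reprojection_errors(points_cam, observed_px, fx, fy, cx, cy):
    """Pixel distance between each projected map element and its observed location."""
    errors = []
    for point, observed in zip(points_cam, observed_px):
        projected = project_point(point, fx, fy, cx, cy)
        if projected is None:
            continue  # skip elements that are not in front of the camera
        errors.append(math.hypot(projected[0] - observed[0],
                                 projected[1] - observed[1]))
    return errors


def mean_reprojection_error(points_cam, observed_px, fx, fy, cx, cy):
    """Mean pixel error over all projectable elements, NaN if there are none."""
    errors = reprojection_errors(points_cam, observed_px, fx, fy, cx, cy)
    return sum(errors) / len(errors) if errors else float("nan")


# Hypothetical traffic-light positions (metres, camera frame) and labelled pixels.
lights_cam = [(1.0, -2.0, 20.0), (0.0, 0.0, 10.0)]
labels_px = [(1013.0, 504.0), (960.0, 600.0)]
print(mean_reprojection_error(lights_cam, labels_px, 1000.0, 1000.0, 960.0, 600.0))
```

A per-element threshold on these errors, rather than the mean alone, would show whether a few badly placed elements hide behind a good average. A real evaluation would also have to use the dataset's actual camera model and extrinsic calibration, which the pinhole assumption here does not capture.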
Original abstract
Existing autonomous driving datasets have enabled major progress, but fall short in sensor fidelity, map completeness, or geographic diversity. We present KITScenes Multimodal, a European dataset built around high-fidelity sensors and maps. Our fully synchronized sensor suite combines high-resolution global-shutter cameras, long-range lidar beyond 400m, 4D imaging radar, and redundant GNSS/INS localization. Our HD maps are, to our knowledge, the most complete of any sensor dataset, validated through autonomous driving trials on open-source software. For the first time in a public dataset, all driving-relevant traffic elements, such as traffic lights, are mapped in 3D to a reprojection-accurate level with full topological connectivity. Recorded in cities with irregular street layouts and mixed traffic modes, our dataset complements existing datasets by broadening the available geographic diversity. We also introduce four benchmarks, each advancing spatial learning for embodied AI: online HD map construction, long-range depth estimation, novel view synthesis, and end-to-end driving. Project page: https://kitscenes.com/
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces KITScenes Multimodal, a new European autonomous driving dataset featuring a synchronized high-fidelity sensor suite (high-resolution global-shutter cameras, long-range lidar >400m, 4D imaging radar, redundant GNSS/INS) and claims the most complete HD maps of any public sensor dataset. These maps include all driving-relevant traffic elements (e.g., traffic lights) mapped in 3D to a reprojection-accurate level with full topological connectivity, validated via autonomous driving trials on open-source software. The dataset targets geographic diversity in irregular street layouts and mixed traffic, and introduces four benchmarks: online HD map construction, long-range depth estimation, novel view synthesis, and end-to-end driving.
Significance. If the HD map completeness and accuracy claims hold with supporting evidence, the dataset would meaningfully complement existing ones by expanding geographic coverage and providing uniquely detailed 3D topological maps, potentially enabling stronger progress on the four proposed benchmarks for embodied AI and spatial learning in autonomous driving.
major comments (1)
- [Abstract] The central claim that the HD maps are 'the most complete of any sensor dataset' and achieve 'reprojection-accurate level' 3D mapping of traffic elements with 'full topological connectivity' for the first time in a public dataset rests on validation through autonomous driving trials, yet no quantitative metrics (e.g., reprojection error thresholds, connectivity success rates, or error statistics) or pipeline details are provided to substantiate this. This directly undermines assessment of the 'most complete' and 'first time' assertions.
Simulated Author's Rebuttal
We thank the referee for their constructive feedback on our manuscript. We address the major comment regarding the substantiation of our HD map claims below, and commit to revisions that strengthen the paper without altering its core contributions.
Point-by-point responses
Referee: [Abstract] The central claim that the HD maps are 'the most complete of any sensor dataset' and achieve 'reprojection-accurate level' 3D mapping of traffic elements with 'full topological connectivity' for the first time in a public dataset rests on validation through autonomous driving trials, yet no quantitative metrics (e.g., reprojection error thresholds, connectivity success rates, or error statistics) or pipeline details are provided to substantiate this. This directly undermines assessment of the 'most complete' and 'first time' assertions.
Authors: We agree that the abstract would be strengthened by explicit quantitative metrics and pipeline details to support the claims. The full manuscript describes the mapping process and real-world validation via autonomous driving trials on open-source software, but does not include specific error statistics. In the revised version, we will add a dedicated subsection (likely in Section 3 or 4) providing quantitative metrics such as mean reprojection error for mapped traffic elements (e.g., traffic lights), topological connectivity success rates, and error statistics across the dataset, along with an overview of the mapping pipeline. This will enable direct assessment of the completeness and accuracy claims relative to prior datasets.
Revision: yes
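The paper does not define a "topological connectivity success rate", so the sketch below shows only one plausible reading: the share of lane segments that can be reached from a chosen entry segment by following successor links. The graph layout, segment names, and the `connectivity_rate` function are hypothetical and are not taken from the dataset or its map format.

```python
from collections import deque


def reachable_from(successors, start):
    """Breadth-first search over successor links, returning all reachable segments."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in successors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def connectivity_rate(successors, start):
    """Fraction of all segments in the graph that are reachable from `start`."""
    all_nodes = set(successors)
    for targets in successors.values():
        all_nodes.update(targets)  # include segments that only appear as targets
    if start not in all_nodes:
        raise ValueError(f"unknown start segment: {start!r}")
    return len(reachable_from(successors, start)) / len(all_nodes)


# Hypothetical lane graph: segment "e" feeds into "d" but cannot be reached from "a".
lane_graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": [], "e": ["d"]}
print(connectivity_rate(lane_graph, "a"))
```

A rate below 1.0 from every plausible entry segment would point to the kind of missing road connection that a driving test could expose. Reporting the rate for several entry points, rather than one, would guard against a conveniently chosen start.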
Circularity Check
No circularity: descriptive dataset paper with no derivations or predictions
The paper is a dataset release announcement. Its central claims concern sensor suite completeness, HD map coverage, and the introduction of four benchmarks. No equations, fitted parameters, predictions, or derivation chains appear in the provided text. Claims of map accuracy and topological connectivity are presented as empirical outcomes of the collection process rather than results derived from prior fitted quantities or self-citations. Because no load-bearing mathematical step exists that could reduce to its own inputs, the circularity score is 0.