pith. machine review for the scientific record.

arxiv: 2604.21471 · v1 · submitted 2026-04-23 · 💻 cs.RO

Recognition: unknown

Ufil: A Unified Framework for Infrastructure-based Localization

Authors on Pith no claims yet

Pith reviewed 2026-05-09 21:54 UTC · model grok-4.3

classification 💻 cs.RO
keywords infrastructure-based localization · multi-object tracking · sensor fusion · cooperative awareness messages · roadside sensors · unified framework · ROS 2 · CAV testbed

The pith

Ufil standardizes object models and tracking interfaces so researchers can fuse vehicle messages, lidar, and road sensors into one pipeline without rebuilding it for each new source.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents Ufil as a framework that separates the common parts of infrastructure localization from the sensor-specific parts. It supplies a single object model plus reusable pieces for prediction, detection, association, state update, and track management, each with clear interfaces. This design lets developers plug in different data sources and run the identical pipeline in both simulation and physical testbeds. The authors integrate three heterogeneous inputs and report lane-level accuracy with low latency in scenarios involving hundreds of vehicles. A reader would care because repeated custom stacks have slowed progress on road-user tracking for safety applications, and a shared base could let more groups test improvements faster.
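The plug-in design described here can be sketched as interface plus interchangeable implementation. The names below (ObjectState, Predictor, ConstantVelocity) are illustrative assumptions, not Ufil's published API:

```cpp
#include <cassert>

// Illustrative stand-in for the standardized object model (not Ufil's actual definition).
struct ObjectState {
    double x, y;    // position in a shared map frame [m]
    double vx, vy;  // velocity [m/s]
};

// Each pipeline step is an interface; implementations are interchangeable.
struct Predictor {
    virtual ObjectState predict(const ObjectState& s, double dt) const = 0;
    virtual ~Predictor() = default;
};

// One concrete choice: a constant-velocity motion model.
struct ConstantVelocity : Predictor {
    ObjectState predict(const ObjectState& s, double dt) const override {
        return {s.x + s.vx * dt, s.y + s.vy * dt, s.vx, s.vy};
    }
};

// The pipeline sees only the interface, so a new motion model can be
// swapped in without touching the rest of the code.
ObjectState step(const Predictor& p, const ObjectState& s, double dt) {
    return p.predict(s, dt);
}
```

The same interface-per-step pattern would apply to detection, association, state update, and track management.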

Core claim

Ufil supplies a standardized object model together with reusable multi-object tracking components that expose interfaces for prediction, detection, association, state update, and track management. These pieces allow three different data sources—ETSI ITS-G5 Cooperative Awareness Messages from vehicles, lidar detections from roadside nodes, and measurements from an in-road sensitive surface—to feed a single localization pipeline that executes without modification in the CARLA simulator and in the CPM Lab testbed. In a three-lane highway scenario the fused output reaches mean lateral position RMSEs of 0.31 m in simulation and 0.29 m on the physical platform, with mean absolute orientation errors around 2.2° and median end-to-end latencies below 100 ms.
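As a reminder of the metric behind these accuracy numbers, lateral RMSE is the square root of the mean squared lateral error over the tracked objects. A generic sketch of the metric, not code from the Ufil release:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Root-mean-square error over a set of per-object lateral errors [m].
// Generic metric sketch; assumes errors.size() > 0.
inline double rmse(const std::vector<double>& errors) {
    double sum_sq = 0.0;
    for (double e : errors) sum_sq += e * e;
    return std::sqrt(sum_sq / static_cast<double>(errors.size()));
}
```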

What carries the argument

The standardized object model and reusable multi-object tracking components with defined interfaces for each processing step.

If this is right

  • Any single component such as a new association method can be swapped in while the rest of the pipeline stays intact.
  • The same code base produces comparable accuracy and latency whether run with hundreds of simulated vehicles or dozens in a physical lab.
  • Fusing the three chosen modalities already yields lateral RMSE below 0.32 m and orientation error near 2.2 degrees.
  • End-to-end latency from each modality to the fused state estimate stays under 100 ms in both environments.
  • Open reference implementations and examples make it possible to test new sensors or algorithms without starting from scratch.
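The first bullet, component swapping, can be made concrete with one candidate association step: a gated greedy nearest-neighbor associator. The function name, gating scheme, and data layout are our illustrative assumptions, not the paper's reference implementation:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point { double x, y; };  // simplified 2-D positions [m]

// Greedy nearest-neighbor association within a gate radius.
// Returns, for each detection, the index of the matched track, or -1 if
// no unused track lies within the gate. An interchangeable pipeline step.
std::vector<int> associate(const std::vector<Point>& dets,
                           const std::vector<Point>& tracks,
                           double gate) {
    std::vector<int> match(dets.size(), -1);
    std::vector<bool> used(tracks.size(), false);
    for (std::size_t d = 0; d < dets.size(); ++d) {
        double best = gate;
        for (std::size_t t = 0; t < tracks.size(); ++t) {
            if (used[t]) continue;
            double dist = std::hypot(dets[d].x - tracks[t].x,
                                     dets[d].y - tracks[t].y);
            if (dist < best) { best = dist; match[d] = static_cast<int>(t); }
        }
        if (match[d] >= 0) used[static_cast<std::size_t>(match[d])] = true;
    }
    return match;
}
```

Replacing this with, say, a global-nearest-neighbor or JPDA variant would leave the surrounding prediction and update steps untouched, which is exactly the swap the framework's interfaces are meant to permit.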

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same interface design could support additional roadside sensors such as radar or additional ITS message types with little extra work.
  • If the object model proves general enough, it might serve as a common exchange format for other infrastructure projects beyond this framework.
  • The scale-independent execution suggests the approach could move from small testbeds to larger intersections or corridors without code changes.
  • Community contributions to individual tracking steps could accumulate into a shared library of interchangeable components.

Load-bearing premise

That one object model and set of tracking interfaces can represent and combine states from very different sensor types without needing substantial per-application changes.
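This premise amounts to one struct that very different inputs map onto. A minimal sketch, assuming inputs arrive already referenced to a shared map frame; field names, units, and the nominal CAM variance are ours, not the paper's:

```cpp
#include <cassert>
#include <cmath>

constexpr double kPi = 3.14159265358979323846;

// Illustrative shared object model (not Ufil's actual definition).
struct Object {
    double x, y;      // position in a common map frame [m]
    double heading;   // orientation [rad]
    double pos_var;   // position variance [m^2], source-dependent
};

// CAM-like input: heading arrives in degrees (assumed map-referenced).
struct CamMessage { double x_m, y_m, heading_deg; };

// Lidar detection: pose in radians plus a sensor noise estimate.
struct LidarDetection { double cx, cy, yaw_rad, noise_var; };

Object fromCam(const CamMessage& m) {
    // 1.0 m^2 is a placeholder nominal variance, not a measured value.
    return {m.x_m, m.y_m, m.heading_deg * kPi / 180.0, 1.0};
}
Object fromLidar(const LidarDetection& d) {
    return {d.cx, d.cy, d.yaw_rad, d.noise_var};
}
```

If a fourth source (e.g. camera detections) cannot be normalized this cleanly, the premise fails.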

What would settle it

Add a fourth sensor type, such as camera detections, to the pipeline and measure whether the existing interfaces still accept the data, maintain sub-100 ms median latency, and preserve lane-level accuracy without rewriting core components.
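The latency half of that test hinges on a median over per-message end-to-end times. A sketch of the check, with placeholder values rather than measurements:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Median over per-message end-to-end latencies [ms]; assumes v is non-empty.
inline double median_ms(std::vector<double> v) {
    std::sort(v.begin(), v.end());
    const std::size_t n = v.size();
    return (n % 2 == 1) ? v[n / 2] : 0.5 * (v[n / 2 - 1] + v[n / 2]);
}

// True when a modality meets the latency budget (100 ms by default,
// matching the paper's reported criterion).
inline bool meets_budget(const std::vector<double>& latencies,
                         double budget_ms = 100.0) {
    return median_ms(latencies) < budget_ms;
}
```

In practice the latencies would come from timestamps recorded at sensor ingest and at fused-output publication for each message of the new modality.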

Figures

Figures reproduced from arXiv: 2604.21471 by Bassam Alrifaee, Lucas Hegerath, Marius Molz, Massimo Marcon, Simon Schäfer.

Figure 1
Figure 1: Object definition in a bird’s eye view. view at source ↗
Figure 2
Figure 2: Runtimes of the four solvers implemented in Ufil. … view at source ↗
Figure 3
Figure 3: Results of the simple simulation crossing example for different … view at source ↗
Figure 4
Figure 4: Testing scenario: a three-lane highway segment with SSL, lidar OSN, … view at source ↗
Figure 5
Figure 5: Summary of state accuracy. For each domain, we plot the error for … view at source ↗
read the original abstract

Infrastructure-based localization enhances road safety and traffic management by providing state estimates of road users. Development is hindered by fragmented, application-specific stacks that tightly couple perception, tracking, and middleware. We introduce Ufil, a Unified Framework for Infrastructure-Based Localization with a standardized object model and reusable multi-object tracking components. Ufil offers interfaces and reference implementations for prediction, detection, association, state update, and track management, allowing researchers to improve components without reimplementing the pipeline. Ufil is open-source C++/ROS 2 software with documentation and executable examples. We demonstrate Ufil by integrating three heterogeneous data sources into a single localization pipeline combining (i) vehicle onboard units broadcasting ETSI ITS-G5 Cooperative Awareness Messages, (ii) a lidar-based roadside sensor node, and (iii) an in-road sensitive surface layer. The pipeline runs unchanged in the CARLA simulator and a small-scale CAV testbed, demonstrating Ufil's scale-independent execution model. In a three-lane highway scenario with 423 and 355 vehicles in simulation and testbed, respectively, the fused system achieves lane-level lateral accuracy with mean lateral position RMSEs of 0.31 m in CARLA and 0.29 m in the CPM Lab, and mean absolute orientation errors around 2.2{\deg}. Median end-to-end latencies from sensing to fused output remain below 100 ms across all modalities in both environments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The manuscript introduces Ufil, a unified open-source C++/ROS 2 framework for infrastructure-based localization. It defines a standardized object model together with reusable multi-object tracking components and reference implementations for prediction, detection, association, state update, and track management. The framework is demonstrated by fusing three heterogeneous sources—ETSI ITS-G5 CAM messages, lidar roadside units, and in-road sensitive surfaces—into a single pipeline that runs unchanged in the CARLA simulator and the CPM Lab testbed. In a three-lane highway scenario the fused system reports mean lateral position RMSEs of 0.31 m (CARLA) and 0.29 m (CPM Lab), mean absolute orientation errors of approximately 2.2°, and median end-to-end latencies below 100 ms across modalities.

Significance. If the reusability and cross-platform claims hold, Ufil could reduce redundant engineering effort in infrastructure localization research by supplying a common, modular pipeline that accommodates heterogeneous sensors and executes identically in simulation and hardware. The open-source release with documentation and examples, together with concrete accuracy and latency numbers obtained in two distinct environments, provides a verifiable starting point for further component-level improvements.

major comments (2)
  1. [Evaluation] Evaluation section: the reported RMSE and latency figures are given as single mean values without accompanying standard deviations, number of independent runs, or statistical tests. This weakens the robustness claim for lane-level accuracy and sub-100 ms performance, especially given the modest vehicle counts (423 in simulation, 355 in testbed).
  2. [Framework and Evaluation] Framework description and evaluation: while the paper asserts that the standardized object model and reusable components enable integration “without reimplementing the pipeline,” no quantitative evidence (e.g., lines of code changed or time required) is supplied for adding a fourth sensor modality or porting to a new application. This leaves the weakest assumption untested.
minor comments (3)
  1. [Abstract and Introduction] Abstract and §1: several acronyms (CAM, CPM Lab, ETSI ITS-G5) appear before their first definitions; add a short acronym table or inline expansions on first use.
  2. [Implementation] The manuscript would benefit from a short table comparing the three input modalities (message format, update rate, typical noise characteristics) to clarify how the standardized object model normalizes them.
  3. [Evaluation] Figure captions and axis labels in the results plots should explicitly state the number of vehicles or time windows used for each RMSE and latency statistic.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the positive evaluation and recommendation for minor revision. We address the major comments point by point below.

read point-by-point responses
  1. Referee: [Evaluation] Evaluation section: the reported RMSE and latency figures are given as single mean values without accompanying standard deviations, number of independent runs, or statistical tests. This weakens the robustness claim for lane-level accuracy and sub-100 ms performance, especially given the modest vehicle counts (423 in simulation, 355 in testbed).

    Authors: We appreciate this observation. The reported mean RMSE and latency values are computed over all vehicles in the single execution of the three-lane highway scenario in each environment (423 vehicles in CARLA, 355 in CPM Lab). We did not perform multiple independent runs or statistical tests, as the evaluation focus was on demonstrating cross-platform operation of the unified pipeline rather than statistical properties of the tracking algorithm. In the revised manuscript we will explicitly note the single-run nature of the experiments, restate the vehicle counts, and qualify the robustness claims to reflect that these results establish feasibility across simulation and hardware. revision: yes

  2. Referee: [Framework and Evaluation] Framework description and evaluation: while the paper asserts that the standardized object model and reusable components enable integration “without reimplementing the pipeline,” no quantitative evidence (e.g., lines of code changed or time required) is supplied for adding a fourth sensor modality or porting to a new application. This leaves the weakest assumption untested.

    Authors: We agree that quantitative integration-effort metrics would strengthen the reusability argument. The current evaluation shows successful fusion of three heterogeneous sources (CAM, lidar, sensitive surfaces) using the standardized interfaces without altering the core pipeline. We did not, however, conduct a fourth-modality experiment or record lines-of-code or time measurements. In revision we will expand the discussion of the object model and component interfaces to illustrate extensibility using the existing three-modality case as qualitative support. We cannot supply the requested quantitative figures because such measurements were not collected. revision: partial

standing simulated objections not resolved
  • Quantitative evidence (lines of code changed or developer time) for adding a fourth sensor modality or porting to a new application, as no such experiment was performed.

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The paper describes an open-source software framework (Ufil) for integrating heterogeneous infrastructure sensors into a unified multi-object tracking pipeline. Its core claims rest on empirical results from CARLA simulation and CPM Lab testbed runs, with reported lateral RMSEs (0.31 m / 0.29 m) and sub-100 ms latencies. No equations, parameter fits, or predictions appear in the provided text; the work supplies reference implementations and an open-source release for independent verification. No self-citations, ansatzes, or renamings reduce the central claims to inputs by construction. This is a standard engineering demonstration paper with no load-bearing circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central contribution is the software framework itself rather than new physical principles or fitted parameters; relies on established assumptions in sensor fusion and middleware.

axioms (1)
  • domain assumption Standard robotics middleware like ROS 2 can reliably handle real-time data fusion from multiple sensors
    The framework is implemented in ROS 2 and assumes its communication and timing capabilities support the localization pipeline.
invented entities (1)
  • Standardized object model no independent evidence
    purpose: To enable reusable multi-object tracking components across different applications
    The paper introduces this model as part of the unified framework to decouple perception, tracking, and middleware components.

pith-pipeline@v0.9.0 · 5562 in / 1535 out tokens · 53370 ms · 2026-05-09T21:54:58.513882+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

29 extracted references · 4 canonical work pages

  1. [1]

    Autoware on board: Enabling autonomous vehicles with embedded systems,

    S. Kato, S. Tokunaga, Y. Maruyama, S. Maeda, M. Hirabayashi, Y. Kitsukawa, A. Monrroy, T. Ando, Y. Fujii, and T. Azumi, “Autoware on board: Enabling autonomous vehicles with embedded systems,” in Proceedings of the 9th ACM/IEEE International Conference on Cyber-Physical Systems (ICCPS), 2018, pp. 287–296

  2. [2]

    Edgar: An autonomous driving research platform – from feature development to real-world application,

    P. Karle, T. Betz, M. Bosk, F. Fent, N. Gehrke, M. Geisslinger, L. Gressenbuch, P. Hafemann, S. Huber, M. Hübner, S. Huch, G. Kaljavesi, T. Kerbl, D. Kulmer, T. Mascetta, S. Maierhofer, F. Pfab, F. Rezabek, E. Rivera, S. Sagmeister, L. Seidlitz, F. Sauerbeck, I. Tahiraj, R. Trauth, N. Uhlemann, G. Würsching, B. Zarrouki, M. Althoff, J. Betz, K. Bengle...

  3. [3]

    Automated driving toolbox,

    The MathWorks, Inc., “Automated driving toolbox,” Natick, Massachusetts, United States, 2025, version: 2025a. [Online]. Available: https://www.mathworks.com/help/driving/index.html

  4. [4]

    A survey on small-scale testbeds for connected and automated vehicles and robot swarms: A guide for creating a new testbed,

    A. Mokhtarian, J. Xu, P. Scheffe, M. Kloock, S. Schäfer, H. Bang, V.-A. Le, S. Ulhas, J. Betz, S. Wilson, S. Berman, L. Paull, A. Prorok, and B. Alrifaee, “A survey on small-scale testbeds for connected and automated vehicles and robot swarms: A guide for creating a new testbed,” IEEE Robotics & Automation Magazine, 2024

  5. [5]

    From small-scale to full-scale: Assessing the potential for transferability of experimental results in small-scale cav testbeds,

    S. Schäfer and B. Alrifaee, “From small-scale to full-scale: Assessing the potential for transferability of experimental results in small-scale cav testbeds,” in IEEE International Conference on Vehicular Electronics and Safety (ICVES), 2024

  6. [6]

    Choose your simulator wisely: A review on open-source simulators for autonomous driving,

    Y. Li, W. Yuan, S. Zhang, W. Yan, Q. Shen, C. Wang, and M. Yang, “Choose your simulator wisely: A review on open-source simulators for autonomous driving,” IEEE Transactions on Intelligent Vehicles, vol. 9, no. 5, pp. 4861–4876, 2024

  7. [7]

    CARLOS: An Open, Modular, and Scalable Simulation Framework for the Development and Testing of Software for C-ITS,

    C. Geller, B. Haas, A. Kloeker, J. Hermens, B. Lampe, T. Beemelmanns, and L. Eckstein, “CARLOS: An Open, Modular, and Scalable Simulation Framework for the Development and Testing of Software for C-ITS,” in IEEE Intelligent Vehicles Symposium (IV), 2024, pp. 3100–3106

  8. [8]

    A9-dataset: Multi-sensor infrastructure-based dataset for mobility research,

    C. Creß, W. Zimmer, L. Strand, M. Fortkord, S. Dai, V. Lakshminarasimhan, and A. Knoll, “A9-dataset: Multi-sensor infrastructure-based dataset for mobility research,” in 2022 IEEE Intelligent Vehicles Symposium (IV), 2022, pp. 965–970

  9. [9]

    Lumpi: The leibniz university multi-perspective intersection dataset,

    S. Busch, C. Koetsier, J. Axmann, and C. Brenner, “Lumpi: The leibniz university multi-perspective intersection dataset,” in 2022 IEEE Intelligent Vehicles Symposium (IV). IEEE, 2022, pp. 1127–1134

  10. [10]

    Urbaning-v2x: A large-scale multi-vehicle, multi-infrastructure dataset across multiple intersections for cooperative perception,

    K. C. Sekaran, M. Geisler, D. Rößle, A. Mohan, D. Cremers, W. Utschick, M. Botsch, W. Huber, and T. Schön, “Urbaning-v2x: A large-scale multi-vehicle, multi-infrastructure dataset across multiple intersections for cooperative perception,” 2025. [Online]. Available: https://arxiv.org/abs/2510.23478

  11. [11]

    Collaboration helps camera overtake lidar in 3d detection,

    Y. Hu, Y. Lu, R. Xu, W. Xie, S. Chen, and Y. Wang, “Collaboration helps camera overtake lidar in 3d detection,” in The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023

  12. [12]

    Dair-v2x: A large-scale dataset for vehicle-infrastructure cooperative 3d object detection,

    H. Yu, Y. Luo, M. Shu, Y. Huo, Z. Yang, Y. Shi, Z. Guo, H. Li, X. Hu, J. Yuan, and Z. Nie, “Dair-v2x: A large-scale dataset for vehicle-infrastructure cooperative 3d object detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 21361–21370

  13. [13]

    Semi-automatic annotation of 3d radar and camera for smart infrastructure-based perception,

    S. Agrawal, S. Bhanderi, and G. Elger, “Semi-automatic annotation of 3d radar and camera for smart infrastructure-based perception,” IEEE Access, vol. 12, pp. 34325–34341, 2024

  14. [14]

    Coopscenes: Multi-scene infrastructure and vehicle data for advancing collective perception in autonomous driving,

    M. Vosshans, A. Baumann, M. Drueppel, O. Ait-Aider, Y. Mezouar, T. Dang, and M. Enzweiler, “Coopscenes: Multi-scene infrastructure and vehicle data for advancing collective perception in autonomous driving,” in 2025 IEEE Intelligent Vehicles Symposium (IV), 2025, pp. 1040–1047

  15. [15]

    2.5d object detection for intelligent roadside infrastructure,

    N. Polley, Y. Boualili, F. Mütsch, M. Zipfl, T. Fleck, and J. M. Zöllner, “2.5d object detection for intelligent roadside infrastructure,” 2025. [Online]. Available: https://arxiv.org/abs/2507.03564

  16. [16]

    Set-theoretic localization for mobile robots with infrastructure-based sensing,

    X. Li, Y. Li, N. Li, A. Girard, and I. Kolmanovsky, “Set-theoretic localization for mobile robots with infrastructure-based sensing,” Advanced Control for Applications: Engineering and Industrial Systems, vol. 5, no. 1, p. e117, 2023

  17. [17]

    An extensible framework for open heterogeneous collaborative perception,

    Y. Lu, Y. Hu, Y. Zhong, D. Wang, S. Chen, and Y. Wang, “An extensible framework for open heterogeneous collaborative perception,” in The Twelfth International Conference on Learning Representations, 2024

  18. [18]

    Comparison and evaluation of advanced motion models for vehicle tracking,

    R. Schubert, E. Richter, and G. Wanielik, “Comparison and evaluation of advanced motion models for vehicle tracking,” in 2008 11th International Conference on Information Fusion, 2008, pp. 1–6

  19. [19]

    Estimation with applications to tracking and navigation: theory algorithms and software,

    Y. Bar-Shalom, X. R. Li, and T. Kirubarajan, Estimation with applications to tracking and navigation: theory algorithms and software. John Wiley & Sons, 2002

  20. [20]

    Filtering for stochastic processes with applications to guidance,

    R. S. Bucy and P. D. Joseph, Filtering for stochastic processes with applications to guidance. Chelsea Pub. Co, 1987

  21. [21]

    Object-level fusion for surround environment perception in automated driving applications,

    M. Aeberhard, Object-level fusion for surround environment perception in automated driving applications. VDI Verlag, 2017

  22. [22]

    Lidar-based tracking of traffic participants with sensor nodes in existing urban infrastructure,

    S. Schäfer, B. Alrifaee, and E. Hashemi, “Lidar-based tracking of traffic participants with sensor nodes in existing urban infrastructure.” [Online]. Available: https://arxiv.org/abs/2509.20009

  24. [24]

    V2aix: A multi-modal real-world dataset of etsi its v2x messages in public road traffic,

    G. Kueppers, J.-P. Busch, L. Reiher, and L. Eckstein, “V2aix: A multi-modal real-world dataset of etsi its v2x messages in public road traffic,” in IEEE International Conference on Intelligent Transportation Systems (ITSC), 2024, pp. 392–398

  25. [25]

    Investigating a pressure sensitive surface layer for vehicle localization,

    S. Schäfer, H. Steidl, S. Kowalewski, and B. Alrifaee, “Investigating a pressure sensitive surface layer for vehicle localization,” in IEEE Intelligent Vehicles Symposium (IV), 2023

  26. [26]

    CARLA: An open urban driving simulator,

    A. Dosovitskiy, G. Ros, F. Codevilla, A. Lopez, and V. Koltun, “CARLA: An open urban driving simulator,” in Proceedings of the 1st Annual Conference on Robot Learning, ser. Proceedings of Machine Learning Research, S. Levine, V. Vanhoucke, and K. Goldberg, Eds., vol. 78. PMLR, 11 2017, pp. 1–16

  27. [27]

    Cyber-physical mobility lab: An open-source platform for networked and autonomous vehicles,

    M. Kloock, P. Scheffe, J. Maczijewski, A. Kampmann, A. Mokhtarian, S. Kowalewski, and B. Alrifaee, “Cyber-physical mobility lab: An open-source platform for networked and autonomous vehicles,” in EUCA European Control Conference (ECC), 2021, pp. 1937–1944

  28. [28]

    Networked and autonomous model-scale vehicles for experiments in research and education,

    P. Scheffe, J. Maczijewski, M. Kloock, A. Kampmann, A. Derks, S. Kowalewski, and B. Alrifaee, “Networked and autonomous model-scale vehicles for experiments in research and education,” IFAC-PapersOnLine, vol. 53, no. 2, pp. 17332–17337, 2020

  29. [29]

    Vision-based real-time indoor positioning system for multiple vehicles,

    M. Kloock, P. Scheffe, I. Tülleners, J. Maczijewski, S. Kowalewski, and B. Alrifaee, “Vision-based real-time indoor positioning system for multiple vehicles,” IFAC-PapersOnLine, vol. 53, no. 2, pp. 15446–15453, 2020