pith. machine review for the scientific record. sign in

arxiv: 2605.08528 · v1 · submitted 2026-05-08 · 💻 cs.MA · cs.RO

Recognition: no theorem link

SceneFactory: GPU-Accelerated Multi-Agent Driving Simulation with Physics-Based Vehicle Dynamics

Yicheng Zhu , Yang Chen , Tao Li , Zilin Bian

Authors on Pith no claims yet

Pith reviewed 2026-05-12 01:13 UTC · model grok-4.3

classification 💻 cs.MA cs.RO
keywords multi-agent simulationautonomous drivingGPU vectorizationphysics-based dynamicsvehicle simulationreinforcement learningfriction modelingrigid-body contact
0
0 comments X

The pith

GPU vectorization lets physics-based multi-agent driving simulation run 127 times faster than non-batched versions on the same hardware.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to prove that detailed rigid-body vehicle physics, tire-road contact, and condition-dependent friction can be kept while scaling to hundreds of concurrent driving worlds on one GPU. It does this by turning entire scenes, agents, controls, and observations into batched tensors so that simulation steps execute as parallel GPU operations. If the approach holds, training loops for autonomous driving policies could use realistic dynamics and road variation at throughputs previously available only with simplified kinematic models. A reader would care because this removes the usual trade-off between physical accuracy and the volume of experience needed for reinforcement learning.

Core claim

By expressing multiple independent worlds and their articulated vehicles as GPU tensors, the system runs physics steps, control application, and data collection concurrently across 256 worlds with 16 agents each, reaching 19,250 controlled-agent simulation steps per second. This produces a measured 127-fold throughput gain over the same physics solver run without vectorization. Transfer tests show policies trained under full physics reach 99.5 percent success when moved to a kinematic model, while the reverse direction falls to 47.3 percent; policies that respond to friction changes cut mean peak deceleration on wet roads from 58.7 to 27.8 m/s squared.

What carries the argument

Batched tensor representation of worlds, agents, controls, observations, and rewards that allows the physics solver to advance all scenarios simultaneously as GPU operations.

If this is right

  • Reinforcement learning policies trained with full articulated dynamics transfer successfully to simpler kinematic models.
  • The reverse transfer from kinematic to physics-based models succeeds at much lower rates.
  • Friction-aware policies measurably reduce peak deceleration demands when road conditions change.
  • Large numbers of concurrent physics-accurate worlds become practical for training without discarding contact or suspension effects.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Training loops could incorporate real road geometry and weather-dependent friction at scales that previously required simplified models.
  • The same tensor batching pattern could be tested in other multi-body domains where both parallelism and physical fidelity matter.
  • Further increases in world or agent count per GPU might be feasible if memory and kernel launch overhead remain manageable.

Load-bearing premise

The batched tensor operations on the physics solver produce vehicle dynamics, contacts, and friction responses that match those of the non-batched solver within acceptable numerical tolerance.

What would settle it

Run identical control sequences and initial conditions in both the batched vectorized mode and the standard single-world mode, then compare resulting positions, velocities, tire forces, and contact impulses for any divergence larger than floating-point error.

Figures

Figures reproduced from arXiv: 2605.08528 by Tao Li, Yang Chen, Yicheng Zhu, Zilin Bian.

Figure 1
Figure 1. Figure 1: Existing autonomous-driving simulators typically choose between scalable geometry [PITH_FULL_IMAGE:figures/full_fig_p001_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Qualitative comparison of four simulators. [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Throughput (CASPS, log scale). Blue: 16 ag/world; Orange: 1 ag/world; gray dashed: Baseline (∼152–174) [PITH_FULL_IMAGE:figures/full_fig_p008_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Effective static friction µstatic vs. water-film depth for the three road surface presets in SceneFactory, computed from the modified ALL model [8]. OGFC retains the highest grip at all film depths due to its coarser texture (θ=1.21, Td=1.08 mm) vs. SMA (θ=1.09) and AC (θ=1.00). Vertical dashed lines mark the dry (0 mm) and moderate-wet (0.5 mm) eval conditions used in Appendix K. The shaded band marks the… view at source ↗
Figure 5
Figure 5. Figure 5: Randomly loaded Scenefactory scenes with diverse road topologies. [PITH_FULL_IMAGE:figures/full_fig_p015_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Scenefactory loaded with four randomly selected Waymo Open Motion Dataset traffic [PITH_FULL_IMAGE:figures/full_fig_p016_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Vehicle model configuration used in Scenefactory. (The non-rectangular upper body is [PITH_FULL_IMAGE:figures/full_fig_p019_7.png] view at source ↗
read the original abstract

Autonomous-driving simulators typically trade physical fidelity for scalable parallelism. Physics-based platforms such as CARLA and MetaDrive provide articulated vehicle dynamics and contact, but their non-vectorized interfaces make batched training difficult. GPU-batched systems such as Waymax and GPUDrive scale to hundreds of scenarios by replacing rigid-body physics with simplified kinematic models, omitting tire--road interaction, suspension, contact dynamics, and road-condition-dependent friction. We introduce SceneFactory, a GPU-vectorized platform for procedural scene construction, physics-based multi-agent simulation, and RL in autonomous-driving environments. Built on NVIDIA Isaac Sim + Isaac Lab, SceneFactory represents worlds and agents as batched tensors: control, observations, rewards, resets, and policy inference run as GPU tensor operations over the Isaac Lab tensor API. SceneFactory converts Waymo Open Motion Dataset road topologies into simulation-ready USD worlds, runs many worlds concurrently on one GPU, populates each with multiple articulated PhysX vehicles, and maps precipitation and road-surface type to PhysX material friction coefficients. With GPU vectorization, SceneFactory achieves up to 127$\times$ higher throughput than a non-vectorized PhysX baseline on the same GPU and physics solver, reaching 19,250 controlled-agent simulation steps per second at 256 worlds $\times$ 16 agents. Cross-simulator transfer reveals an asymmetric dynamics gap: physics-grounded RL policies transfer to a simplified kinematic bicycle model with 99.5% success, whereas reverse transfer drops to 47.3%. Under wet-road friction, friction-aware policies reduce mean peak DRAC from 58.7 to 27.8,m/s$^2$ without sacrificing goal reach. SceneFactory shows that scalable autonomous-driving training need not discard articulated rigid-body dynamics or physically grounded road-condition variation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. SceneFactory is a GPU-vectorized multi-agent driving simulator built on NVIDIA Isaac Sim and Isaac Lab. It converts Waymo Open Motion Dataset road topologies into USD worlds, populates them with multiple articulated PhysX vehicles, maps road conditions to friction coefficients, and runs batched tensor operations for control, observation, reward, and reset over the Isaac Lab tensor API. The central empirical claims are a 127× throughput increase over a non-vectorized PhysX baseline (reaching 19,250 controlled-agent steps/s at 256 worlds × 16 agents), asymmetric cross-simulator policy transfer (99.5% physics-to-kinematic vs. 47.3% reverse), and reduced peak DRAC (58.7 to 27.8 m/s²) for friction-aware policies on wet roads.

Significance. If the batched PhysX dynamics remain equivalent to the single-instance baseline, the work is significant because it demonstrates that articulated rigid-body physics and road-condition variation can be retained at the scale needed for parallel RL training, rather than defaulting to kinematic approximations as in Waymax or GPUDrive. The concrete throughput numbers, the transfer asymmetry result, and the friction-policy experiment provide direct, falsifiable support for the thesis that scalable autonomous-driving training need not discard physically grounded dynamics.

major comments (2)
  1. [§5.1] §5.1 (throughput evaluation): The 127× speedup and the claim of preserving full physics-based dynamics both rest on the assumption that the GPU-batched Isaac Lab / PhysX implementation produces numerically identical trajectories and contact forces to the non-vectorized baseline. No quantitative equivalence check (state L2 error, friction-force histograms, or collision-outcome agreement under matched initial conditions and controls) is reported; batching can alter solver convergence or floating-point accumulation even with the same underlying solver.
  2. [§5.3] §5.3 (cross-simulator transfer): The reported success rates of 99.5% (physics-to-kinematic) and 47.3% (reverse) are presented without the number of evaluation episodes, random seeds, or variance; this makes it impossible to judge whether the asymmetric gap is statistically robust and therefore weakens the supporting evidence for the central claim that physics-grounded policies are meaningfully different.
minor comments (3)
  1. [Abstract] Abstract: the string '27.8,m/s$^2$' contains a formatting error (missing space and incorrect comma); it should read '27.8 m/s²'.
  2. [Abstract and §3] The abstract and §3 do not cite the Isaac Lab or PhysX documentation/papers; adding these references would clarify the exact tensor API and solver version used.
  3. [§5.1 and §5.3] Throughput and transfer tables/figures should report standard deviations or confidence intervals across random seeds to allow readers to assess variability of the 19,250 steps/s and 99.5%/47.3% figures.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments identify areas where additional evidence and statistical details will strengthen the manuscript. We address each major comment below and commit to the corresponding revisions.

read point-by-point responses
  1. Referee: [§5.1] §5.1 (throughput evaluation): The 127× speedup and the claim of preserving full physics-based dynamics both rest on the assumption that the GPU-batched Isaac Lab / PhysX implementation produces numerically identical trajectories and contact forces to the non-vectorized baseline. No quantitative equivalence check (state L2 error, friction-force histograms, or collision-outcome agreement under matched initial conditions and controls) is reported; batching can alter solver convergence or floating-point accumulation even with the same underlying solver.

    Authors: We agree that a direct quantitative verification of numerical equivalence is necessary to rigorously support the claim that the batched implementation preserves full physics fidelity. Although the underlying PhysX solver, material parameters, and integration settings are identical between the single-instance and batched versions, parallel execution could in principle affect solver convergence or floating-point accumulation. In the revised manuscript we will add an equivalence analysis in §5.1 that reports state L2 errors, contact-force histograms, and collision-outcome agreement under matched initial conditions and controls. This will be performed on a representative set of scenarios drawn from the Waymo Open Motion Dataset. revision: yes

  2. Referee: [§5.3] §5.3 (cross-simulator transfer): The reported success rates of 99.5% (physics-to-kinematic) and 47.3% (reverse) are presented without the number of evaluation episodes, random seeds, or variance; this makes it impossible to judge whether the asymmetric gap is statistically robust and therefore weakens the supporting evidence for the central claim that physics-grounded policies are meaningfully different.

    Authors: We concur that the absence of evaluation details prevents readers from assessing statistical robustness. The transfer experiments were conducted over a substantial number of episodes and multiple random seeds, yet these quantities and the associated variance were not reported. In the revised manuscript we will explicitly state the number of evaluation episodes, the number of random seeds, and the observed variance (or standard deviation) for both transfer directions, together with any confidence intervals. This information will be added to §5.3. revision: yes

Circularity Check

0 steps flagged

No circularity; all performance and transfer claims are direct empirical measurements against external baselines

full rationale

The paper is a systems contribution introducing SceneFactory, a GPU-vectorized multi-agent simulator. Its central claims—127× throughput improvement, 19,250 steps/s at 256×16 agents, asymmetric cross-simulator transfer (99.5% vs 47.3%), and DRAC reduction under wet roads—are reported as measured outcomes from running the new system against non-vectorized PhysX and kinematic bicycle baselines. No equations, fitted parameters, self-definitional loops, or load-bearing self-citations appear in the provided text. The numerical-equivalence assumption for batched PhysX is an unverified premise rather than a derivation that reduces to the paper's own inputs by construction. The work is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The platform rests on the correctness of the underlying NVIDIA Isaac Sim / PhysX engine and the fidelity of the Waymo-to-USD conversion. No new free parameters are introduced beyond standard simulation settings; no new physical entities are postulated.

axioms (2)
  • domain assumption PhysX rigid-body dynamics and contact model remain accurate when executed in batched tensor form via Isaac Lab
    Invoked when claiming that vectorization preserves physics fidelity while increasing speed.
  • domain assumption Precipitation and road-surface labels from Waymo can be mapped to PhysX friction coefficients in a way that produces realistic vehicle behavior
    Used to justify the friction-aware policy experiments.

pith-pipeline@v0.9.0 · 5633 in / 1483 out tokens · 44212 ms · 2026-05-12T01:13:48.798808+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

62 extracted references · 62 canonical work pages · 1 internal anchor

  1. [1]

    How Do Weather Events Affect Roads? - FHWA Road Weather Management

    FHWA. How Do Weather Events Affect Roads? - FHWA Road Weather Management. URL https://ops.fhwa.dot.gov/weather/roadimpact.htm

  2. [2]

    CARLA: An open urban driving simulator

    Alexey Dosovitskiy, German Ros, Felipe Codevilla, Antonio Lopez, and Vladlen Koltun. CARLA: An open urban driving simulator. InProceedings of the 1st Annual Conference on Robot Learning, pages 1–16, 2017

  3. [3]

    Metadrive: Composing diverse driving scenarios for generalizable reinforcement learning

    Quanyi Li, Zhenghao Peng, Lan Feng, Qihang Zhang, Zhenghai Xue, and Bolei Zhou. Metadrive: Composing diverse driving scenarios for generalizable reinforcement learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022

  4. [4]

    Waymax: An Accelerated, Data- Driven Simulator for Large-Scale Autonomous Driving Research

    Cole Gulino, Justin Fu, Wenjie Luo, George Tucker, Eli Bronstein, Yiren Lu, Jean Harb, Xinlei Pan, Yan Wang, Xiangyu Chen, John Co-Reyes, Rishabh Agarwal, Rebecca Roelofs, Yao Lu, Nico Montali, Paul Mougin, Zoey Yang, Brandyn White, Aleksandra Faust, Rowan McAllister, Dragomir Anguelov, and Benjamin Sapp. Waymax: An Accelerated, Data- Driven Simulator for...

  5. [5]

    GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS, February 2025

    Saman Kazemkhani, Aarav Pandya, Daphne Cornelisse, Brennan Shacklett, and Eugene Vinit- sky. GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS, February 2025. URLhttp://arxiv.org/abs/2408.01584. arXiv:2408.01584 [cs]

  6. [6]

    NVIDIA Isaac Sim and Isaac Lab

    NVIDIA Corporation. NVIDIA Isaac Sim and Isaac Lab. 2025. https://developer. nvidia.com/isaac-sim

  7. [7]

    Qi, Xuanyu Zhou, Zoey Yang, Scott Ettinger, Pei Sun, Zhaoqi Leng, Mustafa Mustafa, Ivan Bogun, Weiyue Wang, Mingxing Tan, and Dragomir Anguelov

    Kan Chen, Runzhou Ge, Hang Qiu, Rami Ai-Rfou, Charles R. Qi, Xuanyu Zhou, Zoey Yang, Scott Ettinger, Pei Sun, Zhaoqi Leng, Mustafa Mustafa, Ivan Bogun, Weiyue Wang, Mingxing Tan, and Dragomir Anguelov. WOMD-LiDAR: Raw sensor dataset benchmark for motion forecasting. InProceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2024

  8. [8]

    Tire-pavement friction modeling consider- ing pavement texture and water film.International Journal of Transportation Science and Technology, 14:99–109, June 2024

    Lanruo Zhao, Hongduo Zhao, and Juewei Cai. Tire-pavement friction modeling consider- ing pavement texture and water film.International Journal of Transportation Science and Technology, 14:99–109, June 2024. ISSN 20460430. doi: 10.1016/j.ijtst.2023.04.001. URL https://linkinghub.elsevier.com/retrieve/pii/S204604302300031X

  9. [9]

    An extensible, data-oriented architecture for high-performance, many-world simulation

    Brennan Shacklett, Luc Guy Rosenzweig, Zhiqiang Xie, Bidipta Sarkar, Andrew Szot, Erik Wijmans, Vladlen Koltun, Dhruv Batra, and Kayvon Fatahalian. An extensible, data-oriented architecture for high-performance, many-world simulation. InACM Transactions on Graphics (SIGGRAPH), 2023

  10. [10]

    SUMMIT: A Simulator for Urban Driving in Massive Mixed Traffic, March 2020

    Panpan Cai, Yiyuan Lee, Yuanfu Luo, and David Hsu. SUMMIT: A Simulator for Urban Driving in Massive Mixed Traffic, March 2020. URLhttp://arxiv.org/abs/1911.04074. arXiv:1911.04074 [cs]

  11. [11]

    SMARTS: An Open-Source Scalable Multi-Agent RL Training School for Autonomous Driving

    Ming Zhou, Jun Luo, Julian Villella, Yaodong Yang, David Rusu, Jiayu Miao, Weinan Zhang, Montgomery Alban, Iman Fadakar, Zheng Chen, Chongxi Huang, Ying Wen, Kimia Hassan- zadeh, Daniel Graves, Zhengbang Zhu, Yihan Ni, Nhat Nguyen, Mohamed Elsayed, Haitham Ammar, Alexander Cowen-Rivers, Sanjeevan Ahilan, Zheng Tian, Daniel Palenicek, Kasra Rezaee, Peyman ...

  12. [12]

    arXiv preprint arXiv:2106.11810 (2021) 3, 7

    Holger Caesar, Juraj Kabzan, Kok Seang Tan, Whye Kit Fong, Eric Wolff, Alex Lang, Luke Fletcher, Oscar Beijbom, and Sammy Omari. NuPlan: A closed-loop ML-based planning benchmark for autonomous vehicles, February 2022. URL http://arxiv.org/abs/2106. 11810. arXiv:2106.11810 [cs]

  13. [13]

    Nocturne: a scalable driving benchmark for bringing multi-agent learn- ing one step closer to the real world

    Eugene Vinitsky, Nathan Lichtlé, Xiaomeng Yang, Brandon Amos, and Jakob Fo- erster. Nocturne: a scalable driving benchmark for bringing multi-agent learn- ing one step closer to the real world. In S. Koyejo, S. Mohamed, A. Agar- wal, D. Belgrave, K. Cho, and A. Oh, editors,Advances in Neural Informa- tion Processing Systems, volume 35, pages 3962–3974. Cu...

  14. [14]

    URL https://proceedings.neurips.cc/paper_files/paper/2022/file/ 191e9e721a2748a860714fb23aaf7c5d-Paper-Datasets_and_Benchmarks.pdf

  15. [15]

    TorchDriveEnv: A Reinforcement Learning Benchmark for Autonomous Driving with Reactive, Realistic, and Diverse Non-Playable Characters, May 2024

    Jonathan Wilder Lavington, Ke Zhang, Vasileios Lioutas, Matthew Niedoba, Yunpeng Liu, Dylan Green, Saeid Naderiparizi, Xiaoxuan Liang, Setareh Dabiri, Adam ´Scibior, Berend Zwartsenberg, and Frank Wood. TorchDriveEnv: A Reinforcement Learning Benchmark for Autonomous Driving with Reactive, Realistic, and Diverse Non-Playable Characters, May 2024. URLhttp:...

  16. [16]

    Influence of ad- verse weather on drivers’ perceived risk during car following based on driving simulations

    Chen Chen, Xiaohua Zhao, Hao Liu, Guichao Ren, and Xiaoming Liu. Influence of ad- verse weather on drivers’ perceived risk during car following based on driving simulations. Journal of Modern Transportation, 27(4):282–292, December 2019. ISSN 2095-087X, 2196-

  17. [17]

    URL http://link.springer.com/10.1007/ s40534-019-00197-4

    doi: 10.1007/s40534-019-00197-4. URL http://link.springer.com/10.1007/ s40534-019-00197-4

  18. [18]

    Ahmed and Ali Ghasemzadeh

    Mohamed M. Ahmed and Ali Ghasemzadeh. The impacts of heavy rain on speed and headway Behaviors: An investigation using the SHRP2 naturalistic driving study data.Transportation Research Part C: Emerging Technologies, 91:371–384, June 2018. ISSN 0968-090X. doi: 10.1016/j.trc.2018.04.012. URL https://www.sciencedirect.com/science/article/ pii/S0968090X1830487X

  19. [19]

    Examining the effect of adverse weather on road transportation using weather and traffic sensors.PLoS ONE, 13(10):e0205409, October

    Yichuan Peng, Yuming Jiang, Jian Lu, and Yajie Zou. Examining the effect of adverse weather on road transportation using weather and traffic sensors.PLoS ONE, 13(10):e0205409, October

  20. [20]

    doi: 10.1371/journal.pone.0205409

    ISSN 1932-6203. doi: 10.1371/journal.pone.0205409. URL https://pmc.ncbi.nlm. nih.gov/articles/PMC6191113/

  21. [21]

    Zhengqing Li and Baljit Singh Bhathal Singh. Autonomous Driving in Adverse Weather: A Multi-Modal Fusion Framework with Uncertainty-Aware Learning for Robust Obstacle Detection.International Journal of Advanced Computer Science and Applications, 16(8), 2025

  22. [22]

    Weather Effects on Obstacle Detection for Autonomous Car

    Rui Song, Jon Wetherall, Simon Maskell, and Jason Ralph. Weather Effects on Obstacle Detection for Autonomous Car. volume 2, pages 331–341. SCITEPRESS, May 2020. ISBN 978-989-758-419-0. doi: 10.5220/0009354503310341. URL https://www.scitepress. org/PublishedPapers/2020/93545

  23. [23]

    Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) , year =

    Tao Li, Haozhe Lei, and Quanyan Zhu. Self-adaptive driving in nonstationary environments through conjectural online lookahead adaptation. In2023 IEEE International Conference on Robotics and Automation (ICRA), pages 7205–7211, 2023. doi: 10.1109/ICRA48891.2023. 10161368

  24. [24]

    Worsening Perception: Real-time Degradation of Autonomous Vehicle Perception Performance for Simulation of Adverse Weather Conditions, July 2021

    Ivan Fursa, Elias Fandi, Valentina Musat, Jacob Culley, Enric Gil, Izzeddin Teeti, Louise Bilous, Isaac Vander Sluis, Alexander Rast, and Andrew Bradley. Worsening Perception: Real-time Degradation of Autonomous Vehicle Perception Performance for Simulation of Adverse Weather Conditions, July 2021. URL http://arxiv.org/abs/2103.02760. arXiv:2103.02760 [cs]

  25. [25]

    Edge AI-Powered Real-Time Decision-Making for Autonomous Vehicles in Adverse Weather Conditions, March 2025

    Milad Rahmati. Edge AI-Powered Real-Time Decision-Making for Autonomous Vehicles in Adverse Weather Conditions, March 2025. URL http://arxiv.org/abs/2503.09638. arXiv:2503.09638 [cs]. 11

  26. [26]

    Tackling Snow-Induced Challenges: Safe Autonomous Lane-Keeping with Robust Reinforcement Learning, December

    Amin Jalal Aghdasian, Farzaneh Abdollahi, and Ali Kamali Iglie. Tackling Snow-Induced Challenges: Safe Autonomous Lane-Keeping with Robust Reinforcement Learning, December

  27. [27]

    arXiv:2512.12987 [cs]

    URLhttp://arxiv.org/abs/2512.12987. arXiv:2512.12987 [cs]

  28. [28]

    Daniel Freeman, Erik Frey, Anton Raichuk, Sertan Girber, Igor Mordatch, and Olivier Bachem

    C. Daniel Freeman, Erik Frey, Anton Raichuk, Sertan Girber, Igor Mordatch, and Olivier Bachem. Brax — a differentiable physics engine for large scale rigid body simulation. In NeurIPS Datasets and Benchmarks Track, 2021

  29. [29]

    Orbit: A unified simulation framework for interactive robot learning environments

    Mayank Mittal, Calvin Yu, Qinxi Yu, Jingzhou Liu, Nikita Rudin, David Hoeller, Jia Lin Yuan, Ritvik Singh, Yunrong Guo, Hammad Mazhar, et al. Orbit: A unified simulation framework for interactive robot learning environments. InIEEE Robotics and Automation Letters (RA-L), 2023

  30. [30]

    Comparison of threshold determination methods for the deceleration rate to avoid a crash (DRAC)-based crash estimation.Accident Analysis & Pre- vention, 153:106051, April 2021

    Chuanyun Fu and Tarek Sayed. Comparison of threshold determination methods for the deceleration rate to avoid a crash (DRAC)-based crash estimation.Accident Analysis & Pre- vention, 153:106051, April 2021. ISSN 0001-4575. doi: 10.1016/j.aap.2021.106051. URL https://www.sciencedirect.com/science/article/pii/S0001457521000828

  31. [31]

    Cosmos World Foundation Model Platform for Physical AI

    NVIDIA, Niket Agarwal, Arslan Ali, Maciej Bala, Yogesh Balaji, Erik Barker, Tiffany Cai, Prithvijit Chattopadhyay, Yongxin Chen, Yin Cui, Yifan Ding, Daniel Dworakowski, Jiaojiao Fan, Michele Fenzi, Francesco Ferroni, Sanja Fidler, Dieter Fox, Songwei Ge, Yunhao Ge, Jinwei Gu, Siddharth Gururani, Ethan He, Jiahui Huang, Jacob Huffman, Pooya Jannaty, Jingy...

  32. [32]

    All coordinates (road and agent) are shifted so that each scenario is locally centered at the origin

    Re-centering.A scene center is computed as the mean of all road polyline points. All coordinates (road and agent) are shifted so that each scenario is locally centered at the origin

  33. [33]

    Scenes with significant 3D structure (bridges, ramps) are filtered during curation (Appendix I)

    Z-flattening.Waymo road elevation data is collapsed to a constant ground height ( z= 0 ). Scenes with significant 3D structure (bridges, ramps) are filtered during curation (Appendix I)

  34. [34]

    Pairs separated by more than 3.0 m are treated as discontinuities and skipped

    Segment generation.Each consecutive pair of polyline points defines an oriented box segment placed at the midpoint and rotated to align with the point-to-point direction (width 0.10 m, height 0.10 m). Pairs separated by more than 3.0 m are treated as discontinuities and skipped. Both endpoints must fall within±B/2of the scene center (B=200m)

  35. [35]

    The different reconstructed Waymo Motion Open Dataset scene is illustrated in Figure 5

    USD prim creation.Segments are grouped by polyline type and rendered as a PointInstancer with a single unit-cube prototype; per-segment position, orientation, and scale arrays encode the geometry compactly. The different reconstructed Waymo Motion Open Dataset scene is illustrated in Figure 5

  36. [36]

    This flat, GPU-friendly format enables constant-time runtime access without re-parsing the JSON or traversing the USD prim hierarchy

    Metadata baking.Five arrays are written to the world root prim’s USDcustomData: segment midpoint positions, direction vectors, polyline type codes, half-lengths, and half-widths. This flat, GPU-friendly format enables constant-time runtime access without re-parsing the JSON or traversing the USD prim hierarchy. Multi-world grid assembly.SceneFactory arran...

  37. [37]

    Goal position(g b x, gb y), each divided by the scene bounding-box half-size (B/2 = 100m)

  38. [38]

    Heading error to the goal, encoded as(sin ∆ψ,cos ∆ψ)

  39. [39]

    Euclidean goal distance∥g b xy∥, divided byB/2

  40. [40]

    When the weather-to-friction module is active, the 4-dim weather context (Equation 2) is appended to the ego group, yielding an 11-dim input to the ego encoder

    Body-frame velocity(v b x, vb y), each divided by 10 m/s. When the weather-to-friction module is active, the 4-dim weather context (Equation 2) is appended to the ego group, yielding an 11-dim input to the ego encoder. Road geometry context ( Kr ×5 = 1,750 -dim).Each agent selects Kr = 350 road-segment midpoints from the pre-loaded GPU metadata tensors (§...

  41. [41]

    Relative position(∆x b,∆y b)in ego body frame, each divided byr

  42. [42]

    Polyline type code, divided by a normalizer (= 20)

  43. [43]

    Neighbor vehicles (Kv ×7 = 168 -dim).The Kv = 24 nearest alive agents (by Euclidean distance in world XY) are selected per ego agent

    Road direction unit vector(d b x, db y)in ego body frame. Neighbor vehicles (Kv ×7 = 168 -dim).The Kv = 24 nearest alive agents (by Euclidean distance in world XY) are selected per ego agent. Self-distance and dead-agent distances are set to infinity before the argsort. Each neighbor contributes seven features:

  44. [44]

    Relative position(∆x b,∆y b)in ego body frame, each divided byB/2

  45. [45]

    Chassis length and width, each divided byB/2. 16

  46. [46]

    Relative heading∆ψ, divided byπ

  47. [47]

    Speed∥v xy∥, divided by 10 m/s

  48. [48]

    Pairwise time-to-collision (TTC), computed via a swept-circle test. Each vehicle is approximated by three equal-radius circles placed along the longitudinal axis at offsets {−d,0,+d} from the chassis centre, where r= max(0.45,0.55W) and d= min WB 2 ,max(0, L 2 −0.8r) . For the default vehicle ( L=4.0 m, W=2.0 m, WB=2.6 m) this gives r≈1.10 m and d≈1.12 m....

  49. [49]

    Triggered when the maximum contact-sensor force exceeds 25 N

    Collision( −6.0). Triggered when the maximum contact-sensor force exceeds 25 N. A warm-up window of 24 steps after spawn suppresses false positives from initial settling

  50. [50]

    Crash( −10.0). Triggered when the vehicle falls below the ground plane ( zrel <0 ), drifts more than 100 m from its start position, or tilts beyond a gravity-vector threshold (gb z >−0.15 , indicating a rollover or severe pitch)

  51. [51]

    Triggered when any of the vehicle’s three bounding circles overlaps a road-edge segment (Waymo types 15–16), detected via an oriented-bounding-box overlap test

    Lane-forbidden( −20.0). Triggered when any of the vehicle’s three bounding circles overlaps a road-edge segment (Waymo types 15–16), detected via an oriented-bounding-box overlap test. After termination, the agent is teleported off-stage and all subsequent dense rewards are masked to zero. Dense shaping rewards.Six terms provide per-step signal for active agents:

  52. [52]

    Route progress(weight w= 2.0 ). The displacement between consecutive steps is projected onto the tangent of the nearest driving-lane segment (Waymo types 1–2), oriented toward the goal: rprog = clamp (pt −p t−1)· ˆtlane,−2,+2 ·w . This rewards forward progress along the route and penalizes backwards motion

  53. [53]

    Geometric lane-keeping(weight w= 0.08 ). A continuous quality score combines lateral offset and heading alignment with the nearest driving lane: q= exp −(ℓ/σ)2 · (1−w h) +w h ·max(0,cos(ψ−ψ lane)) , where ℓ is the signed lateral offset, σ= 1.75 m is the tolerance, and wh = 0.8 is the heading weight. The reward isr lane =q·w

  54. [54]

    Applied when the vehicle’s lateral offset exceeds 3.25 m or its distance to the nearest lane exceeds 6.0 m

    Off-road penalty(weight w=−0.5 ). Applied when the vehicle’s lateral offset exceeds 3.25 m or its distance to the nearest lane exceeds 6.0 m

  55. [55]

    Applied when the vehicle speed drops below 1.0 m/s, discour- aging the policy from learning to stop and wait

    Idle penalty(weight w=−0.05 ). Applied when the vehicle speed drops below 1.0 m/s, discour- aging the policy from learning to stop and wait

  56. [56]

    The reward isr ttc =−p(τ)

    Inter-vehicle TTC penalty.The minimum pairwise TTC across all neighbors is converted to a penalty: p(τ) = min αv/max(τ, τ floor), p max,v , withα v = 0.10,p max,v = 0.35, andτ floor = 0.5s. The reward isr ttc =−p(τ). 18 Table 7: Reward components used in all experiments. Sparse rewards are delivered once and trigger agent termination; dense rewards are ap...

  57. [57]

    This encourages the policy to slow down when approaching road boundaries

    Road-edge TTC penalty.The same α/τ penalty form is applied to the closest road-edge segment within 40 m ahead: αe = 0.07, pmax,e = 0.40, τfloor = 0.5 s. This encourages the policy to slow down when approaching road boundaries. Termination and time-out.An agent terminates upon any sparse event (goal, collision, crash, or lane-forbidden). If no event occurs...

  58. [58]

    Refinement: tunes all 20 parameters jointly on all rollouts with a narrowed search window (18% of the full range)

  59. [59]

    Best loss

    Brake preservation: re-tunes longitudinal parameters with a tight window (10%) to prevent regression. Each stage inherits the best configuration from the previous stage. The CEM uses a population of 24 candidates, an elite fraction of 25%, an initial standard deviation of 25% of each parameter’s range, and a minimum standard deviation of 5% to prevent pre...

  60. [60]

    Observation construction reduces to a handful of batchedtorch operations with no Python per-agent loop

    Batched joint-state tensors.The custom URDF articulated vehicle (§3.2) exposes positions, velocities, and orientations as GPU-resident tensors of shape (num_envs,num_agents, d) via Isaac Lab’sArticulation API. Observation construction reduces to a handful of batchedtorch operations with no Python per-agent loop. 23 Table 10: Per-step timing breakdown at 6...

  61. [61]

    The Baseline places all worlds in a single PhysX scene, limiting solver parallelism

    Cloned PhysX scenes.Isaac Lab’s environment cloning creates N independent PhysX solver islands, enabling the GPU broad-phase and narrow-phase to dispatch them in parallel. The Baseline places all worlds in a single PhysX scene, limiting solver parallelism

  62. [62]

    The Baseline converts observations to NumPy for Stable-Baselines3’s rollout buffer, incurring O(agents×obs_dim) bytes of device-to-host transfer every step

    GPU-resident training loop.RSL-RL reads observations, rewards, and dones as CUDA tensors directly from the environment; no CPU round-trip occurs during training. The Baseline converts observations to NumPy for Stable-Baselines3’s rollout buffer, incurring O(agents×obs_dim) bytes of device-to-host transfer every step. K Friction Awareness Ablation We compa...