IR-SIM: A Lightweight Skill-Native Simulator for Navigation, Learning, and Benchmarking

Chengyang Li; Guoliang Li; Hengshuang Zhao; Jia Pan; Qi Hao; Rui Gao; Ruihua Han; Shuai Wang; Xinyi Wang; Yupu Lu

arxiv: 2606.08729 · v1 · pith:5GUNEZRNnew · submitted 2026-06-07 · 💻 cs.RO · cs.LG

Ruihua Han, Shuai Wang, Chengyang Li, Rui Gao, Xinyi Wang, Zhe Liu, Guoliang Li, Yupu Lu, Qi Hao, Jia Pan, Hengshuang Zhao

Pith reviewed 2026-06-27 18:17 UTC · model grok-4.3

classification 💻 cs.RO cs.LG

keywords robot simulation · navigation · YAML configuration · benchmarking · robot learning · collision avoidance · LiDAR sensing · reproducible scenarios

The pith

IR-SIM defines complete robotic navigation scenarios using only YAML configuration files that specify kinematics, collision checks, LiDAR sensing, visualization, and behaviors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents IR-SIM as a simulator in which every element of a navigation scenario is captured in a single YAML file rather than custom code. This includes the robot's motion rules, geometric collision detection, sensor models, graphics output, and task-specific behavior modules. A reader would care because the approach lets scenarios be written, shared, and altered through text alone, which supports automated generation from language prompts, algorithm benchmarking, and data collection for learning. The same YAML files also serve as a bridge to higher-fidelity simulators and physical robots without additional programming.

Core claim

IR-SIM makes robotic simulation fully describable and reproducible by encoding mobile robot kinematics, geometric collision checking, LiDAR sensing, visualization, and behavior modules inside YAML configuration files, so that scenarios can be created or modified from text prompts and used directly for benchmarking navigation algorithms or generating training data.

What carries the argument

YAML configuration files that encapsulate kinematics, geometric collision checking, LiDAR sensing, visualization, and behavior modules as the single source of truth for each scenario.
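
To make the "single source of truth" idea concrete, here is a minimal Python sketch. It is not IR-SIM's code: the `scenario` dict stands in for a parsed YAML file, and its key names (`step_time`, `kinematics`, `state`) and the helpers `step_diff_drive` and `rollout` are hypothetical names chosen for illustration. It applies a standard differential-drive (unicycle) update in which every scenario parameter is read from the configuration.

```python
import math

# Hypothetical stand-in for a parsed YAML scenario; key names are assumptions.
scenario = {
    "world": {"step_time": 0.1},
    "robot": {"kinematics": "diff", "state": [0.0, 0.0, 0.0]},  # x, y, heading (rad)
}


def step_diff_drive(state, v, w, dt):
    """Advance a unicycle-model pose by one explicit Euler step."""
    x, y, theta = state
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    theta += w * dt
    return [x, y, theta]


def rollout(cfg, v, w, n_steps):
    """Roll the configured robot forward using only values read from cfg."""
    dt = cfg["world"]["step_time"]
    state = list(cfg["robot"]["state"])
    for _ in range(n_steps):
        state = step_diff_drive(state, v, w, dt)
    return state


# Drive straight ahead at 1 m/s for ten 0.1 s steps.
final_state = rollout(scenario, v=1.0, w=0.0, n_steps=10)
```

Changing the motion only requires editing the dict (or, in the real tool, the YAML file); the update rule itself still lives in code.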

If this is right

Navigation scenarios can be constructed directly from natural language descriptions via the IR-SIM agent skills.
Training data for collision-avoidance policies can be generated automatically from the defined scenarios.
Social navigation policies can be benchmarked in reproducible, text-specified environments.
Prototyped algorithms can be transferred to high-fidelity simulators and real robots without writing new interface code.
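
One way automatic data generation from a text-defined scenario could work is seeded randomization. The sketch below rests on that assumption and does not show IR-SIM's actual sampling code: the config keys (`number`, `radius_range`) and the function `sample_obstacles` are hypothetical. The point is that a fixed seed plus a fixed config reproduces the same randomized layout.

```python
import random

# Hypothetical scenario fragment; key names are assumptions for illustration.
config = {
    "world": {"width": 10.0, "height": 10.0},
    "obstacles": {"number": 3, "radius_range": [0.2, 0.5]},
}


def sample_obstacles(cfg, seed):
    """Draw circular obstacles inside the world bounds from a seeded generator."""
    rng = random.Random(seed)  # private generator, so global state is untouched
    width = cfg["world"]["width"]
    height = cfg["world"]["height"]
    spec = cfg["obstacles"]
    return [
        {
            "center": [rng.uniform(0.0, width), rng.uniform(0.0, height)],
            "radius": rng.uniform(*spec["radius_range"]),
        }
        for _ in range(spec["number"])
    ]


sample_a = sample_obstacles(config, seed=0)
sample_b = sample_obstacles(config, seed=0)  # same seed, same layout
sample_c = sample_obstacles(config, seed=1)  # different seed, different layout
```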

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The YAML-only design could lower the entry barrier for researchers who want to test navigation ideas without maintaining simulator codebases.
Because scenarios are text files, they could be version-controlled and shared in the same way as datasets or model weights.
Extending the same configuration approach to other robot skills such as manipulation would require only additional module definitions inside the YAML schema.
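
If scenarios are shared like datasets, a content fingerprint is one simple way to check that two copies describe the same scenario. The sketch below is an editorial illustration, not something the paper describes: `scenario_fingerprint` is a hypothetical helper that hashes a canonical JSON serialization of the parsed configuration, so reordering keys does not change the digest but changing any value does.

```python
import hashlib
import json


def scenario_fingerprint(cfg):
    """Hash a canonical serialization so key order does not affect the digest."""
    canonical = json.dumps(cfg, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same content, different key order.
a = {"world": {"width": 10, "height": 10}, "robot": {"goal": [9, 9, 0]}}
b = {"robot": {"goal": [9, 9, 0]}, "world": {"height": 10, "width": 10}}
# One value changed.
c = {"world": {"width": 10, "height": 10}, "robot": {"goal": [9, 8, 0]}}

same = scenario_fingerprint(a) == scenario_fingerprint(b)
changed = scenario_fingerprint(a) != scenario_fingerprint(c)
```

A fingerprint of the configuration says nothing about the simulator version that interprets it, so it would need to be paired with a version tag to support a reproducibility claim.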

Load-bearing premise

YAML configuration files alone can fully capture and reproduce the required navigation scenarios, sensing, and behaviors for benchmarking and learning without needing custom code in most practical cases.

What would settle it

A navigation task whose accurate reproduction requires sensor or behavior logic that cannot be expressed inside the existing YAML schema and therefore demands extra Python code.
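
This test can be phrased as a mechanical check: list the behavior names a scenario requests and see which ones the simulator does not already provide. The sketch below assumes a simplified config layout (`objects` with an optional `behavior.name`) and treats the two module names quoted in the rebuttal as the entire built-in set; both are assumptions made for illustration, and `queue_at_door` is an invented example name.

```python
# Names taken from the rebuttal's examples; treating them as the complete
# built-in set is an assumption made for this sketch.
BUILTIN_BEHAVIORS = {"orca", "social_navigation"}


def behaviors_needing_code(cfg):
    """List requested behavior names that the built-in set does not cover."""
    requested = {
        obj["behavior"]["name"]
        for obj in cfg.get("objects", [])
        if "behavior" in obj
    }
    return sorted(requested - BUILTIN_BEHAVIORS)


covered = {"objects": [{"behavior": {"name": "orca"}}, {"shape": "circle"}]}
uncovered = {
    "objects": [
        {"behavior": {"name": "orca"}},
        {"behavior": {"name": "queue_at_door"}},  # invented example name
    ]
}

needs_code_covered = behaviors_needing_code(covered)
needs_code_uncovered = behaviors_needing_code(uncovered)
```

A non-empty result marks exactly the kind of scenario that would count against the premise.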

Figures

Figures reproduced from arXiv: 2606.08729 by Chengyang Li, Guoliang Li, Hengshuang Zhao, Jia Pan, Qi Hao, Rui Gao, Ruihua Han, Shuai Wang, Xinyi Wang, Yupu Lu, Zhe Liu.

**Figure 2.** Skill-native construction of the pedestrian wander scenario. The pipeline contains four stages: (a) prompting and extracting scenario elements, (b) retrieving IR-SIM skill templates and API patterns, (c) assembling YAML and Python artifacts, and (d) rendering rollout snapshots and saving the animation demonstration. […] example, the pipeline takes only a few minutes from the text prompt to the animated scenari…

**Figure 3.** IR-SIM scenario families for learning and benchmarking: (a) scenario with randomized …

**Figure 4.** Bridge examples with 3D scene assets. Left: Habitat-Sim/HM3D to IR-SIM occupancy …

**Figure 5.** CARLA bridge examples. Left: IR-SIM pedestrian trajectories instantiated in CARLA.

Original abstract

Simulation plays a key role in automated robotics research supported by large language models (LLMs). However, existing simulators often require custom code or complex interfaces, creating a barrier to rapid prototyping and automated algorithm development. To this end, we propose the Intelligent Robot Simulator (IR-SIM), a lightweight skill-native navigation simulator designed for rapid scenario construction, benchmarking, and robot learning. In IR-SIM, scenarios are entirely defined by YAML configuration files that specify mobile robot kinematics, geometric collision checking, LiDAR sensing, visualization, and behavior modules. This design makes robotic simulation fully describable and reproducible, allowing scenarios to be generated and modified from text prompts through the proposed IR-SIM agent skills. The resulting scenarios can be used for automated benchmarking of navigation algorithms and for automated generation of training data for learning methods. Furthermore, IR-SIM provides bridges to high fidelity simulators and real world deployment, allowing users to validate their algorithms in more realistic settings after prototyping without extra coding. The experiments showcase the convenience and versatility of IR-SIM in multiple tasks: constructing navigation scenarios from natural language, training a collision avoidance policy, benchmarking social navigation policies, and bridging to high fidelity simulators and real world deployment. The project website is available at https://github.com/hanruihua/ir-sim.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

IR-SIM's YAML-only scenario definition is the main pitch but likely needs pre-built code modules underneath, limiting the reproducibility claim.

The paper introduces IR-SIM as a lightweight simulator where navigation scenarios get defined entirely through YAML files covering kinematics, collision checks, LiDAR, visualization, and behavior modules. This setup aims to support text-prompt generation via an IR-SIM agent and easy bridging to other simulators or real hardware.

What stands out is the skill-native design that tries to lower the barrier for LLM-assisted prototyping and benchmarking in navigation tasks. The four experiment types listed—language-based scenario building, policy training, social navigation benchmarking, and bridging—align with practical needs in the subfield and show the intended workflow.

The abstract offers no quantitative results, error metrics, or direct comparisons to existing simulators, which leaves the convenience claims untested in the provided text. The stress-test concern holds: behavior modules are probably selected and parameterized from a code base rather than fully specified in YAML alone, so sharing a YAML file would not guarantee identical reproduction without the matching implementations.

This paper targets robotics researchers who iterate on navigation algorithms and want faster scenario setup than standard tools require. It could deliver value for quick tests if the released code matches the description.

The work shows clear intent and honest positioning as a tooling contribution rather than a fundamental method. It deserves peer review to check the implementation details and any results in the full manuscript.
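
The reproducibility concern in this letter can be shown with a toy example. The sketch below is an editorial illustration and not IR-SIM's architecture: `approach_v1` and `approach_v2` are two hypothetical implementations registered under the same behavior name, standing in for two versions of a simulator's code base. The same configuration then produces different trajectories.

```python
def approach_v1(pos, goal, max_speed, dt):
    """Move toward the goal at up to max_speed, without overshooting."""
    delta = goal - pos
    step = max(-max_speed * dt, min(max_speed * dt, delta))
    return pos + step


def approach_v2(pos, goal, max_speed, dt):
    """Same interface, but speed shrinks with the remaining distance."""
    delta = goal - pos
    speed = min(max_speed, abs(delta))
    return pos + (speed * dt if delta >= 0 else -speed * dt)


# One declarative scenario: the behavior is only named and parameterized here.
config = {
    "behavior": {"name": "approach", "max_speed": 1.0},
    "start": 0.0,
    "goal": 1.0,
    "step_time": 0.5,
}


def run(cfg, registry, n_steps):
    """Look up the named behavior in the registry and roll it forward."""
    behavior = registry[cfg["behavior"]["name"]]
    pos = cfg["start"]
    for _ in range(n_steps):
        pos = behavior(pos, cfg["goal"], cfg["behavior"]["max_speed"], cfg["step_time"])
    return pos


pos_v1 = run(config, {"approach": approach_v1}, n_steps=2)
pos_v2 = run(config, {"approach": approach_v2}, n_steps=2)
```

After two steps the first implementation has reached the goal and the second has not, even though the configuration is identical. Pinning the simulator version alongside the YAML file would close this gap.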

Referee Report

2 major / 2 minor

Summary. The manuscript presents IR-SIM, a lightweight simulator for robot navigation tasks. Scenarios are claimed to be entirely defined by YAML configuration files specifying mobile robot kinematics, geometric collision checking, LiDAR sensing, visualization, and behavior modules. This enables full describability and reproducibility, allowing scenario generation and modification from text prompts via IR-SIM agent skills. The simulator supports automated benchmarking of navigation algorithms, generation of training data for learning methods, and provides bridges to high-fidelity simulators and real-world deployment without extra coding. Experiments are described for constructing navigation scenarios from natural language, training a collision avoidance policy, benchmarking social navigation policies, and bridging to other simulators and the real world.

Significance. If the YAML-based configuration approach successfully allows complex behavior modules to be fully specified and reproduced without custom code, IR-SIM could lower barriers for rapid prototyping and LLM-supported robotics research by making simulations text-describable and reproducible. The bridging capability to high-fidelity simulators and the real world is a positive feature for validation. The open-source GitHub release supports reproducibility of the tool itself.

major comments (2)

[Abstract] Abstract: The central claim that 'scenarios are entirely defined by YAML configuration files that specify ... and behavior modules' is load-bearing for the reproducibility, text-prompt generation, and no-extra-coding assertions. No evidence or verification is supplied that behavior modules (e.g., for social navigation or collision avoidance) can be exhaustively configured via declarative YAML parameters alone rather than by selecting and parameterizing pre-existing code implementations whose logic remains outside the YAML file.
[Experiments] Experiments description: The four listed experiment types (natural-language scenario construction, policy training, social navigation benchmarking, bridging) are presented without any quantitative results, success rates, error analysis, or ablation showing that the YAML approach suffices for the claimed tasks without custom code, undermining assessment of the design's practical scope.

minor comments (2)

The manuscript would benefit from including concrete example YAML snippets (perhaps in an appendix or figure) illustrating specification of a behavior module to make the 'entirely defined by YAML' claim concrete.
A feature-comparison table against existing simulators (e.g., Gazebo, Habitat, Isaac Sim) would help clarify the claimed lightweight and skill-native advantages.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive review. We address each major comment below, clarifying the design of IR-SIM and committing to revisions that strengthen the evidence for our claims.

Referee: [Abstract] Abstract: The central claim that 'scenarios are entirely defined by YAML configuration files that specify ... and behavior modules' is load-bearing for the reproducibility, text-prompt generation, and no-extra-coding assertions. No evidence or verification is supplied that behavior modules (e.g., for social navigation or collision avoidance) can be exhaustively configured via declarative YAML parameters alone rather than by selecting and parameterizing pre-existing code implementations whose logic remains outside the YAML file.

Authors: We clarify that IR-SIM behavior modules are selected and fully parameterized via declarative YAML entries: the file specifies the module identifier (e.g., 'social_navigation' or 'orca') together with all numeric and structural parameters that govern its execution. The simulator core supplies the implementation, yet the complete scenario—including which modules are active and their exact settings—is captured in the YAML without any additional user code. To supply the requested verification, the revised manuscript will include complete YAML excerpts for the social-navigation and collision-avoidance modules used in the experiments, demonstrating that the configuration is exhaustive and reproducible from the file alone. revision: yes
Referee: [Experiments] Experiments description: The four listed experiment types (natural-language scenario construction, policy training, social navigation benchmarking, bridging) are presented without any quantitative results, success rates, error analysis, or ablation showing that the YAML approach suffices for the claimed tasks without custom code, undermining assessment of the design's practical scope.

Authors: The experiments were written to illustrate the end-to-end workflows (text-to-YAML generation, data collection for learning, policy benchmarking, and cross-simulator bridging) rather than to report algorithmic performance. We acknowledge that the lack of quantitative metrics makes it difficult to judge the practical reach of the YAML-only approach. In the revision we will augment the experiments section with concrete numbers: success rates and collision counts for the trained collision-avoidance policy, standard social-navigation metrics (e.g., success rate, average time to goal) for the benchmarked policies, and timing/error statistics for the bridging examples, together with a short ablation confirming that all scenarios were created without custom code beyond the supplied YAML. revision: yes
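
The metrics promised in this response are straightforward to compute from per-episode records. The sketch below shows one possible aggregation; the `Episode` fields and the `summarize` helper are hypothetical, and the episode values are made-up placeholders, not results from the paper.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Episode:
    reached_goal: bool
    collisions: int
    time_to_goal: float  # seconds; only meaningful when reached_goal is True


def summarize(episodes):
    """Aggregate success rate, collision count, and mean time over successes."""
    successes = [e for e in episodes if e.reached_goal]
    return {
        "success_rate": len(successes) / len(episodes),
        "total_collisions": sum(e.collisions for e in episodes),
        "avg_time_to_goal": mean(e.time_to_goal for e in successes) if successes else None,
    }


# Placeholder data for illustration only.
episodes = [
    Episode(True, 0, 12.0),
    Episode(True, 1, 14.0),
    Episode(False, 2, 0.0),
    Episode(True, 0, 10.0),
]
report = summarize(episodes)
```

Averaging time to goal over successful episodes only is a design choice; a benchmark should state it explicitly, since including failures changes the number.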

Circularity Check

0 steps flagged

No circularity: software design claim with no derivations or self-referential reductions

The paper describes a simulator architecture whose central claim is that scenarios are defined via YAML files specifying kinematics, collision, sensing, visualization, and behavior modules. No equations, fitted parameters, predictions, or uniqueness theorems appear. No self-citations are invoked as load-bearing premises. The claim is an implementation assertion about the tool's interface, not a derivation that reduces to its own inputs by construction. External reproducibility depends on the released code and configs, which is outside the scope of circularity analysis. This matches the default non-circular case for software papers.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper introduces a software tool rather than a mathematical result. It relies on standard assumptions about kinematics, collision geometry, and sensor models from robotics literature. No free parameters or invented physical entities are described in the abstract.

axioms (2)

domain assumption: Standard mobile robot kinematic models and geometric collision checking are sufficient to represent navigation scenarios.
Invoked when stating that YAML files specify kinematics and collision checking.
domain assumption: YAML files can fully encode visualization, LiDAR sensing, and behavior modules without loss of fidelity for the intended use cases.
Central to the claim that scenarios are entirely defined by YAML.
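
To show what a geometric LiDAR model of this kind can look like, here is a minimal 2D ray-casting sketch against circular obstacles. It is a generic textbook construction, not IR-SIM's sensor code; the function names `ray_circle_distance` and `scan` and the parameters `n_beams` and `max_range` are hypothetical.

```python
import math


def ray_circle_distance(origin, angle, center, radius, max_range):
    """Distance along a ray to its first hit on a circle, or max_range if none."""
    dx, dy = math.cos(angle), math.sin(angle)
    fx, fy = origin[0] - center[0], origin[1] - center[1]
    # Solve |f + t*d|^2 = r^2 for t; d has unit length, so the t^2 coefficient is 1.
    b = 2.0 * (fx * dx + fy * dy)
    c = fx * fx + fy * fy - radius * radius
    disc = b * b - 4.0 * c
    if disc < 0.0:
        return max_range  # the ray's line misses the circle
    root = math.sqrt(disc)
    for t in ((-b - root) / 2.0, (-b + root) / 2.0):
        if 0.0 <= t <= max_range:
            return t  # nearest intersection in front of the sensor
    return max_range


def scan(origin, obstacles, n_beams, max_range):
    """Cast evenly spaced beams and keep the closest hit per beam."""
    angles = [2.0 * math.pi * i / n_beams for i in range(n_beams)]
    return [
        min(ray_circle_distance(origin, a, c, r, max_range) for c, r in obstacles)
        for a in angles
    ]


# One circle of radius 1 centered 3 m ahead of the sensor.
obstacles = [((3.0, 0.0), 1.0)]
ranges = scan((0.0, 0.0), obstacles, n_beams=4, max_range=10.0)
```

Only the forward beam hits the obstacle, at 2 m; the other three beams return the maximum range.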

pith-pipeline@v0.9.1-grok · 5794 in / 1130 out tokens · 20407 ms · 2026-06-27T18:17:57.077838+00:00 · methodology


work page doi:10.1007/978-3-642-19457-3 2011

[21] [21]

Helbing and P

D. Helbing and P. Moln ´ar. Social force model for pedestrian dynamics.Physical Review E, 51 (5):4282–4286, 1995. doi:10.1103/PhysRevE.51.4282

work page doi:10.1103/physreve.51.4282 1995

[22] [22]

J. K. Terry, B. Black, N. Grammel, M. Jayakumar, A. Hari, R. Sullivan, L. S. Santos, C. Dief- fendahl, C. Horsch, R. Perez-Vicente, N. Williams, Y . Lokesh, and P. Ravi. PettingZoo: Gym for multi-agent reinforcement learning. InAdvances in Neural Information Processing Sys- tems, volume 34, pages 15032–15043, 2021. URLhttps://openreview.net/forum?id= fLnsj7fpbPI

2021

[23] [23]

Raffin, A

A. Raffin, A. Hill, A. Gleave, A. Kanervisto, M. Ernestus, and N. Dormann. Stable-Baselines3: Reliable reinforcement learning implementations.Journal of Machine Learning Research, 22 (268):1–8, 2021. URLhttps://www.jmlr.org/papers/v22/20-1364.html

2021

[24] [24]

Liang, W

J. Liang, W. Huang, F. Xia, P. Xu, K. Hausman, B. Ichter, P. Florence, and A. Zeng. Code as Policies: Language model programs for embodied control. In2023 IEEE International Conference on Robotics and Automation (ICRA), pages 9493–9500. IEEE, 2023. doi:10.1109/ ICRA48891.2023.10160591

arXiv 2023

[25] [25]

Model Context Protocol: Architecture overview

Model Context Protocol Contributors. Model Context Protocol: Architecture overview. https://modelcontextprotocol.io/docs/learn/architecture, 2024. 10

2024

[26] [26]

Isaac Sim MCP: Isaac simulation mcp extension and server.https: //github.com/omni-mcp/isaac-sim-mcp, 2025

omni-mcp Contributors. Isaac Sim MCP: Isaac simulation mcp extension and server.https: //github.com/omni-mcp/isaac-sim-mcp, 2025

2025

[27] [27]

Perille, A

D. Perille, A. Truong, X. Xiao, and P. Stone. Benchmarking metric ground navigation. In 2020 IEEE International Symposium on Safety, Security, and Rescue Robotics, pages 116–

2020

[28] [28]

doi:10.1109/SSRR50563.2020.9292572

IEEE, 2020. doi:10.1109/SSRR50563.2020.9292572

work page doi:10.1109/ssrr50563.2020.9292572 2020

[29] [29]

A. Nair, F. Jiang, K. Hou, Z. Xu, S. Li, X. Xiao, and P. Stone. DynaBARN: Benchmarking metric ground navigation in dynamic environments. In2022 IEEE International Symposium on Safety, Security, and Rescue Robotics, pages 347–352. IEEE, 2022. doi:10.1109/SSRR56537. 2022.10018758

work page doi:10.1109/ssrr56537 2022

[30] [30]

Brockman, V

G. Brockman, V . Cheung, L. Pettersson, J. Schneider, J. Schulman, J. Tang, and W. Zaremba. OpenAI Gym, 2016. URLhttps://arxiv.org/abs/1606.01540

Pith/arXiv arXiv 2016

[31] [31]

Gymnasium: A Standard Interface for Reinforcement Learning Environments

M. Towers, A. Kwiatkowski, J. K. Terry, J. U. Balis, G. de Cola, T. Deleu, M. Goul ˜ao, A. Kallinteris, M. Krimmel, A. KG, R. Perez-Vicente, A. Pierr ´e, S. Schulhoff, J. J. Tai, H. Tan, and O. G. Younis. Gymnasium: A standard interface for reinforcement learning en- vironments. InAdvances in Neural Information Processing Systems, volume 38, 2025. doi: 10...

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2407.17032 2025

[32] [32]

A. Bou, M. Bettini, S. Dittert, V . Kumar, S. Sodhani, X. Yang, G. De Fabritiis, and V . Moens. TorchRL: A data-driven decision-making library for PyTorch, 2023. URLhttps://arxiv. org/abs/2306.00577

arXiv 2023

[33] [33]

R. Han, S. Chen, S. Wang, Z. Zhang, R. Gao, Q. Hao, and J. Pan. Reinforcement learned distributed multi-robot navigation with reciprocal velocity obstacle shaped rewards.IEEE Robotics and Automation Letters, 7(3):5896–5903, 2022. doi:10.1109/LRA.2022.3161699

work page doi:10.1109/lra.2022.3161699 2022

[34] [34]

Martinez-Baselga, E

D. Martinez-Baselga, E. Sebasti ´an, E. Montijano, L. Riazuelo, C. Sag ¨u´es, and L. Montano. Avocado: Adaptive optimal collision avoidance driven by opinion.IEEE Transactions on Robotics, 41:2495–2511, 2025. doi:10.1109/TRO.2025.3552350

work page doi:10.1109/tro.2025.3552350 2025

[35] [35]

D. Oh, Y . Lim, H. Jung, J. Huh, J. Lee, C. Choi, H. J. Kim, and J. Park. A survey on collision avoidance algorithms for multi-robot systems.International Journal of Control, Automation and Systems, 23(7):2019–2035, 2025. doi:10.1007/s12555-024-1104-9

work page doi:10.1007/s12555-024-1104-9 2019

[36] [36]

Gillies, C

S. Gillies, C. van der Wel, J. Van den Bossche, M. W. Taves, J. Arnott, B. C. Ward, and others. Shapely, May 2025. URLhttps://github.com/shapely/shapely. Version 2.1.1; BSD- 3-Clause license; DOI: 10.5281/zenodo.5597138

work page doi:10.5281/zenodo.5597138 2025

[37] [37]

J. D. Hunter. Matplotlib: A 2d graphics environment.Computing in Science & Engineering, 9 (3):90–95, 2007. doi:10.1109/MCSE.2007.55

work page doi:10.1109/mcse.2007.55 2007

[38] [38]

P. E. Hart, N. J. Nilsson, and B. Raphael. A formal basis for the heuristic determination of minimum cost paths.IEEE Transactions on Systems Science and Cybernetics, 4(2):100–107,

[39] [39]

doi:10.1109/TSSC.1968.300136

work page doi:10.1109/tssc.1968.300136 1968

[40] [40]

S. M. LaValle. Rapidly-exploring random trees: A new tool for path planning. Technical Report TR 98-11, Computer Science Department, Iowa State University, 1998

1998

[41] [41]

In: 2019 IEEE/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Korea (South), October 27 - November 2, 2019

M. Savva, A. Kadian, O. Maksymets, Y . Zhao, E. Wijmans, B. Jain, J. Straub, J. Liu, V . Koltun, J. Malik, D. Parikh, and D. Batra. Habitat: A platform for embodied AI research. InProceed- ings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 9339– 9347, 2019. doi:10.1109/ICCV .2019.00943. 11

work page doi:10.1109/iccv 2019

[42] [42]

S. K. Ramakrishnan, A. Gokaslan, E. Wijmans, O. Maksymets, A. Clegg, J. M. Turner, E. Un- dersander, W. Galuba, A. Westbury, A. X. Chang, M. Savva, Y . Zhao, and D. Batra. Habitat- Matterport 3D Dataset (HM3D): 1000 large-scale 3d environments for embodied AI. InThirty- fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track,

[43] [43]

URLhttps://arxiv.org/abs/2109.08238

Pith/arXiv arXiv

[44] [44]

Proximal Policy Optimization Algorithms

J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov. Proximal policy optimization algorithms.CoRR, abs/1707.06347, 2017. doi:10.48550/arXiv.1707.06347. URLhttps: //arxiv.org/abs/1707.06347

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1707.06347 2017

[45] [45]

van den Berg, M

J. van den Berg, M. Lin, and D. Manocha. Reciprocal velocity obstacles for real-time multi- agent navigation. In2008 IEEE International Conference on Robotics and Automation, pages 1928–1935. IEEE, 2008

1928

[46] [46]

Kerbl, G

B. Kerbl, G. Kopanas, T. Leimk ¨uhler, and G. Drettakis. 3D gaussian splatting for real-time radiance field rendering.ACM Transactions on Graphics, 42(4):1–14, 2023. doi:10.1145/ 3592433

2023

[47] [47]

R. Han, S. Wang, S. Wang, Z. Zhang, J. Chen, S. Lin, C. Li, C. Xu, Y . C. Eldar, Q. Hao, et al. NeuPAN: Direct point robot navigation with end-to-end model-based learning.IEEE Transactions on Robotics, 41:2804–2824, 2025. doi:10.1109/TRO.2025.3554252. 12

work page doi:10.1109/tro.2025.3554252 2025