pith. machine review for the scientific record.

arxiv: 2604.06972 · v1 · submitted 2026-04-08 · 💻 cs.RO · cs.MA


Differentiable Environment-Trajectory Co-Optimization for Safe Multi-Agent Navigation


Pith reviewed 2026-05-10 18:48 UTC · model grok-4.3

classification: cs.RO · cs.MA
keywords: multi-agent navigation · environment optimization · bi-level optimization · differentiable optimization · safety metric · trajectory planning · KKT conditions

The pith

Jointly optimizing environment configurations and agent trajectories enables safer multi-agent navigation in constrained spaces.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper treats the layout of a shared space as a decision variable that is optimized alongside the paths agents take. It poses a bi-level optimization problem: the lower level finds the lowest-cost trajectories for the agents given a layout, and the upper level adjusts the layout to make those trajectories safer. The two levels are coupled differentiably. Gradients of the optimal trajectories with respect to the layout parameters are obtained from the KKT conditions and the Implicit Function Theorem, so the layout can be updated by gradient ascent. The resulting environments guide agents away from risky interactions in simulated scenarios ranging from warehouse logistics to urban transportation. This matters because a fixed environment can force agents into risky maneuvers that an adjustable one avoids.

Core claim

The central contribution is a differentiable co-optimization framework for environments and trajectories in multi-agent navigation. The lower-level problem optimizes agent trajectories to minimize navigation cost and is solved with interior-point methods, while the upper-level problem optimizes environment parameters to maximize a novel measure-theoretic safety metric and is solved with gradient ascent. The gradients that couple the two levels are obtained analytically by applying the KKT conditions and the Implicit Function Theorem. This allows the system to find environment configurations that provide navigation guidance, improving safety and efficiency in scenarios from warehouse logistics to urban transportation.
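The loop described above can be sketched on a one-dimensional toy problem. Everything in this sketch is an assumption made for illustration and is not taken from the paper: a single waypoint `x`, an obstacle edge `theta`, a fixed log-barrier weight `MU` standing in for a full interior-point solver, and a stand-in upper-level objective (clearance minus a relocation penalty) in place of the paper's measure-theoretic safety metric.

```python
MU = 0.1       # fixed barrier weight (assumed; a real interior-point solver drives this to 0)
GOAL = 0.0     # waypoint the agent would choose with no obstacle (assumed)
THETA0 = -0.5  # nominal position of the obstacle edge (assumed)
W = 1.0        # penalty weight for moving the obstacle away from THETA0 (assumed)


def solve_lower(theta, iters=100, tol=1e-12):
    """Lower level: minimize 0.5*(x - GOAL)^2 - MU*log(x - theta) over x > theta.

    Newton's method on the stationarity condition
    F(x, theta) = (x - GOAL) - MU / (x - theta) = 0.
    """
    x = theta + 1.0  # strictly feasible start
    for _ in range(iters):
        F = (x - GOAL) - MU / (x - theta)
        if abs(F) < tol:
            break
        dF_dx = 1.0 + MU / (x - theta) ** 2
        step = F / dF_dx
        while x - step <= theta:  # halve the step to stay strictly feasible
            step *= 0.5
        x -= step
    return x


def dx_dtheta(x, theta):
    """Implicit Function Theorem: dx*/dtheta = -(dF/dx)^(-1) * dF/dtheta."""
    dF_dx = 1.0 + MU / (x - theta) ** 2
    dF_dtheta = -MU / (x - theta) ** 2
    return -dF_dtheta / dF_dx


def upper_objective(theta):
    """Stand-in safety score: clearance minus a quadratic relocation penalty."""
    x = solve_lower(theta)
    return (x - theta) - 0.5 * W * (theta - THETA0) ** 2


def upper_gradient(theta):
    """Total derivative of the stand-in score, using the implicit gradient."""
    x = solve_lower(theta)
    return (dx_dtheta(x, theta) - 1.0) - W * (theta - THETA0)


def co_optimize(theta=THETA0, lr=0.2, steps=200):
    """Upper level: gradient ascent on theta, re-solving the lower level each step."""
    for _ in range(steps):
        theta += lr * upper_gradient(theta)
    return theta, solve_lower(theta)
```

Calling `co_optimize()` moves the obstacle edge away from the agent's preferred waypoint until the marginal gain in clearance balances the relocation penalty. The point of the sketch is the structure: the upper level never differentiates through the Newton iterations, only through the optimality condition at the converged solution.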

What carries the argument

A bi-level optimization structure: the lower-level trajectory optimization is solved with interior-point methods and the upper-level environment optimization with gradient ascent. The two levels are coupled differentiably via the KKT conditions and the Implicit Function Theorem, with a measure-theoretic safety metric as the upper-level objective.
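In generic notation, the structure can be written out as follows. The symbols are assumptions chosen for this sketch rather than the paper's own notation: $\theta$ for environment parameters, $x$ for the stacked agent trajectories, $f$ for the navigation cost, $g$ and $h$ for inequality and equality constraints, $S$ for the safety metric, and $\alpha$ for the ascent step size. The Implicit Function Theorem step is valid only where the KKT Jacobian $\partial F / \partial z$ is invertible.

```latex
\[
\begin{aligned}
\text{upper level:}\quad & \max_{\theta}\; S\big(x^{*}(\theta), \theta\big) \\
\text{lower level:}\quad & x^{*}(\theta) \in \arg\min_{x}\; f(x, \theta)
  \quad \text{s.t.}\quad g(x, \theta) \le 0,\;\; h(x, \theta) = 0 \\[4pt]
\text{KKT residual:}\quad & F(z, \theta) =
  \begin{bmatrix}
    \nabla_{x} f(x, \theta)
      + \big(\tfrac{\partial g}{\partial x}\big)^{\top} \lambda
      + \big(\tfrac{\partial h}{\partial x}\big)^{\top} \nu \\
    \operatorname{diag}(\lambda)\, g(x, \theta) \\
    h(x, \theta)
  \end{bmatrix} = 0,
  \qquad z = (x, \lambda, \nu) \\[4pt]
\text{implicit gradient:}\quad & \frac{\partial z^{*}}{\partial \theta}
  = -\left(\frac{\partial F}{\partial z}\right)^{-1} \frac{\partial F}{\partial \theta} \\[4pt]
\text{chain rule:}\quad & \frac{\mathrm{d} S}{\mathrm{d} \theta}
  = \frac{\partial S}{\partial \theta}
  + \frac{\partial S}{\partial x}\, \frac{\partial x^{*}}{\partial \theta} \\[4pt]
\text{ascent step:}\quad & \theta \leftarrow \theta + \alpha\, \frac{\mathrm{d} S}{\mathrm{d} \theta}
\end{aligned}
\]
```

Here $\partial x^{*} / \partial \theta$ is the block of $\partial z^{*} / \partial \theta$ that corresponds to the trajectory variables.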

Load-bearing premise

Environment configurations can be modeled as continuous variables with reliable gradients computed through optimization conditions, and the safety metric based on measure theory truly captures collision risks in practice.

What would settle it

Running the method on a physical multi-robot system in a warehouse and measuring whether actual collision rates decrease compared to fixed-environment baselines under identical agent control policies.

Figures

Figures reproduced from arXiv: 2604.06972 by Amanda Prorok, Gabriele Fadini, Stelian Coros, Zhan Gao.

Figure 1. Environment configurations can impact the safety and …
Figure 2. Agent $A_i$ starts from the initial position $s_i$ and moves towards the goal position $g_i$ by following its trajectory $x_i$, and avoids collisions with obstacle regions $\{\Delta_j\}_{j=1}^{3}$ in the environment $E$.
Figure 4. Safety metric as a function of environment parameters in …
Figure 6. The six scenarios considered in our experiments.
Figure 7. (a) Optimized environment of our method that creates obstacle-free trajectories and prioritizes / de-conflicts agents for safe navigation. (b) Standard environment with a regular shelf layout. (c) Random environment with randomly generated shelf positions.
Figure 8. (a) Optimized environment that guides agents to move clockwise towards their goal positions. (b) Empty environment with no roundabout that provides no navigation guidance for agents, although imposing no obstacle hindrance. (c) Baseline environment with a large roundabout that blocks efficient pathways, although providing navigation guidance.
Figure 9. (a) Optimized environment that creates an asymmetric structure to prioritize / de-conflict agents to pass through the narrow passage. Blue dashed boundaries mirror the lower passage boundaries to demonstrate the asymmetry. (b) Baseline environment with narrow passage angles. (c) Baseline environment with wide passage angles. Both baselines are designed with human intuition and commonly used in practice, wh…
Figure 11. (a) Optimized environment that guides agents to pass through the intersection smoothly towards their goal positions. (b) Baseline environment with an intersection angle π/4. (c) Baseline environment with an intersection angle 3π/4.
Figure 13. (a) Optimized environment that guides agents to move smoothly along the track. (b) Baseline environment.
Figure 14. (a) Optimized environment with unicycle dynamics in the scenario of Narrow Passage. (b) Optimized environment with unicycle dynamics in the scenario of Road Intersection.
Figure 15. Optimized environment w.r.t. random navigation tasks, …
Original abstract

The environment plays a critical role in multi-agent navigation by imposing spatial constraints, rules, and limitations that agents must navigate around. Traditional approaches treat the environment as fixed, without exploring its impact on agents' performance. This work considers environment configurations as decision variables, alongside agent actions, to jointly achieve safe navigation. We formulate a bi-level problem, where the lower-level sub-problem optimizes agent trajectories that minimize navigation cost and the upper-level sub-problem optimizes environment configurations that maximize navigation safety. We develop a differentiable optimization method that iteratively solves the lower-level sub-problem with interior point methods and the upper-level sub-problem with gradient ascent. A key challenge lies in analytically coupling these two levels. We address this by leveraging KKT conditions and the Implicit Function Theorem to compute gradients of agent trajectories w.r.t. environment parameters, enabling differentiation throughout the bi-level structure. Moreover, we propose a novel metric that quantifies navigation safety as a criterion for the upper-level environment optimization, and prove its validity through measure theory. Our experiments validate the effectiveness of the proposed framework in a variety of safety-critical navigation scenarios, inspired from warehouse logistics to urban transportation. The results demonstrate that optimized environments provide navigation guidance, improving both agents' safety and efficiency.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript formulates a bi-level optimization problem for safe multi-agent navigation in which the lower level optimizes agent trajectories (minimizing navigation cost subject to dynamics and collision-avoidance constraints) via interior-point methods while the upper level optimizes continuous environment parameters to maximize a novel measure-theoretic safety metric via gradient ascent. Gradients of the lower-level solution with respect to environment variables are obtained by implicit differentiation through the KKT system using the Implicit Function Theorem. Experiments in simulated warehouse-logistics and urban-transportation scenarios are reported to show that the co-optimized environments improve both safety and efficiency.

Significance. If the gradient computations are reliable, the work offers a principled way to treat environment geometry as a decision variable rather than a fixed constraint, which could influence environment design for autonomous systems. The measure-theoretic safety metric and its proof constitute a clear theoretical contribution; the bi-level differentiable framework is a natural extension of recent differentiable-optimization literature to multi-agent settings.

major comments (2)
  1. [§3 and §4] §3 (Bi-level formulation) and §4 (Differentiable optimization): the central claim that environment parameters can be optimized by gradient ascent on the safety metric rests on the ability to compute d(trajectory*)/d(env) via the Implicit Function Theorem applied to the KKT system of the lower-level non-convex trajectory problem. For problems with active collision constraints whose active-set structure changes with environment geometry, the KKT Jacobian is not guaranteed to be invertible (LICQ, strict complementarity, and SOSC may fail at interior-point solutions). The manuscript provides no verification, regularization, or conditioning analysis that these conditions hold in the reported scenarios.
  2. [§5] §5 (Experiments): the reported improvements in safety and efficiency are presented without ablation on the gradient-computation step itself (e.g., comparison against finite-difference gradients, Jacobian condition-number statistics, or failure cases where the IFT step is ill-conditioned). Because the co-optimization loop depends on these gradients, the absence of such diagnostics leaves the empirical support for the method incomplete.
minor comments (2)
  1. [Abstract and §5] The abstract states that experiments validate the framework but supplies no quantitative metrics, error bars, or baseline comparisons; the full manuscript should ensure these appear in the main text and tables with clear statistical reporting.
  2. [§2 and §4] Notation for the safety metric (measure-theoretic construction) and its relation to the KKT stationarity conditions should be cross-referenced explicitly so readers can trace how the upper-level objective depends on the lower-level solution.
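The concern in major comment 1 can be seen on a minimal example. The sketch below is a hypothetical one-dimensional problem, not one of the paper's scenarios: the lower level is `min 0.5*(x - a)^2` subject to `x >= theta`, whose solution is `x* = max(a, theta)`. When the constraint switches between inactive and active at `theta == a`, strict complementarity fails, the KKT Jacobian becomes singular, and the one-sided slopes of `x*(theta)` disagree.

```python
def solve_lower(theta, a=0.0):
    """Closed-form solution of min_x 0.5*(x - a)^2 subject to x >= theta."""
    x = max(a, theta)
    lam = max(0.0, theta - a)  # multiplier of the constraint theta - x <= 0
    return x, lam


def kkt_jacobian_det(theta, a=0.0):
    """Determinant of d(KKT residual)/d(x, lam) at the solution.

    Residual: [x - a - lam, lam * (theta - x)]
    Jacobian rows: [1, -1] and [-lam, theta - x]
    """
    x, lam = solve_lower(theta, a)
    return 1.0 * (theta - x) - (-1.0) * (-lam)


def one_sided_slopes(theta, a=0.0, h=1e-6):
    """Left and right finite-difference slopes of x*(theta)."""
    x0, _ = solve_lower(theta, a)
    left = (x0 - solve_lower(theta - h, a)[0]) / h
    right = (solve_lower(theta + h, a)[0] - x0) / h
    return left, right
```

Away from the switch (`theta = -1.0` or `theta = 1.0` with `a = 0.0`) the determinant is nonzero and the Implicit Function Theorem applies. At `theta = 0.0` the determinant vanishes and the left and right slopes differ, so no single gradient exists there. Whether such points are reached in the paper's scenarios is exactly what the requested diagnostics would show.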

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address the concerns regarding the applicability of the Implicit Function Theorem to the non-convex lower-level problem and the need for additional empirical diagnostics on the gradient computations. We outline revisions that will strengthen the manuscript while clarifying the scope of our contributions.

Point-by-point responses
  1. Referee: [§3 and §4] §3 (Bi-level formulation) and §4 (Differentiable optimization): the central claim that environment parameters can be optimized by gradient ascent on the safety metric rests on the ability to compute d(trajectory*)/d(env) via the Implicit Function Theorem applied to the KKT system of the lower-level non-convex trajectory problem. For problems with active collision constraints whose active-set structure changes with environment geometry, the KKT Jacobian is not guaranteed to be invertible (LICQ, strict complementarity, and SOSC may fail at interior-point solutions). The manuscript provides no verification, regularization, or conditioning analysis that these conditions hold in the reported scenarios.

    Authors: We acknowledge that the lower-level trajectory optimization is non-convex and that active-set changes can violate LICQ, strict complementarity, or SOSC, potentially rendering the KKT Jacobian singular. Our framework applies the Implicit Function Theorem under the standard assumption that these regularity conditions hold locally at the converged interior-point solution, as is common in differentiable optimization work. In our experiments the co-optimization converged reliably without singular Jacobians. To address the referee's point we will add a dedicated paragraph in §4 discussing these regularity conditions, together with empirical condition-number statistics collected from the KKT systems in both warehouse and urban scenarios. We will also describe a simple regularization heuristic (small diagonal perturbation) used when the Jacobian is near-singular. A general theoretical guarantee for arbitrary environment geometries remains outside the paper's scope and would require stronger problem assumptions; we will explicitly state this limitation. revision: partial

  2. Referee: [§5] §5 (Experiments): the reported improvements in safety and efficiency are presented without ablation on the gradient-computation step itself (e.g., comparison against finite-difference gradients, Jacobian condition-number statistics, or failure cases where the IFT step is ill-conditioned). Because the co-optimization loop depends on these gradients, the absence of such diagnostics leaves the empirical support for the method incomplete.

    Authors: We agree that direct validation of the implicit gradients would improve the empirical grounding of the method. In the revised manuscript we will extend §5 with an ablation that compares IFT-derived gradients against central finite-difference approximations on a representative subset of environment parameters. We will also tabulate the condition numbers of the KKT Jacobians observed across all optimization iterations and report any ill-conditioned cases together with the regularization steps taken. These additions will quantify the reliability of the gradient step and directly respond to the referee's concern. revision: yes
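The diagnostics promised in the two responses could look roughly like the sketch below. It is a hypothetical illustration on a one-dimensional barrier problem with a 2x2 primal-dual KKT system, not the authors' implementation. The names `goal`, `mu`, `eps`, and `cond_limit` are assumptions, and the diagonal perturbation is one plausible reading of the "small diagonal perturbation" heuristic mentioned in the first response.

```python
import math


def solve_lower(theta, goal=0.0, mu=0.1):
    """Closed-form primal-dual solution of the perturbed KKT system
    x - goal - lam = 0,  lam * (x - theta) - mu = 0,  with x > theta."""
    b = theta - goal
    d = 0.5 * (-b + math.sqrt(b * b + 4.0 * mu))  # clearance x - theta > 0
    return theta + d, mu / d


def kkt_jacobian(theta, goal=0.0, mu=0.1):
    """Jacobian of the KKT residual with respect to (x, lam)."""
    x, lam = solve_lower(theta, goal, mu)
    return [[1.0, -1.0], [lam, x - theta]]


def cond_2x2(K):
    """2-norm condition number of a 2x2 matrix (infinite if singular)."""
    s = sum(v * v for row in K for v in row)  # squared Frobenius norm
    det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    if det == 0.0:
        return math.inf
    sig_max_sq = 0.5 * (s + math.sqrt(max(s * s - 4.0 * det * det, 0.0)))
    return sig_max_sq / abs(det)  # sigma_max / sigma_min, as sigma_max * sigma_min = |det|


def ift_gradient(theta, goal=0.0, mu=0.1, eps=1e-8, cond_limit=1e8):
    """dx*/dtheta from the KKT system, with an assumed diagonal regularization."""
    _, lam = solve_lower(theta, goal, mu)
    K = kkt_jacobian(theta, goal, mu)
    if cond_2x2(K) > cond_limit:  # near-singular: perturb the diagonal
        K = [[K[0][0] + eps, K[0][1]], [K[1][0], K[1][1] + eps]]
    rhs = [0.0, lam]  # -dF/dtheta, since dF/dtheta = [0, -lam]
    det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    return (rhs[0] * K[1][1] - K[0][1] * rhs[1]) / det  # Cramer's rule for dx


def fd_gradient(theta, goal=0.0, mu=0.1, h=1e-5):
    """Central finite-difference approximation of dx*/dtheta."""
    x_plus, _ = solve_lower(theta + h, goal, mu)
    x_minus, _ = solve_lower(theta - h, goal, mu)
    return (x_plus - x_minus) / (2.0 * h)
```

Logging `abs(ift_gradient(theta) - fd_gradient(theta))` together with `cond_2x2(kkt_jacobian(theta))` at every upper-level iteration would give the kind of table the referee asks for. In a full trajectory problem the same quantities would come from a sparse linear solver rather than Cramer's rule.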

Circularity Check

0 steps flagged

No significant circularity in the derivation chain

The paper's bi-level formulation separates the lower-level trajectory optimization (solved via interior-point methods) from the upper-level environment optimization (via gradient ascent). Gradients are obtained by applying the standard KKT conditions and Implicit Function Theorem to couple the levels, which are external mathematical tools rather than quantities derived from the paper's own fitted results or definitions. The proposed safety metric is defined and justified independently via measure theory with an explicit proof of validity. No load-bearing step reduces by construction to a self-citation, a fitted input renamed as a prediction, or an ansatz smuggled through prior work; the central claims rest on these independent derivations plus experimental validation in the reported scenarios.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The framework rests on standard optimization theory plus domain assumptions about continuous environment parameterization and the validity of the new safety metric.

axioms (2)
  • domain assumption: Environment configurations can be represented as continuous decision variables.
    Required for the upper-level optimization problem.
  • standard math: KKT conditions and the implicit function theorem apply to the lower-level trajectory optimization.
    Used to obtain analytic gradients between levels.
invented entity (1)
  • Novel navigation safety metric (no independent evidence)
    purpose: Quantifies safety as the objective for upper-level environment optimization
    Introduced and proved valid through measure theory; no external independent evidence supplied.

pith-pipeline@v0.9.0 · 5528 in / 1303 out tokens · 45705 ms · 2026-05-10T18:48:23.209986+00:00 · methodology

