pith. machine review for the scientific record.

arxiv: 2603.28286 · v2 · submitted 2026-03-30 · 📡 eess.SY · cs.SY


Competitor-aware Race Management for Electric Endurance Racing


Pith reviewed 2026-05-14 21:56 UTC · model grok-4.3

classification: 📡 eess.SY · cs.SY

keywords: electric racing · aerodynamic interactions · game theory · reinforcement learning · energy management · multi-agent control · pit stops

The pith

Exploiting aerodynamic drafting from competitors, rather than running a solo minimum-time strategy, is what decides the winner of an electric endurance race.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a bi-level framework that first solves multi-agent game-theoretic optimal control problems to model how cars interact aerodynamically and avoid collisions on each lap. Reinforcement learning then trains agents to manage energy and decide pit stops across the full race, using this lap solver as the environment. In a simulated two-car 45-lap race, the approach shows that drafting saves energy and shifts optimal strategies away from those that ignore other cars. A reader would care because electric race cars run on tight energy budgets, so ignoring interactions wastes opportunities to finish ahead.
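The bi-level coupling described above can be sketched as an RL environment whose step function calls a single-lap solver. Everything here is an illustrative assumption, not the authors' code: `LapSolverStub` stands in for the game-theoretic OCP, and the toy lap-time and drafting numbers are invented placeholders.

```python
class LapSolverStub:
    """Stands in for the multi-agent game-theoretic OCP: given the ego car's
    per-lap energy budget and the gap to the rival, return the lap time and
    the energy actually consumed (a toy model, not the paper's)."""
    def solve(self, ego_energy_kwh, rival_gap_s):
        # Toy drafting effect: running within ~1 s of the rival refunds
        # 15% of the lap's energy via reduced drag.
        draft_saving = 0.15 if abs(rival_gap_s) < 1.0 else 0.0
        energy_used = ego_energy_kwh * (1.0 - draft_saving)
        lap_time_s = 95.0 - 2.0 * ego_energy_kwh  # more energy -> faster lap
        return lap_time_s, energy_used

class RaceEnv:
    """Race-long environment in the spirit of the upper level: state is
    (laps remaining, battery energy), the action is the energy allocated to
    the next lap, and each step invokes the single-lap solver."""
    def __init__(self, laps=45, battery_kwh=40.0):
        self.total_laps, self.init_battery = laps, battery_kwh
        self.solver = LapSolverStub()
        self.reset()

    def reset(self):
        self.lap, self.battery, self.rival_gap_s = 0, self.init_battery, 0.5
        return (self.total_laps, self.battery)

    def step(self, energy_kwh):
        energy_kwh = min(energy_kwh, self.battery)  # cannot overspend
        lap_time_s, used = self.solver.solve(energy_kwh, self.rival_gap_s)
        self.battery -= used
        self.lap += 1
        done = self.lap >= self.total_laps or self.battery <= 0.0
        # Negative lap time as reward: minimizing cumulative race time.
        return (self.total_laps - self.lap, self.battery), -lap_time_s, done
```

In the paper the stub's role is played by the full game-theoretic OCP; a trained policy would replace any fixed per-lap allocation fed to `step`.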

Core claim

Race-winning policies in electric endurance racing require jointly governing low-level driver inputs and high-level strategic decisions like energy management and charging through a bi-level setup. The lower level captures aerodynamic effects and asymmetric collision-avoidance constraints in a multi-agent game-theoretic optimal control problem for single laps. The upper level uses reinforcement learning on this environment to allocate battery energy and schedule pit stops over many laps, as shown in a two-agent 45-lap simulation where position-prioritizing strategies differ fundamentally from single-agent minimum-time ones.

What carries the argument

The bi-level framework combining multi-agent game-theoretic optimal control for single-lap interactions with reinforcement learning for race-long energy and pit management.

If this is right

  • Effective exploitation of aerodynamic interactions is decisive for race outcome.
  • Strategies prioritizing finishing position differ fundamentally from single-agent minimum-time approaches.
  • Joint governance of low-level inputs and high-level decisions like charging becomes necessary.
  • The single-lap multi-agent problem serves as the environment for training long-horizon policies.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar bi-level methods could apply to other multi-vehicle energy systems like truck platooning.
  • Extending to more agents might reveal emergent cooperative or competitive behaviors.
  • Validating transfer from simulation to real tracks would test the framework's practical value.
  • Rule makers in motorsport could use such models to adjust energy limits or safety constraints.

Load-bearing premise

The simulated aerodynamic interactions and collision-avoidance constraints accurately represent real-world motorsport physics and rules so that policies transfer to actual races.

What would settle it

Running the trained policies on physical electric race cars in a multi-car endurance event and observing whether the multi-agent strategy achieves better finishing positions than single-agent alternatives under real aerodynamic conditions.

Figures

Figures reproduced from arXiv: 2603.28286 by Erik van den Eshof, Jorn van Kampen, Mauro Salazar, Wytze de Vries.

Figure 1: InMotion’s fully electric endurance race car at the Zandvoort circuit.
Figure 2: Free-body diagrams of the race car illustrating the longitudinal, …
Figure 3: Map of the Zandvoort circuit showing the mini-sectors and the …
Figure 4: Electric motor power, front braking power, velocity, time gap, …
Figure 5: Driven racing lines, electric motor power, and front braking power during the overtaking maneuver through the combination of Turns 9 and 10.
Figure 6: The evolution of the battery energy, time gap and the battery energy …
Original abstract

Electric endurance racing is characterized by severe energy constraints and strong aerodynamic interactions. Determining race-winning policies therefore becomes a fundamentally multi-agent, game-theoretic problem. These policies must jointly govern low-level driver inputs as well as high-level strategic decisions, including energy management and charging. This paper proposes a bi-level framework for competitor-aware race management that combines game-theoretic optimal control with reinforcement learning. At the lower level, a multi-agent game-theoretic optimal control problem is solved to capture aerodynamic effects and asymmetric collision-avoidance constraints inspired by motorsport rules. Using this single-lap problem as the environment, reinforcement learning agents are trained to allocate battery energy and schedule pit stops over an entire race. The framework is demonstrated in a two-agent, 45-lap simulated race. The results show that effective exploitation of aerodynamic interactions is decisive for race outcome, with strategies that prioritize finishing position differing fundamentally from single-agent, minimum-time approaches.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes a bi-level framework for competitor-aware race management in electric endurance racing. The lower level solves a multi-agent game-theoretic optimal control problem to capture aerodynamic drag/wake effects and asymmetric collision-avoidance constraints for single-lap planning; the upper level uses reinforcement learning to optimize battery energy allocation and pit-stop scheduling over a full race. The approach is demonstrated in a two-agent 45-lap simulation, with the central claim that drafting-aware strategies differ fundamentally from single-agent minimum-time policies and that effective exploitation of aerodynamic interactions is decisive for race outcome.

Significance. If the simulation model is shown to be sufficiently representative of real aerodynamic coefficients, wake effects, and motorsport rules, the bi-level construction would offer a principled way to handle hierarchical multi-agent decisions under energy constraints. The explicit separation of single-lap game-theoretic interactions from multi-lap RL strategy is a clean architectural contribution that could be extended to other energy-limited competitive settings.

major comments (2)
  1. [Results / Simulation Setup] The headline assertion that aerodynamic exploitation 'is decisive for race outcome' (abstract and results) rests on the fidelity of the simulated aero interactions and collision constraints; no wind-tunnel validation, telemetry comparison, or sensitivity analysis on drag/wake coefficients is reported, so the transfer from simulation ranking to real-world decisiveness remains unsupported.
  2. [Method / Bi-level Framework] The demonstration is limited to a two-agent, 45-lap scenario; the computational cost and convergence behavior of repeatedly solving the game-theoretic OCP inside the RL loop for larger fields or longer races is not quantified, leaving open whether the framework scales to realistic endurance fields.
minor comments (2)
  1. [Problem Formulation] Notation for the asymmetric collision constraints and the precise form of the aerodynamic interaction terms (e.g., wake velocity deficit model) should be stated explicitly rather than described only qualitatively.
  2. [Results] The abstract states that strategies 'differ fundamentally' from single-agent minimum-time approaches; a quantitative metric (e.g., lap-time difference or energy-use delta) comparing the two policies would strengthen this claim.
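The quantitative metric the second minor comment asks for could be computed from per-lap traces of the two policies. The helper below is a hypothetical sketch: the field layout, trace values, and function name are placeholders, not data from the paper.

```python
def policy_deltas(multi_agent_trace, single_agent_trace):
    """Each trace is a list of (lap_time_s, energy_kwh) tuples, one per lap.
    Returns the race-time and energy-use deltas between the drafting-aware
    and the single-agent minimum-time policy."""
    assert len(multi_agent_trace) == len(single_agent_trace)
    dt = sum(t for t, _ in multi_agent_trace) - sum(t for t, _ in single_agent_trace)
    de = sum(e for _, e in multi_agent_trace) - sum(e for _, e in single_agent_trace)
    return {"race_time_delta_s": round(dt, 3), "energy_delta_kwh": round(de, 3)}
```

Reporting these two numbers per scenario would make the "differ fundamentally" claim concrete without adding new experiments.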

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on simulation fidelity and framework scalability. We address each major comment below with targeted revisions to the manuscript.

Point-by-point responses
  1. Referee: [Results / Simulation Setup] The headline assertion that aerodynamic exploitation 'is decisive for race outcome' (abstract and results) rests on the fidelity of the simulated aero interactions and collision constraints; no wind-tunnel validation, telemetry comparison, or sensitivity analysis on drag/wake coefficients is reported, so the transfer from simulation ranking to real-world decisiveness remains unsupported.

    Authors: We agree that the lack of wind-tunnel validation or telemetry comparison means the real-world decisiveness claim is not fully supported by the current manuscript. The aerodynamic model uses coefficients drawn from published vehicle dynamics literature. In revision we will add a sensitivity analysis that varies drag-reduction and wake coefficients over a ±20% range around nominal values and show that the qualitative superiority of drafting-aware policies is preserved. We will also revise the abstract and results text to qualify the 'decisive' claim as holding within the simulated environment. revision: partial
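The ±20% sweep the authors promise could take the following shape. This is a toy sketch under stated assumptions: the nominal 0.15 drafting coefficient, the linear lap-time model, and all function names are illustrative, not values from the manuscript.

```python
NOMINAL_DRAFT = 0.15  # assumed nominal drag-reduction fraction (illustrative)

def lap_time_s(energy_kwh):
    """Toy lap-time model: spending more energy shortens the lap."""
    return 95.0 - 2.0 * energy_kwh

def race_time_s(budget_kwh, laps, draft_coeff, drafting_aware):
    """Total race time when the budget is spent uniformly per lap; a
    drafting-aware policy re-spends the energy the draft refunds."""
    per_lap = budget_kwh / laps
    if drafting_aware:
        per_lap /= (1.0 - draft_coeff)
    return laps * lap_time_s(per_lap)

def sensitivity_sweep(rel_range=0.20, steps=5):
    """Vary the draft coefficient over +/-rel_range around nominal and
    record whether the drafting-aware policy still wins at each value."""
    out = []
    for i in range(steps):
        c = NOMINAL_DRAFT * (1.0 - rel_range + 2.0 * rel_range * i / (steps - 1))
        aware = race_time_s(40.0, 45, c, drafting_aware=True)
        solo = race_time_s(40.0, 45, c, drafting_aware=False)
        out.append((round(c, 4), aware < solo))
    return out
```

If the ordering of policies is preserved across the sweep, the qualified "decisive within the simulated environment" claim gains support.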

  2. Referee: [Method / Bi-level Framework] The demonstration is limited to a two-agent, 45-lap scenario; the computational cost and convergence behavior of repeatedly solving the game-theoretic OCP inside the RL loop for larger fields or longer races is not quantified, leaving open whether the framework scales to realistic endurance fields.

    Authors: The two-agent, 45-lap case was chosen to focus exposition on the bi-level coupling. We acknowledge that computational cost and scaling behavior are not quantified. In the revised manuscript we will report average wall-clock time per lower-level OCP solve, per RL training episode, and a brief complexity discussion (noting that the number of agents enters the game-theoretic OCP size). We will also outline that for larger fields the lower level can be approximated by pairwise drafting models, which we flag as future work. revision: partial
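The scaling argument in this response can be made concrete with a back-of-envelope variable count: in a joint OCP the pairwise coupling constraints (collision avoidance, drafting) grow quadratically in the field size, whereas the pairwise approximation flagged as future work keeps each car's problem constant. The node, state, and input counts below are illustrative assumptions.

```python
def joint_ocp_size(n_cars, nodes=200, states=6, inputs=2):
    """Joint game-theoretic OCP over all cars: decision variables grow
    linearly in the field size, coupling constraints quadratically."""
    variables = n_cars * nodes * (states + inputs)
    coupling = n_cars * (n_cars - 1) // 2 * nodes  # one per pair per node
    return variables, coupling

def pairwise_ocp_size(n_cars, neighbours=1, nodes=200, states=6, inputs=2):
    """Pairwise-drafting approximation: each car only models its nearest
    neighbours, so the per-car problem size does not depend on n_cars."""
    per_car_variables = nodes * (states + inputs) * (1 + neighbours)
    per_car_coupling = neighbours * nodes
    return per_car_variables, per_car_coupling
```

Reporting counts like these alongside the promised wall-clock times would let readers judge where the joint lower level stops being tractable.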

Circularity Check

0 steps flagged

No significant circularity; the framework is a self-contained modeling construction.

full rationale

The paper defines a bi-level architecture (game-theoretic single-lap OCP capturing aero drag/wake plus collision constraints, followed by RL for energy/pit allocation) and evaluates it inside a 2-agent 45-lap simulation. No equations, fitted parameters, or self-citations are shown that reduce the claimed outcome (drafting-aware policies differing from minimum-time policies) to a definition or input by construction. The central result is obtained by solving the stated optimization and learning problems; the transfer claim to real racing is an explicit modeling assumption rather than a derived equality. This is the normal case of an independent algorithmic proposal.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available, so no explicit free parameters, axioms, or invented entities can be extracted from the text.

pith-pipeline@v0.9.0 · 5462 in / 1080 out tokens · 34253 ms · 2026-05-14T21:56:26.942041+00:00 · methodology

