Transformer-Guided Deep Reinforcement Learning for Optimal Takeoff Trajectory Design of an eVTOL Drone

Nathan M. Roberts II; Xiaosong Du

arxiv: 2511.14887 · v2 · submitted 2025-11-18 · 💻 cs.LG

Pith reviewed 2026-05-17 20:17 UTC · model grok-4.3

classification 💻 cs.LG

keywords: eVTOL, takeoff trajectory, deep reinforcement learning, transformer, energy minimization, vertical takeoff, drone control, optimal trajectory design

The pith

Transformer-guided DRL trains eVTOL takeoff trajectories with 25 percent of the steps needed by standard reinforcement learning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tries to establish that inserting a transformer into a deep reinforcement learning loop lets the agent focus only on realistic parts of the state space at each moment, cutting the training effort for minimum-energy eVTOL takeoff paths. The controls are power level and wing angle; the constraints are minimum height gain and minimum forward speed at the end of the maneuver. A sympathetic reader would care because eVTOL aircraft are currently limited by high power draw right at liftoff, and any reliable way to reduce that draw without solving huge optimal-control problems could make battery sizing and range more practical. The authors show the guided agent reaches 97.2 percent of the energy performance of a simulation-based reference optimum.
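As a rough sketch of the problem described above, and not the paper's own notation, the takeoff design can be written as a constrained optimal control problem. Every symbol here is an assumption introduced for illustration: $P(t)$ is the power control, $\theta(t)$ the wing angle to the vertical, $\mathbf{x}(t)$ the vehicle state, $f$ the simulator dynamics, $y(T)$ and $v_x(T)$ the vertical displacement and horizontal velocity at the final time $T$, and $y_{\min}$, $v_{x,\min}$ the required thresholds. Taking energy as the time integral of power is likewise an assumption about how consumption is measured.

```latex
\begin{aligned}
\min_{P(t),\,\theta(t)} \quad & E = \int_{0}^{T} P(t)\,\mathrm{d}t \\
\text{s.t.} \quad & \dot{\mathbf{x}}(t) = f\bigl(\mathbf{x}(t),\, P(t),\, \theta(t)\bigr), \\
& y(T) \ge y_{\min}, \\
& v_x(T) \ge v_{x,\min}.
\end{aligned}
```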

Core claim

The transformer-guided DRL agent learned to take off with 4.57×10^6 time steps, representing 25 percent of the 19.79×10^6 time steps needed by a vanilla DRL agent. It achieved 97.2 percent accuracy on the optimal energy consumption compared against the simulation-based optimal reference, while the vanilla DRL achieved 96.1 percent accuracy. The transformer works by exploring a realistic state space at each time step using power and wing angle to the vertical as control variables.

What carries the argument

The transformer module that, at each time step, identifies and prioritizes realistic regions of the state space to guide the reinforcement learning agent's exploration and policy updates.
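A passage quoted further down this page says the transformer outputs a mean and variance for each action component, conditioned on the action history. One way such a proposal could narrow exploration is sketched below; this is an illustration under stated assumptions, not the authors' implementation. The function names, the width factor `k`, and all numbers are hypothetical. The agent's normalized action in [-1, 1] is mapped into a window of mean ± k standard deviations, clipped to the physical limits of the control.

```python
import math

def proposal_window(mean, variance, lower, upper, k=2.0):
    """Return the (lo, hi) window mean +/- k*std, clipped to [lower, upper]."""
    std = math.sqrt(variance)
    lo = max(lower, mean - k * std)
    hi = min(upper, mean + k * std)
    return lo, hi

def scale_action(u, window):
    """Map a normalized agent action u in [-1, 1] linearly into the window."""
    lo, hi = window
    u = max(-1.0, min(1.0, u))  # guard against out-of-range agent outputs
    return lo + 0.5 * (u + 1.0) * (hi - lo)

# Hypothetical numbers: a power proposal in watts and assumed physical limits.
window = proposal_window(mean=200_000.0, variance=400_000_000.0,
                         lower=0.0, upper=300_000.0)
power = scale_action(0.0, window)  # the window midpoint
print(window, power)
```

A wider `k` leaves the agent more freedom and a narrower one leans harder on the proposal. That trade-off is the concern named in the load-bearing premise: a window that is too tight could exclude the optimal control.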

If this is right

Training converges with roughly one-quarter the number of environment interactions required by unguided DRL.
The final policy satisfies the takeoff constraints on vertical displacement and horizontal velocity.
Energy use lies within three percent of the value obtained from a separate simulation-based optimizer.
The same guidance structure can be reused for other eVTOL trajectory problems that share the same state and action structure.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same transformer guidance pattern could shorten training for landing or transition-to-cruise phases without new algorithm development.
If the state-space pruning remains accurate under sensor noise, the method may transfer to onboard hardware with modest additional tuning.
Extending the transformer to output uncertainty estimates could allow the agent to request more samples only in ambiguous regions.

Load-bearing premise

The transformer can reliably select realistic state-space regions without systematically excluding high-reward trajectories or biasing the learned policy away from true optimality.

What would settle it

A new high-fidelity simulation run in which the final energy consumption of the transformer-guided policy exceeds the known simulation-based optimum by more than a few percent, or in which a vanilla DRL agent reaches comparable performance with similar total steps.

Original abstract

The rapid advancement of electric vertical takeoff and landing (eVTOL) aircraft offers a promising opportunity to alleviate urban traffic congestion but is still limited by excessive power demands, especially during the takeoff phase. Thus, developing optimal takeoff trajectories for minimum energy consumption becomes essential for broader eVTOL aircraft applications. Conventional optimal control methods (such as dynamic programming and linear quadratic regulator) provide highly efficient and well-established solutions but are prohibited by problem dimensionality and complexity. Deep reinforcement learning (DRL) emerges as a special type of artificial intelligence tackling complex, nonlinear systems; however, the training difficulty is a key bottleneck that hinders DRL applications. To address these challenges, we propose the transformer-guided DRL to alleviate the training difficulty by exploring a realistic state space at each time step using a transformer. The proposed transformer-guided DRL was demonstrated on an optimal takeoff trajectory design of an eVTOL drone for minimal energy consumption while meeting takeoff conditions (i.e., minimum vertical displacement and minimum horizontal velocity) by varying control variables (i.e., power and wing angle to the vertical). Results presented that the transformer-guided DRL agent learned to take off with $4.57\times10^6$ time steps, representing $25\%$ of the $19.79\times10^6$ time steps needed by a vanilla DRL agent. In addition, the transformer-guided DRL achieved $97.2\%$ accuracy on the optimal energy consumption compared against the simulation-based optimal reference, while the vanilla DRL achieved $96.1\%$ accuracy. Therefore, the proposed transformer-guided DRL outperformed vanilla DRL in terms of both training efficiency and optimal design verification.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

Private letter to a colleague

Transformer guidance speeds up DRL training for eVTOL takeoff but the supporting details are too thin to fully trust the gains.

The key takeaway is that adding a transformer to guide exploration in a DRL agent for eVTOL takeoff trajectory optimization cuts the required training steps by roughly three quarters and edges out vanilla DRL on closeness to the simulated energy minimum. The paper applies an existing transformer-in-RL approach to this specific continuous control problem in aerospace. It does well at highlighting the energy demands of the takeoff phase and at presenting a method that improves training efficiency enough to make DRL more feasible for this kind of design task. The comparison to both a simulation-based optimum and a standard DRL baseline gives a clear before-and-after picture.

The main weaknesses are in the level of detail provided. The abstract reports the performance numbers but does not describe the reward shaping, how states and actions are represented, the transformer's architecture or training procedure, or any sensitivity analysis. Without those, it is difficult to reproduce the result or to understand why the guidance works. The 1.1 percentage-point accuracy improvement is small enough that it could fall within run-to-run variation, and the paper mentions neither statistical tests nor multiple seeds. On the potential bias issue raised in the stress test, the paper includes no ablations showing whether the transformer filters out better trajectories, so that remains an open question.

This kind of paper is for people applying RL techniques to trajectory planning in drones or similar systems. A practitioner looking for ways to speed up training on similar problems could pick up some ideas from the setup. I would recommend sending it for peer review. The results are specific and the problem is timely, so referees can help fill in the gaps around methods and validation.

Referee Report

2 major / 1 minor

Summary. The paper proposes a transformer-guided deep reinforcement learning (DRL) method to optimize takeoff trajectories for an eVTOL drone, minimizing energy consumption subject to minimum vertical displacement and horizontal velocity constraints by controlling power and wing angle. It reports that the guided agent requires 4.57×10^6 training steps (25% of the 19.79×10^6 steps for vanilla DRL) and reaches 97.2% of the energy optimality achieved by a simulation-based reference, versus 96.1% for the baseline.

Significance. If substantiated with full methodological details, the approach could provide a practical means to accelerate DRL training for high-dimensional aerospace trajectory optimization by restricting exploration to realistic state regions, addressing a known bottleneck in applying model-free RL to nonlinear optimal control problems.

major comments (2)

[Abstract] Abstract and Methods: the central performance claims (4.57×10^6 vs. 19.79×10^6 steps and 97.2% vs. 96.1% optimality) are presented without any description of the reward function, state representation, action space discretization, hyperparameter search procedure, or number of independent runs with statistical significance testing; these omissions make it impossible to evaluate whether the reported gains are robust or sensitive to modeling assumptions.
[Methods] The manuscript provides no ablation isolating the transformer module's contribution nor any analysis showing that the guided policy class still contains the simulation-based global optimum; without this, the efficiency gain could result from unintended restriction of the search space rather than improved guidance.
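To make the request for multiple runs and significance testing concrete, the sketch below computes Welch's t statistic for two sets of per-seed accuracies using only the Python standard library. The per-seed values are placeholders invented for illustration and are NOT results from the paper; the function name `welch_t` is likewise hypothetical.

```python
import math
import statistics

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    ma, mb = statistics.mean(a), statistics.mean(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances
    na, nb = len(a), len(b)
    se2 = va / na + vb / nb  # squared standard error of the mean difference
    t = (ma - mb) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Placeholder per-seed accuracies in percent; NOT data from the paper.
guided = [97.2, 96.8, 97.5, 96.9, 97.1]
vanilla = [96.1, 96.9, 95.8, 96.6, 96.3]

t_stat, dof = welch_t(guided, vanilla)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

The statistic would then be compared against a t distribution with `dof` degrees of freedom. With only one run per method, as the abstract appears to report, no such comparison is possible.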

minor comments (1)

[Abstract] The abstract states '25% of the 19.79×10^6 time steps' but 4.57/19.79 ≈ 0.231; a precise ratio or clarification would improve accuracy.
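The arithmetic behind this comment can be checked directly from the two step counts quoted in the abstract. The variable names are chosen here for illustration.

```python
# Step counts as reported in the abstract.
guided_steps = 4.57e6
vanilla_steps = 19.79e6

ratio = guided_steps / vanilla_steps  # fraction of vanilla steps used
reduction = 1.0 - ratio               # fraction of steps saved

print(f"ratio     = {ratio:.3f}")
print(f"reduction = {reduction:.3f}")
```

The ratio rounds to 23 percent rather than 25 percent, so the saving is a little larger than the abstract's rounded figure suggests.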

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment in turn below, indicating where we agree that revisions are warranted and outlining the changes we will make.

Referee: [Abstract] Abstract and Methods: the central performance claims (4.57×10^6 vs. 19.79×10^6 steps and 97.2% vs. 96.1% optimality) are presented without any description of the reward function, state representation, action space discretization, hyperparameter search procedure, or number of independent runs with statistical significance testing; these omissions make it impossible to evaluate whether the reported gains are robust or sensitive to modeling assumptions.

Authors: We agree that these methodological details are essential for reproducibility and for assessing robustness. The current manuscript focuses on the high-level results in the abstract and provides only a concise methods overview. In the revised version we will expand the Methods section with explicit descriptions of the reward function (including all weighting terms and constraints), the full state representation, the discretization scheme for the action space (power and wing angle), the hyperparameter search procedure employed, and results aggregated over multiple independent runs together with statistical significance testing.

revision: yes

Referee: [Methods] The manuscript provides no ablation isolating the transformer module's contribution nor any analysis showing that the guided policy class still contains the simulation-based global optimum; without this, the efficiency gain could result from unintended restriction of the search space rather than improved guidance.

Authors: We acknowledge that a dedicated ablation would more cleanly isolate the transformer's contribution. The existing comparison to vanilla DRL already holds the underlying DRL algorithm, environment, and hyperparameters fixed while varying only the presence of transformer guidance; nevertheless, we will add an explicit ablation study in the revision. On the question of whether the guided policy class contains the simulation-based global optimum, we note that the transformer is trained to propose realistic next states consistent with the physics of the eVTOL takeoff problem rather than to exclude feasible regions. The fact that the guided agent reaches 97.2% of the simulation-based reference (versus 96.1% for vanilla DRL) provides empirical evidence that the guidance does not exclude the optimum. We will augment the revision with a short theoretical argument and, if space permits, additional verification runs confirming that the simulation-based optimum remains reachable under the guided policy.

revision: yes

Circularity Check

0 steps flagged

No circularity: results rest on external simulation reference and vanilla DRL baseline

full rationale

The paper's central claims concern empirical training efficiency (4.57e6 vs 19.79e6 steps) and optimality accuracy (97.2% vs 96.1%) for the transformer-guided DRL agent. These quantities are obtained by direct comparison against an independent simulation-based optimal reference trajectory and a standard vanilla DRL run; neither metric is obtained by algebraic rearrangement of the method's own fitted parameters, state-space definitions, or transformer outputs. No equations or sections in the abstract or described methods reduce the reported performance figures to self-definition, fitted-input renaming, or load-bearing self-citation. The derivation therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on a standard eVTOL dynamics model treated as ground truth, a reward function that encodes the takeoff constraints, and several unspecified transformer and RL hyperparameters; no new physical entities are postulated.

free parameters (1)

Transformer and RL hyperparameters
Architecture depth, attention heads, learning rate, and discount factor are chosen or tuned to produce the reported training curves.

axioms (1)

domain assumption: The simulation dynamics accurately capture real eVTOL aerodynamics and power consumption during takeoff.
All optimality comparisons are performed inside this simulator; any mismatch with physical reality would invalidate the accuracy percentages.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: “The transformer produces an action proposal distribution characterized by a mean and variance for each action component, conditioned on the previous action history. The action proposal distribution sets up the DRL state space at each time step...”

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: “The proposed transformer-guided DRL was demonstrated on an optimal takeoff trajectory design of an eVTOL drone for minimal energy consumption...”

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

27 extracted references · 27 canonical work pages · 2 internal anchors

[1] Tripaldi, F., Vianello, S., and Bianchi, N., “Emerging Trends in Urban Air Mobility: An Extensive Review,” Energies, Vol. 18, No. 6, 2025, p. 1426
[2] Fast-Forwarding to a Future of On-Demand Urban Air Transportation, Uber Elevate, October 2016. URL https://d1nyezh1ys8wfo.cloudfront.net/static/PDFs/Elevate%2BWhitepaper.pdf?uclick_id=a12a5e10-ccfe-4b20-b2b7-b13a6485bd26
    Concept of Operations for Uncrewed Urban Air Mobility, Boeing, 2023. URL https://wisk.aero/conops/
[3] UTM Concept of Operations Version 2.0 (UTM ConOps v2.0), FAA, 2020. URL https://www.faa.gov/researchdevelopment/trafficmanagement/utm-concept-operations-version-20-utm-conops-v20
[4] Wild, G., “Urban Aviation: The Future Aerospace Transportation System for Intercity and Intracity Mobility,” Urban Science, Vol. 8, No. 4, 2024. https://doi.org/10.3390/urbansci8040218, URL https://www.mdpi.com/2413-8851/8/4/218
[5] Zhou, Q., and Tan, F., “Avionics of Electric Vertical Take-off and Landing in the Urban Air Mobility: A Review,” IEEE Aerospace and Electronic Systems Magazine, 2024, pp. 1–26. https://doi.org/10.1109/MAES.2024.3488655
[6] André, N., and Hajek, M., “Robust environmental life cycle assessment of electric VTOL concepts for urban air mobility,” AIAA aviation 2019 forum, 2019, p. 3473
[7] Advisory Circular, Subject: Type Certification—Powered-lift, AC No: 21.17-4, United States Department of Transportation, Federal Aviation Administration, July 2025
[8] Thomson, K., “FAA Drone and AAM Symposium Remarks,” FAA Drone and AAM Symposium, Baltimore, Maryland, July 30 2024
[9] Wang, M., Chu, N., Bhardwaj, P., Zhang, S., and Holzapfel, F., “Minimum-Time Trajectory Generation of eVTOL in Low-Speed Phase: Application in Control Law Design,” IEEE Transactions on Aerospace and Electronic Systems, Vol. 59, No. 2, 2023, pp. 1260–1275. https://doi.org/10.1109/TAES.2022.3198033
[10] Wei, H., Lou, B., Zhang, Z., Liang, B., Wang, F.-Y., and Lv, C., “Autonomous Navigation for eVTOL: Review and Future Perspectives,” IEEE Transactions on Intelligent Vehicles, Vol. 9, No. 2, 2024, pp. 4145–4171. https://doi.org/10.1109/TIV.2024.3352613
[11] Yeh, S.-T., and Du, X., “Transfer-Learning-Enhanced Regression Generative Adversarial Networks for Optimal eVTOL Takeoff Trajectory Prediction,” Electronics, Vol. 13, No. 10, 2024, p. 1911
[12] Sisk, S., and Du, X., “Surrogate-Based Multidisciplinary Optimization for the Takeoff Trajectory Design of Electric Drones,” Processes, Vol. 12, No. 9, 2024. https://doi.org/10.3390/pr12091864, URL https://www.mdpi.com/2227-9717/12/9/1864
[13] Chauhan, S. S., and Martins, J. R., “Tilt-wing eVTOL takeoff trajectory optimization,” Journal of Aircraft, Vol. 57, No. 1, 2020, pp. 93–112
[14] Falck, R., Gray, J. S., Ponnapalli, K., and Wright, T., “dymos: A Python package for optimal control of multidisciplinary systems,” Journal of Open Source Software, Vol. 6, No. 59, 2021, p. 2809. https://doi.org/10.21105/joss.02809
[15] Roberts, N. M., and Du, X., Deep Reinforcement Learning for Optimal Takeoff Trajectory Design of an eVTOL Drone, American Institute of Aeronautics and Astronautics, Inc., 2025. https://doi.org/10.2514/6.2025-3800, URL https://arc.aiaa.org/doi/abs/10.2514/6.2025-3800
[16] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., and Polosukhin, I., “Attention is all you need,” Proceedings of the 31st International Conference on Neural Information Processing Systems, Curran Associates Inc., Red Hook, NY, USA, 2017, p. 6000–6010
[17] Wen, Q., Zhou, T., Zhang, C., Chen, W., Ma, Z., Yan, J., and Sun, L., “Transformers in time series: A survey,” arXiv preprint arXiv:2202.07125, 2022
[18] Airbus, “Vahana,” 2016. URL https://acubed.airbus.com/blog/vahana/, [Online; accessed in 2025]
[19] Tangler, J. L., and Ostowari, C., “Horizontal axis wind turbine post stall airfoil characteristics synthesization,” Tech. rep., Solar Energy Research Inst., Golden, CO (United States), 1991
[20] Young, J. D., “Propeller at high incidence,” Journal of Aircraft, Vol. 2, No. 3, 1965, pp. 241–250
[21] Kaelbling, L. P., Littman, M. L., and Moore, A. W., “Reinforcement learning: a survey,” J. Artif. Int. Res., Vol. 4, No. 1, 1996, p. 237–285
[22] Haarnoja, T., Zhou, A., Abbeel, P., and Levine, S., “Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor,” 2018. URL https://arxiv.org/abs/1801.01290
[23] OpenAI, “Soft Actor-Critic,” 2018. URL https://spinningup.openai.com/en/latest/algorithms/sac.html, accessed online, 2025
[24] Raffin, A., Hill, A., Gleave, A., Kanervisto, A., Ernestus, M., and Dormann, N., “Stable-Baselines3: Reliable Reinforcement Learning Implementations,” Journal of Machine Learning Research, Vol. 22, No. 268, 2021, pp. 1–8. URL http://jmlr.org/papers/v22/20-1364.html
[25] Yeh, S.-T., and Du, X., “Optimal Tilt-Wing eVTOL Takeoff Trajectory Prediction Using Regression Generative Adversarial Networks,” Mathematics, Vol. 12, No. 1, 2023. https://doi.org/10.3390/math12010026, URL https://www.mdpi.com/2227-7390/12/1/26
[26] Towers, M., Kwiatkowski, A., Terry, J., Balis, J. U., Cola, G. D., Deleu, T., Goulão, M., Kallinteris, A., Krimmel, M., KG, A., Perez-Vicente, R., Pierré, A., Schulhoff, S., Tai, J. J., Tan, H., and Younis, O. G., “Gymnasium: A Standard Interface for Reinforcement Learning Environments,” 2025. URL https://arxiv.org/abs/2407.17032
[27] Huang, B., and Jin, Y., “Reward shaping in multiagent reinforcement learning for self-organizing systems in assembly tasks,” Advanced Engineering Informatics, Vol. 54, 2022, p. 101800. https://doi.org/10.1016/j.aei.2022.101800, URL https://www.sciencedirect.com/science/article/pii/S1474034622002580
