
arxiv: 2605.21213 · v1 · pith:7T563272new · submitted 2026-05-20 · 🪐 quant-ph · cs.AI · cs.LG · math.OC

Enhanced Reinforcement Learning-based Process Synthesis via Quantum Computing

Pith reviewed 2026-05-21 04:30 UTC · model grok-4.3

classification 🪐 quant-ph · cs.AI · cs.LG · math.OC
keywords quantum reinforcement learning · process synthesis · flowsheet design · Markov decision process · quantum computing · chemical engineering optimization · scalability

The pith

Quantum reinforcement learning matches classical performance in finding optimal chemical flowsheets while using fewer parameters for moderate-scale problems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a framework that formulates process synthesis as a Markov decision process and applies quantum-enhanced reinforcement learning to solve it. New state encoding algorithms are introduced to decouple qubit requirements from the number of units in the design space. This allows quantum RL agents to identify the same optimal flowsheet solutions as classical RL agents but with improved parameter efficiency on moderate-sized problems. A reader would care because chemical process design involves searching large combinatorial spaces, and any method that maintains solution quality while reducing resource demands could scale better for real industrial applications.

Core claim

For moderate-scale unit counts, quantum approaches demonstrate competitive performance on a per-episode basis and improved efficiency on a per-parameter basis versus the classical RL benchmark, after state encoding algorithms are used to reduce qubit scaling with problem complexity.
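
To make the per-parameter comparison concrete, the sketch below counts trainable parameters for a fully connected network and for a layered parameterized quantum circuit. The layer sizes, the qubit and layer counts, and the choice of three rotation angles per qubit per layer are illustrative assumptions, not values reported in the paper.

```python
def mlp_param_count(layer_sizes):
    """Weights plus biases of a fully connected network.

    layer_sizes lists the width of every layer, input first, output last.
    """
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


def pqc_param_count(n_qubits, n_layers, rotations_per_qubit=3):
    """Trainable angles in a layered circuit (assumed layout, see lead-in)."""
    return n_qubits * n_layers * rotations_per_qubit


if __name__ == "__main__":
    # Hypothetical sizes chosen only to show the arithmetic.
    print("classical:", mlp_param_count([8, 32, 4]))
    print("quantum:  ", pqc_param_count(n_qubits=4, n_layers=5))
```

A count like this is only one ingredient of a per-parameter metric; how the paper normalizes performance by parameter count is not stated in the abstract.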

What carries the argument

State encoding algorithms that map flowsheet states into quantum representations while decoupling qubit count from the number of process units.

If this is right

  • All tested algorithms, classical and quantum, locate the optimal flowsheet designs when the design space remains small.
  • Quantum variants stay competitive with classical RL for the specific process synthesis problems examined.
  • The work creates a controlled benchmark that can be reused to compare future classical and quantum algorithms on identical training conditions.
  • This setup supplies a foundation for extending quantum methods to other process systems engineering tasks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same encoding technique might be tested on other combinatorial design problems outside chemical engineering, such as network layout or scheduling.
  • Hybrid quantum-classical training loops could be explored to handle even larger unit counts where pure classical scaling becomes expensive.
  • Performance metrics focused on parameter count rather than episode count may become a standard way to evaluate quantum advantage in optimization settings.

Load-bearing premise

The state encoding algorithms preserve enough information for the quantum agent to reach the same optimal flowsheet solutions as the classical agent.

What would settle it

Running both agents on the same moderate-unit-count flowsheet synthesis task and finding that the quantum version either misses the known optimal design or requires as many or more parameters to match classical success rates.

Figures

Figures reproduced from arXiv: 2605.21213 by Austin Braniff (1), Fengqi You (2), Yuhe Tian (1) ((1) Department of Chemical and Biomedical Engineering, West Virginia University, (2) R.F. Smith School of Chemical and Biomolecular Engineering, Cornell University).

Figure 1: An overview of the classical and quantum RL approaches for process design.
Figure 2: Value-based deep reinforcement learning nomenclature for the present work.
Figure 3: Bloch sphere diagram for quantum state representation.
Figure 4: Circuit diagram of a PQC.
Figure 5: Circuit diagram for a PQC layer.
Figure 6: Circuit diagram for a re-uploading PQC. … works as value-function or policy approximators [48, 49]. In value-based methods such as DQN, the PQC serves as a parameterized mapping from state representations to action-value estimates. Training proceeds through classical optimization of circuit parameters using reward-driven loss functions, while circuit evaluations are performed via quantum simulation or hardw…
Figure 7: Flowsheet configurations following an optimal policy for an infinite horizon.
Figure 8: Workflow for RL-driven process synthesis (adapted from Wang et al. [32]).
Figure 9: PQC using the encoding strategy described by Variant 2.
Figure 10: PQC using the encoding strategy described by Variant 3.
Figure 11: Optimal flowsheet configuration with maximum reward.
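
The text captured with Figure 6 describes a PQC acting as a map from state representations to action-value estimates, with circuit parameters trained classically. A minimal single-qubit sketch of that idea follows. It assumes one input angle `x`, one trainable angle `theta`, a Z-expectation readout, and a squared-error loss against a fixed target; none of these choices come from the paper. For RY(x) followed by RY(theta) applied to |0>, the Z expectation is cos(x + theta), so the circuit can be evaluated in closed form, and the gradient is taken with the parameter-shift rule.

```python
import math


def q_value(x, theta):
    """<Z> after RY(x) then RY(theta) on |0>, which equals cos(x + theta)."""
    return math.cos(x + theta)


def parameter_shift_grad(x, theta):
    """d q_value / d theta from two shifted circuit evaluations."""
    plus = q_value(x, theta + math.pi / 2)
    minus = q_value(x, theta - math.pi / 2)
    return 0.5 * (plus - minus)


def td_step(x, theta, target, lr=0.1):
    """One gradient-descent step on the loss (q_value - target)^2 / 2."""
    error = q_value(x, theta) - target
    return theta - lr * error * parameter_shift_grad(x, theta)


if __name__ == "__main__":
    x, theta, target = 0.3, 0.5, 0.2   # hypothetical state angle and target
    for _ in range(500):
        theta = td_step(x, theta, target)
    print(q_value(x, theta))
```

In a full DQN setup the target would come from the reward plus a discounted maximum over next-state action values, and the circuit would have several qubits and layers; this sketch isolates only the update of one circuit parameter.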
Original abstract

In this work, we present quantum reinforcement learning (RL) as a solution strategy for process synthesis problems. Building on our prior work, we develop a generalized framework that formally poses process synthesis as a Markov decision process and introduces quantum-enhanced RL algorithms to solve it with improved scalability. Earlier implementations of quantum-based RL for process synthesis were limited by qubit requirements, which scaled poorly with problem complexity. This work overcomes this challenge by introducing state encoding algorithms to decouple qubit requirements from problem size. A classical RL-based solution strategy is used as a baseline to benchmark the quantum algorithms under identical training conditions. All algorithms are evaluated across a flowsheet synthesis problem of increasing unit counts to analyze their performance and scalability. Results show that all approaches are capable of identifying the optimal flowsheet designs in small design spaces. For moderate-scale unit counts, quantum approaches demonstrate competitive performance on a per-episode basis and improved efficiency on a per-parameter basis versus the classical RL benchmark. This work provides a foundation for future quantum computing applications within process systems engineering, establishes a controlled benchmark for comparing classical and quantum algorithms, and shows that the proposed quantum variants remain competitive for the process synthesis problem examined in this work.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript formulates process synthesis as a Markov decision process and develops quantum reinforcement learning algorithms augmented by state encoding schemes that aim to decouple qubit count from the number of units. A classical RL baseline is trained under identical conditions and the methods are compared on flowsheet synthesis instances with increasing unit counts. The central empirical claim is that, for small design spaces, all methods recover optimal flowsheets, while for moderate-scale instances the quantum variants remain competitive on a per-episode basis and exhibit improved per-parameter efficiency.

Significance. If the optimality verification and encoding fidelity claims hold under quantitative scrutiny, the work supplies a controlled, reproducible benchmark for quantum versus classical RL on a combinatorial design task relevant to process systems engineering. The explicit introduction of state encoding to mitigate qubit scaling is a concrete technical contribution that could be built upon in future hybrid quantum-classical optimization studies.

major comments (2)
  1. [Results] Results section: the statement that 'all approaches are capable of identifying the optimal flowsheet designs' is load-bearing for the competitiveness claim, yet the manuscript reports only per-episode reward curves without tabulating the final flowsheet topologies, objective values, or feasibility checks obtained by each algorithm. Without this explicit comparison it is impossible to confirm that the quantum agents reach the identical optima rather than alternative solutions of comparable reward.
  2. [Methods] State-encoding subsection (Methods): the claim that the introduced encoding algorithms 'preserve enough information' to allow the quantum agent to reach the same optimal solutions while reducing qubit scaling is not supported by an injectivity argument or by an empirical collision test. If distinct unit-connection or mode configurations map to indistinguishable quantum states, policy gradients can converge to a different policy even when average rewards appear competitive; a small-scale example showing the mapping for two non-isomorphic feasible flowsheets would directly address this risk.
minor comments (2)
  1. The abstract would be strengthened by naming the specific quantum RL variants (e.g., variational quantum circuit policy or amplitude amplification) and by stating the qubit counts actually used for each unit-count instance.
  2. Figure captions and axis labels should explicitly indicate whether shaded regions represent standard deviation across seeds or inter-quartile range; current presentation makes it difficult to judge whether the reported competitiveness is statistically meaningful.
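
The empirical collision test requested in major comment 2 could be sketched as follows, assuming flowsheet states are represented as binary unit-selection vectors. Both encodings shown are hypothetical stand-ins, not the paper's algorithms: `lossy_encoding` keeps only the number of active units, and `positional_encoding` reads the whole vector as one scaled integer.

```python
import math
from itertools import product


def count_collisions(encode, n_units, decimals=9):
    """Count states whose encoding matches that of an earlier state.

    Enumerates all 2**n_units binary vectors, so it is only practical
    for small n_units. Angles are rounded to absorb float noise.
    """
    seen = {}
    collisions = 0
    for bits in product((0, 1), repeat=n_units):
        key = tuple(round(a, decimals) for a in encode(bits))
        if key in seen:
            collisions += 1
        else:
            seen[key] = bits
    return collisions


def lossy_encoding(bits):
    """Hypothetical: encodes only how many units are active."""
    return (math.pi * sum(bits) / len(bits),)


def positional_encoding(bits):
    """Hypothetical: reads the bit string as an integer scaled to [0, pi]."""
    value = int("".join(map(str, bits)), 2)
    return (math.pi * value / (2 ** len(bits) - 1),)


if __name__ == "__main__":
    print(count_collisions(lossy_encoding, 4))
    print(count_collisions(positional_encoding, 4))
```

A zero collision count on feasible states would show that the angle vectors are distinct at the chosen rounding precision; it would not by itself show that the resulting quantum states are distinguishable under the measurements the agent uses.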

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thorough review and constructive suggestions. We address each of the major comments in detail below and have made revisions to the manuscript to incorporate the requested clarifications and additions.

point-by-point responses
  1. Referee: [Results] Results section: the statement that 'all approaches are capable of identifying the optimal flowsheet designs' is load-bearing for the competitiveness claim, yet the manuscript reports only per-episode reward curves without tabulating the final flowsheet topologies, objective values, or feasibility checks obtained by each algorithm. Without this explicit comparison it is impossible to confirm that the quantum agents reach the identical optima rather than alternative solutions of comparable reward.

    Authors: We agree that providing explicit details on the final flowsheet topologies is important to substantiate our claim. In the revised manuscript, we have added a table in the Results section that compares the optimal flowsheets identified by each method for the small design space cases. This table includes the topologies, objective values, and confirmation of feasibility, demonstrating that all algorithms recover the same optimal designs. revision: yes

  2. Referee: [Methods] State-encoding subsection (Methods): the claim that the introduced encoding algorithms 'preserve enough information' to allow the quantum agent to reach the same optimal solutions while reducing qubit scaling is not supported by an injectivity argument or by an empirical collision test. If distinct unit-connection or mode configurations map to indistinguishable quantum states, policy gradients can converge to a different policy even when average rewards appear competitive; a small-scale example showing the mapping for two non-isomorphic feasible flowsheets would directly address this risk.

    Authors: We recognize the value of an explicit demonstration that the state encoding preserves distinguishability. We have revised the Methods section to include a small-scale example illustrating the encoding of two non-isomorphic feasible flowsheets. This example shows their distinct mappings to quantum states, supporting that the encoding maintains sufficient information for the agent to differentiate between configurations. We have also added a short discussion on the encoding properties to address potential collision risks. revision: yes

Circularity Check

0 steps flagged

Minor self-citation for MDP framework; central benchmark comparisons remain independent

full rationale

The paper builds on prior work to pose process synthesis as an MDP and introduces state encoding to decouple qubit count from problem size. Performance claims rest on empirical evaluation of quantum RL variants against a classical RL baseline under identical training conditions across increasing unit counts. No equations or derivations reduce claimed improvements to fitted parameters, self-referential definitions, or unverified self-citations. The self-reference is limited to framework setup and is not load-bearing for the reported per-episode competitiveness and per-parameter efficiency results.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the unverified premise that the chosen state encoding preserves solution quality while reducing qubit scaling; no free parameters, invented entities, or additional axioms are visible in the abstract.

axioms (1)
  • domain assumption: Process synthesis can be faithfully represented as a Markov decision process whose optimal policy corresponds to the globally optimal flowsheet.
    Stated in the abstract as the formal posing of the problem; if false, the RL formulation itself would not guarantee optimality.

pith-pipeline@v0.9.0 · 5778 in / 1140 out tokens · 21000 ms · 2026-05-21T04:30:00.967821+00:00 · methodology



Reference graph

Works this paper leans on

55 extracted references · 55 canonical work pages · 1 internal anchor

  [1] E. N. Pistikopoulos, R. Gani, Data, models, algorithms, AI and the role of PSE – the generation next, Computers & Chemical Engineering 207 (2026) 109564.
  [2] A. Braniff, S. S. Akundi, Y. Liu, B. Dantas, S. S. Niknezhad, F. Khan, E. N. Pistikopoulos, Y. Tian, Real-time process safety and systems decision-making toward safe and smart chemical manufacturing, Digital Chemical Engineering 15 (2025) 100227.
  [3] L. Mencarelli, Q. Chen, A. Pagot, I. E. Grossmann, A review on superstructure optimization approaches in process system engineering, Computers & Chemical Engineering 136 (2020) 106808.
  [4] F. Boukouvala, R. Misener, C. A. Floudas, Global optimization advances in Mixed-Integer Nonlinear Programming, MINLP, and Constrained Derivative-Free Optimization, CDFO, European Journal of Operational Research 252 (3) (2016) 701–727.
  [5] D. A. Liñán, L. A. Ricardez-Sandoval, Trends and perspectives in deterministic MINLP optimization for integrated planning, scheduling, control, and design of chemical processes, Reviews in Chemical Engineering 41 (5) (2025) 451–472.
  [6] I. E. Grossmann, F. Trespalacios, Systematic modeling of discrete-continuous optimization models through generalized disjunctive programming, AIChE Journal 59 (9) (2013) 3276–3295.
  [7] J. A. Frumkin, M. F. Doherty, Target bounds on reaction selectivity via Feinberg’s CFSTR equivalence principle, AIChE Journal 64 (3) (2018) 926–939.
  [8] E. N. Pistikopoulos, Y. Tian, Synthesis and Operability Strategies for Computer-Aided Modular Process Intensification, Elsevier, 2022.
  [9] Y. Tian, S. E. Demirel, M. M. F. Hasan, E. N. Pistikopoulos, An overview of process systems engineering approaches for process intensification: State of the art, Chemical Engineering and Processing - Process Intensification 133 (2018) 160–210.
  [10] E. N. Pistikopoulos, Y. Tian, Advanced Modeling and Optimization Strategies for Process Synthesis, Annual Review of Chemical and Biomolecular Engineering 15 (2024) 81–103.
  [11] Q. Gao, A. M. Schweidtmann, Deep reinforcement learning for process design: Review and perspective, Current Opinion in Chemical Engineering 44 (2024) 101012.
  [12] Q. Göttl, D. G. Grimm, J. Burger, Automated synthesis of steady-state continuous processes using reinforcement learning, Frontiers of Chemical Science and Engineering 16 (2) (2022) 288–302.
  [13] D. E. Bernal, A. Ajagekar, S. M. Harwood, S. T. Stober, D. Trenev, F. You, Perspectives of quantum computing for chemical engineering, AIChE Journal 68 (6) (2022) e17651.
  [14] A. Ajagekar, F. You, New frontiers of quantum computing in chemical engineering, Korean Journal of Chemical Engineering 39 (4) (2022) 811–820.
  [15] D. E. Bernal, K. E. C. Booth, R. Dridi, H. Alghassi, S. Tayur, D. Venturelli, Integer Programming Techniques for Minor-Embedding in Quantum Annealers, in: E. Hebrard, N. Musliu (Eds.), Integration of Constraint Programming, Artificial Intelligence, and Operations Research, Springer International Publishing, Cham, 2020, pp. 112–129.
  [16] A. Ajagekar, T. Humble, F. You, Quantum computing based hybrid solution strategies for large-scale discrete-continuous optimization problems, Computers & Chemical Engineering 132 (2020) 106630.
  [17] K. Nieman, K. Kasturi Rangan, H. Durand, Control Implemented on Quantum Computers: Effects of Noise, Nondeterminism, and Entanglement, Industrial & Engineering Chemistry Research 61 (28) (2022) 10133–10155.
  [18] X. Heinen, H. Chen, Quantum Computing for Complex Energy Systems: A Review, in: 2024 6th International Conference on Data-driven Optimization of Complex Systems (DOCS), 2024, pp. 572–577.
  [19] M. Benedetti, E. Lloyd, S. Sack, M. Fiorentini, Parameterized quantum circuits as machine learning models, Quantum Science and Technology 4 (4) (2019) 043001.
  [20] N. Meyer, C. Ufrecht, M. Periyasamy, D. D. Scherer, A. Plinge, C. Mutschler, A Survey on Quantum Reinforcement Learning (Mar. 2024). arXiv:2211.03464.
  [21] A. Braniff, F. You, Y. Tian, Enhanced Reinforcement Learning-driven Process Design via Quantum Machine Learning, in: The 35th European Symposium on Computer Aided Process Engineering, Ghent, Belgium, 2025, pp. 1403–1408.
  [22] A. Braniff, F. You, Y. Tian, Process flowsheet synthesis via quantum reinforcement learning with improved scalability, in: The 36th European Symposium on Computer Aided Process Engineering, 2026.
  [23] R. Sutton, A. Barto, Reinforcement Learning, Second Edition: An Introduction, Adaptive Computation and Machine Learning Series, MIT Press, 2018.
  [24] J. Shin, T. A. Badgwell, K.-H. Liu, J. H. Lee, Reinforcement Learning – Overview of recent progress and implications for process control, Computers & Chemical Engineering 127 (2019) 282–294.
  [25] A. Braniff, Y. Tian, Reinforcement learning-based control via Y-wise Affine Neural Networks (YANNs), Computers & Chemical Engineering 209 (2026) 109610.
  [26] D. Rangel-Martinez, L. A. Ricardez-Sandoval, Interpretable online scheduling for chemical batch plants with attention augmented reinforcement learning agents, Computers & Chemical Engineering 205 (2026) 109469.
  [27] C. D. Hubbs, C. Li, N. V. Sahinidis, I. E. Grossmann, J. M. Wassick, A deep reinforcement learning approach for chemical production scheduling, Computers & Chemical Engineering 141 (2020) 106982.
  [28] L. Stops, R. Leenhouts, Q. Gao, A. M. Schweidtmann, Flowsheet generation through hierarchical reinforcement learning and graph neural networks, AIChE Journal 69 (1) (2023) e17938.
  [29] Q. Gao, H. Yang, S. M. Shanbhag, A. M. Schweidtmann, Transfer learning for process design with reinforcement learning, in: A. C. Kokossis, M. C. Georgiadis, E. Pistikopoulos (Eds.), 33rd European Symposium on Computer Aided Process Engineering, Vol. 52 of Computer Aided Chemical Engineering, Elsevier, 2023, pp. 2005–2010.
  [30] A. Khan, A. Lapkin, Searching for optimal process routes: A reinforcement learning approach, Computers & Chemical Engineering 141 (2020) 107027.
  [31] S. Kim, M.-G. Jang, J.-K. Kim, Process design and optimization of single mixed-refrigerant processes with the application of deep reinforcement learning, Applied Thermal Engineering 223 (2023) 120038.
  [32] D. Wang, J. Bao, M. Zamarripa-Perez, B. Paul, Y. Chen, P. Gao, T. Ma, A. Noring, A. Iyengar, D. Schwartz, E. Eggleton, Q. He, A. Liu, O. Marina, B. Koeppel, Z. Xu, Reinforcement learning for automated conceptual design of advanced energy and chemical systems (Dec. 2022).
  [33] Q. Göttl, Y. Tönges, D. G. Grimm, J. Burger, Automated Flowsheet Synthesis Using Hierarchical Reinforcement Learning: Proof of Concept, Chemie Ingenieur Technik 93 (12) (2021) 2010–2018.
  [34] Q. Göttl, D. G. Grimm, J. Burger, Using reinforcement learning in a game-like setup for automated process synthesis without prior process knowledge, in: Y. Yamashita, M. Kano (Eds.), 14th International Symposium on Process Systems Engineering, Vol. 49 of Computer Aided Chemical Engineering, Elsevier, 2022, pp. 1555–1560.
  [35] S. Reynoso-Donzelli, L. A. Ricardez-Sandoval, A reinforcement learning approach with masked agents for chemical process flowsheet design, AIChE Journal 71 (1) (2025) e18584.
  [36] J. R. Seidenberg, A. A. Khan, A. A. Lapkin, Boosting autonomous process design and intensification with formalized domain knowledge, Computers & Chemical Engineering 169 (2023) 108097.
  [37] S. Sachio, M. Mowbray, M. M. Papathanasiou, E. A. del Rio-Chanona, P. Petsagkourakis, Integrating process design and control using reinforcement learning, Chemical Engineering Research and Design 183 (2022) 160–169.
  [38] S. Reynoso-Donzelli, L. A. Ricardez-Sandoval, An integrated reinforcement learning framework for simultaneous generation, design, and control of chemical process flowsheets, Computers & Chemical Engineering 194 (2025) 108988.
  [39] Q. Gao, H. Yang, M. F. Theisen, A. M. Schweidtmann, Accelerating process synthesis with reinforcement learning: Transfer learning from multi-fidelity simulations and variational autoencoders, Computers & Chemical Engineering 201 (2025) 109192.
  [40] V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, M. Riedmiller, Playing Atari with Deep Reinforcement Learning (Dec. 2013). arXiv:1312.5602.
  [41] S. Harwood, C. Gambella, D. Trenev, A. Simonetto, D. Bernal Neira, D. Greenberg, Formulating and Solving Routing Problems on Quantum Computers, IEEE Transactions on Quantum Engineering 2 (2021) 1–17.
  [42] V. Raseena, Quantum computing: Foundations, algorithms, and emerging applications, Frontiers in Quantum Science and Technology 4 (Dec. 2025).
  [43] J. Preskill, Quantum Computing in the NISQ era and beyond, Quantum 2 (2018) 79.
  [44] Y. Wang, J. Liu, A comprehensive review of Quantum Machine Learning: From NISQ to Fault Tolerance, Reports on Progress in Physics 87 (11) (2024) 116402. arXiv:2401.11351.
  [45] J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, S. Lloyd, Quantum machine learning, Nature 549 (7671) (2017) 195–202.
  [46] L. Chen, T. Li, Y. Chen, X. Chen, M. Wozniak, N. Xiong, W. Liang, Design and analysis of quantum machine learning: A survey, Connection Science 36 (1) (2024) 2312121.
  [47] M. Schuld, F. Petruccione, Machine Learning with Quantum Computers, Quantum Science and Technology, Springer International Publishing, Cham, 2021.
  [48] S. Y.-C. Chen, An Introduction to Quantum Reinforcement Learning (QRL) (Sep. 2024). arXiv:2409.05846.
  [49] S. Y.-C. Chen, C.-H. H. Yang, J. Qi, P.-Y. Chen, X. Ma, H.-S. Goan, Variational Quantum Circuits for Deep Reinforcement Learning, IEEE Access 8 (2020) 141007–141024.
  [50] H.-Y. Chen, Y.-J. Chang, S.-W. Liao, C.-R. Chang, Deep-Q Learning with Hybrid Quantum Neural Network on Solving Maze Problems (Dec. 2023). arXiv:2304.10159.
  [51] A. Skolik, S. Jerbi, V. Dunjko, Quantum agents in the Gym: A variational quantum algorithm for deep Q-learning, Quantum 6 (2022) 720. arXiv:2103.15084.
  [52] S. Jerbi, C. Gyurik, S. C. Marshall, H. J. Briegel, V. Dunjko, Parametrized quantum policies for reinforcement learning (Dec. 2021). arXiv:2103.05577.
  [53] Y. Tian, A. Akintola, Y. Jiang, D. Wang, J. Bao, M. A. Zamarripa, B. Paul, Y. Chen, P. Gao, A. Noring, A. Iyengar, A. Liu, O. Marina, B. Koeppel, Z. Xu, Reinforcement Learning-Driven Process Design: A Hydrodealkylation Example, in: Foundations of Computer-Aided Process Design, Breckenridge, Colorado, USA, 2024, pp. 387–393.
  [54] A. Lee, J. H. Ghouse, J. C. Eslick, C. D. Laird, J. D. Siirola, M. A. Zamarripa, D. Gunter, J. H. Shinn, A. W. Dowling, D. Bhattacharyya, L. T. Biegler, A. P. Burgard, D. C. Miller, The IDAES process modeling framework and model library—Flexibility for process simulation and optimization, Journal of Advanced Manufacturing and Processing 3 (3) (2021) e10095.
  [55] L. T. Biegler, V. M. Zavala, Large-scale nonlinear programming using IPOPT: An integrating framework for enterprise-wide dynamic optimization, Computers & Chemical Engineering 33 (3) (2009) 575–582.