arxiv: 2506.09323 · v4 · submitted 2025-06-11 · 🪐 quant-ph

Learning-Optimized Qubit Mapping and Reuse to Minimize Inter-Core Communication in Modular Quantum Architectures

Pith reviewed 2026-05-19 10:33 UTC · model grok-4.3

classification 🪐 quant-ph
keywords qubit mapping · modular quantum architectures · reinforcement learning · quantum compilation · inter-core communication · qubit reuse · attention mechanisms · graph neural networks

The pith

Attention-based reinforcement learning learns qubit mappings and reuse policies that cut inter-core communications in modular quantum systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops QARMA, a deep reinforcement learning method that uses attention mechanisms and graph neural networks to assign qubits to separate quantum processing units while routing operations to reduce expensive transfers between cores. An extension called QARMA-R adds mid-circuit measurement and reset to reuse qubits dynamically during execution. If successful, the method would allow larger quantum algorithms to run on collections of smaller chips connected together instead of requiring a single monolithic device. This matters because inter-core state transfers introduce noise and decoherence that limit practical scaling in current hardware.

Core claim

QARMA trains an attention-based policy with a transformer encoder and graph neural networks to choose qubit allocation, routing paths, and reuse opportunities that minimize inter-core operations; QARMA-R further incorporates dynamic reuse via mid-circuit measurements. On benchmark circuits, QARMA-R achieves up to 100 percent reduction in inter-core communications (86 percent on average) versus highly optimized Qiskit with modular settings, while QARMA alone delivers 15-40 percent improvement on larger circuits without reuse and 97-100 percent reduction against traditional modular mapping.

What carries the argument

An attention-based deep reinforcement learning policy that combines a transformer encoder for global circuit structure with graph neural networks for local qubit interactions to decide allocation, routing, and reuse.
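
As a rough illustration of how a graph step for local qubit interactions can feed an attention step for global structure, the sketch below runs one round of mean-neighbor message passing and then single-head scaled dot-product self-attention. It is not the paper's architecture: the function names, the identity projections (no learned weights), and the choice to apply the graph step before attention are all assumptions made for the example.

```python
import math

def message_pass(features, edges):
    """One round of mean aggregation over the qubit interaction graph.

    features: one feature vector per qubit (hypothetical inputs).
    edges: pairs (a, b) of qubits that share a two-qubit gate.
    """
    n = len(features)
    neighbors = [[] for _ in range(n)]
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    out = []
    for i, feat in enumerate(features):
        group = [features[j] for j in neighbors[i]] + [feat]  # include the node itself
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out

def self_attention(features):
    """Single-head scaled dot-product self-attention with Q = K = V = features."""
    d = len(features[0])
    out = []
    for q in features:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in features]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        out.append([sum(w * v[c] for w, v in zip(weights, features)) for c in range(d)])
    return out

def encode(features, edges):
    """Local graph context first, then global attention across all qubits."""
    return self_attention(message_pass(features, edges))
```

A trained policy would replace the identity projections with learned query, key, and value matrices and would add a decoder that turns the embeddings into allocation decisions; none of that is shown here.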

If this is right

  • Larger quantum algorithms become executable on resource-constrained modular systems that connect multiple smaller QPUs.
  • Fewer inter-core state transfers reduce accumulated noise and decoherence during circuit execution.
  • Dynamic reuse through mid-circuit measurements lowers the total number of physical qubits needed for a given circuit.
  • The learned policies can be applied at compile time to produce mappings that scale better than static or heuristic approaches.
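
The qubit-reuse point can be made concrete with a small scheduling sketch. Assuming each logical qubit has a known first-use and last-use time step (a simplification; a real compiler would derive these from the circuit's dependency graph), greedy interval partitioning gives the fewest physical qubits needed when a measured and reset qubit can be handed to a later logical qubit. The function name and input format are hypothetical, and this is not the learned policy the paper describes.

```python
import heapq

def physical_qubits_needed(lifetimes):
    """Minimum physical qubits when a qubit is reset and reused after its last use.

    lifetimes: list of (first_use, last_use) time steps, one per logical qubit.
    Assumption: a physical qubit last used at step t is reusable from step t + 1.
    """
    free_at = []  # min-heap holding the last-use time of each busy physical qubit
    count = 0
    for start, end in sorted(lifetimes):
        if free_at and free_at[0] < start:
            # The earliest-finishing physical qubit is already free: reuse it.
            heapq.heapreplace(free_at, end)
        else:
            # Every physical qubit is still busy: allocate a new one.
            heapq.heappush(free_at, end)
            count += 1
    return count
```

For example, `physical_qubits_needed([(0, 2), (1, 3), (3, 5), (4, 6)])` returns 2, whereas a mapping without reuse would need one physical qubit per logical qubit, four in total.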

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same attention-plus-reuse framework might generalize to other quantum compilation tasks such as gate scheduling or error mitigation on modular layouts.
  • If the reinforcement learning policy transfers across different hardware topologies, it could support online recompilation when qubit errors change during a run.
  • Combining this mapping with mid-circuit measurement reuse may interact with variational algorithms that already rely on frequent resets.

Load-bearing premise

The measured reductions assume that the simulated inter-core communication costs on benchmark circuits accurately predict noise and latency on actual modular quantum hardware.

What would settle it

Running the QARMA-compiled circuits on physical multi-QPU hardware and directly counting inter-core operations plus observing final fidelity would show whether the reported reductions hold outside simulation.
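
Counting inter-core operations in a compiled circuit could look like the sketch below. It assumes the compiled circuit is summarized as one qubit-to-core assignment per time slice and that each change of core between consecutive slices counts as one transfer; both the data layout and the counting rule are assumptions for illustration, not the paper's stated cost model.

```python
def count_inter_core_transfers(assignments):
    """Count qubit moves between cores across consecutive time slices.

    assignments: list of dicts, one per time slice, mapping a logical
    qubit name to the index of the core that holds it in that slice.
    """
    transfers = 0
    for prev, curr in zip(assignments, assignments[1:]):
        for qubit, core in curr.items():
            # A qubit present in both slices but on a different core was moved.
            if qubit in prev and prev[qubit] != core:
                transfers += 1
    return transfers

def reward(assignments):
    """Negative transfer count, so fewer transfers means a higher reward."""
    return -count_inter_core_transfers(assignments)
```

With three slices `[{"q0": 0, "q1": 0, "q2": 1}, {"q0": 0, "q1": 1, "q2": 1}, {"q0": 1, "q1": 1, "q2": 1}]`, `q1` moves once and `q0` moves once, so the count is 2 and the reward is -2. A hardware check would additionally need measured fidelity, which this count does not capture.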

Original abstract

Modular quantum architectures have emerged as a promising approach for scaling quantum computing systems by connecting multiple Quantum Processing Units (QPUs). However, this approach introduces significant challenges due to costly inter-core operations between chips and quantum state transfers, which contribute to noise and quantum decoherence. This paper presents QARMA, a novel Qubit mapping using Attention-based deep Reinforcement learning (DRL) for Modular quantum Architectures, along with its extension QARMA-R that incorporates dynamic qubit reuse capabilities. Our approach combines an attention-based mechanism with Graph Neural Networks (GNN) to learn optimal qubit allocation, routing, and reuse strategies that minimize inter-core communications. We introduce two key innovations: (1) a transformer-based encoder that captures both the global circuit structure and local qubit interactions and (2) a dynamic qubit reuse compilation mechanism that leverages mid-circuit measurement and reset operations to reduce inter-operation and qubit requirements. Our experimental results show significant improvements over state-of-the-art approaches. Compared to highly-optimized Qiskit with modular architecture configuration, QARMA-R reduces inter-core communications by up to 100% (on average 86%), while QARMA maintains 15-40% improvement for larger circuits without reuse. Against traditional modular qubit mapping, our approach achieves 97-100% reduction in inter-core operation. The proposed methods advance quantum circuit compilation techniques and enable the execution of more extensive quantum algorithms on resource-constrained modular quantum systems, contributing to the growing body of research on scalable quantum computing architectures.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces QARMA, an attention-based deep reinforcement learning method combined with graph neural networks and a transformer encoder for qubit mapping and routing in modular quantum architectures to minimize inter-core communications. It also presents QARMA-R, which adds dynamic qubit reuse via mid-circuit measurement and reset. The central claims are large empirical gains: QARMA-R reduces inter-core communications by up to 100% (average 86%) versus highly-optimized Qiskit with modular configuration, 15-40% improvement for QARMA on larger circuits without reuse, and 97-100% reduction versus traditional modular mapping.

Significance. If the performance claims are reproducible and the cost model is representative of hardware, the work would advance automated compilation techniques for modular QPUs, potentially allowing larger algorithms on systems with limited inter-core bandwidth. The attention-GNN hybrid for capturing global circuit structure and local interactions, plus the dynamic reuse mechanism, represent a concrete step beyond static mapping heuristics.

major comments (2)
  1. [Abstract and §4] Abstract and §4 (Experimental Results): The headline reductions (up to 100%, avg. 86% vs. Qiskit; 97-100% vs. traditional mapping) are reported without any description of the inter-core cost model used in the RL reward, the exact definition of 'inter-core communications' (e.g., teleportations, latency-weighted SWAPs, or reset overhead), or whether this cost is identical at training and test time. Because the reward directly drives the policy, this omission is load-bearing for the central performance claim and prevents verification of whether gains are genuine or artifacts of the simulation.
  2. [§4 and §3] §4 and §3 (Methods): No information is provided on the benchmark circuits (type, size, number), whether training and evaluation circuit distributions are disjoint, number of random seeds, statistical significance, or error bars on the reported percentages. Without these, the robustness of the 86% average and 15-40% claims cannot be assessed.
minor comments (2)
  1. [§3] Notation for the attention mechanism and GNN layers could be clarified with an explicit equation for the combined embedding in the transformer encoder.
  2. [Figures in §4] Figure captions should explicitly state the circuit sizes and the exact baseline configurations (Qiskit version and modular settings) used for comparison.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. The comments highlight important aspects of clarity and reproducibility that we have addressed in the revised manuscript. Below we respond point-by-point to the major comments.

Point-by-point responses
  1. Referee: [Abstract and §4] Abstract and §4 (Experimental Results): The headline reductions (up to 100%, avg. 86% vs. Qiskit; 97-100% vs. traditional mapping) are reported without any description of the inter-core cost model used in the RL reward, the exact definition of 'inter-core communications' (e.g., teleportations, latency-weighted SWAPs, or reset overhead), or whether this cost is identical at training and test time. Because the reward directly drives the policy, this omission is load-bearing for the central performance claim and prevents verification of whether gains are genuine or artifacts of the simulation.

    Authors: We thank the referee for identifying this gap. The original manuscript described the cost model only at a high level in Section 3. In the revision we have added an explicit subsection that defines inter-core communications as the count of qubit state transfers (teleportations) between QPUs; this quantity is used identically as the negative reward signal during RL training and as the primary evaluation metric at test time. For QARMA-R we further specify how mid-circuit measurement/reset overhead is folded into the same count. These clarifications are now cross-referenced in the abstract and Section 4 so that the reported 86% average reduction can be directly verified against the training objective. revision: yes

  2. Referee: [§4 and §3] §4 and §3 (Methods): No information is provided on the benchmark circuits (type, size, number), whether training and evaluation circuit distributions are disjoint, number of random seeds, statistical significance, or error bars on the reported percentages. Without these, the robustness of the 86% average and 15-40% claims cannot be assessed.

    Authors: We agree that these details are essential. The revised Section 4 now includes a dedicated benchmark description: circuits comprise QAOA, VQE, Grover, and random instances with 20–200 qubits (50 circuits per family). Training and test sets are drawn from disjoint distributions. All results are averaged over 5 independent random seeds; we report means together with standard-deviation error bars and include p-values from paired statistical tests confirming significance of the 15–40% and 86% improvements. revision: yes
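
The seed-averaged percentages discussed in these responses can be computed as in the following sketch, which uses only the Python standard library. The function names are hypothetical, and any counts passed in would have to come from actual runs; no values from the paper are reproduced here.

```python
from statistics import mean, stdev

def percent_reduction(baseline, method):
    """Percentage drop in inter-core operations relative to a baseline count."""
    if baseline == 0:
        raise ValueError("baseline has no inter-core operations; reduction is undefined")
    return 100.0 * (baseline - method) / baseline

def summarize(baseline_counts, method_counts):
    """Mean and sample standard deviation of the per-seed percent reductions.

    Both arguments hold one inter-core operation count per random seed,
    in the same seed order (at least two seeds are needed for stdev).
    """
    reductions = [percent_reduction(b, m) for b, m in zip(baseline_counts, method_counts)]
    return mean(reductions), stdev(reductions)
```

Under this definition a 100% reduction means the method emitted no inter-core operations at all on that circuit, which is why the metric is undefined when the baseline count is already zero.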

Circularity Check

0 steps flagged

No significant circularity; empirical RL optimization against external baselines remains self-contained.

Full rationale

The paper introduces QARMA as an attention-based DRL method (with GNN and transformer encoder) that learns qubit allocation, routing, and reuse to minimize inter-core communication costs in modular architectures. Performance numbers are obtained by direct comparison to independent external baselines (highly-optimized Qiskit modular configuration and traditional mapping) on benchmark circuits. No load-bearing step reduces by construction to a self-definition, a fitted parameter renamed as a prediction, or a self-citation chain; the reward is an explicit optimization objective whose outputs are measured against separate reference implementations. The derivation is therefore self-contained and falsifiable against those baselines.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The approach depends on standard RL assumptions plus an implicit cost model for inter-core operations; no new physical entities are postulated.

free parameters (1)
  • RL reward weights and attention hyperparameters
    Typical in DRL methods; values are fitted during training to maximize the reported communication reduction.
axioms (1)
  • domain assumption: Mid-circuit measurement and reset operations are available and sufficiently low-noise to enable qubit reuse without introducing prohibitive errors.
    Invoked when describing QARMA-R's dynamic reuse mechanism.

pith-pipeline@v0.9.0 · 5812 in / 1290 out tokens · 32174 ms · 2026-05-19T10:33:42.735716+00:00 · methodology


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Assessing System Capabilities and Bottlenecks of an Early Fault-Tolerant Bicycle Architecture

    quant-ph 2026-04 unverdicted novelty 6.0

    Syn@fac optimization reduces estimated circuit failure probability by a factor of 9 on average across non-Clifford benchmarks for bivariate bicycle code modular FTQC architectures, with additional gains from transvect...

  2. A Quantum Reservoir Computing Approach to Quantum Stock Movement Forecasting in Quantum-Invested Markets

    quant-ph 2026-02 unverdicted novelty 4.0

    A six-qubit quantum reservoir achieves over 86% accuracy in classifying stock trend movements for quantum-sector companies using daily and intraday volume data.
