pith. machine review for the scientific record.

arxiv: 2604.26609 · v1 · submitted 2026-04-29 · 🪐 quant-ph · cs.SE

Recognition: unknown

Probabilistic Condition, Decision and Path Coverage of Circuit-based Quantum Programs

Authors on Pith: no claims yet

Pith reviewed 2026-05-07 13:10 UTC · model grok-4.3

classification 🪐 quant-ph cs.SE
keywords quantum software testing · coverage criteria · quantum circuits · mutation testing · probabilistic coverage · condition coverage · decision coverage · path coverage

The pith

Quantum circuits achieve high condition and decision coverage but limited path coverage, with structural metrics showing weak correlation to fault detection.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper adapts classical software testing coverage criteria to quantum circuits by defining condition, decision, and path coverage along with probabilistic variants that incorporate measurement outcome probabilities. Evaluation on 540 circuits shows average condition coverage of 97.56 percent and decision coverage of 97.63 percent, yet path coverage averages only 71.84 percent and drops sharply when multi-controlled gates create many possible execution paths. Probabilistic coverage adds confidence scores that average 88.87 percent for conditions, 88.65 percent for decisions, and 37.18 percent for paths. Mutation testing across the same circuits finds only weak or no correlation between these coverage values and the ability to detect injected faults. The work supplies concrete metrics that quantum developers can use to judge whether a circuit has been tested thoroughly enough.
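The path-explosion arithmetic behind these numbers is easy to reproduce. The sketch below is illustrative only — the function names are ours, not QaCoCo's: a gate with k control qubits branches over 2**k control-value combinations, so structural paths multiply across controlled gates.

```python
from math import prod

# Illustrative sketch (our names, not QaCoCo's) of why multi-controlled
# gates explode the path count: each gate with k control qubits branches
# over 2**k control-value combinations, and the branches multiply.

def structural_paths(control_counts):
    """Total structural paths for gates with the given control counts."""
    return prod(2 ** k for k in control_counts)

def path_coverage(control_counts, executed_paths):
    """Fraction of structural paths a test suite has exercised."""
    return len(set(executed_paths)) / structural_paths(control_counts)

# One CNOT (1 control) plus one Toffoli (2 controls) -> 2 * 4 = 8 paths;
# a suite hitting 6 distinct paths reaches 0.75 path coverage.
```

A single 10-control gate already yields 1024 paths, which is why condition and decision coverage can sit near 100% while path coverage collapses.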

Core claim

We adapt condition, decision, and path coverage from classical testing to circuit-based quantum programs and introduce probabilistic versions that augment structural coverage with a confidence measure derived from measurement probabilities. Using the QaCoCo tool on 540 circuits we measure high average condition and decision coverage but substantially lower path coverage, with multi-controlled gates producing extreme path explosion and imbalance. Mutation analysis shows these coverage scores have weak or no correlation with fault detection effectiveness.

What carries the argument

QaCoCo, the tool that analyzes quantum circuit structure to compute condition, decision, path, and probabilistic coverage by tracing gate dependencies and measurement probabilities.
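As a rough picture of what "tracing gate dependencies" can mean structurally — the gate-tuple format and function below are our assumptions, not QaCoCo's API — one condition per control qubit and one decision per controlled gate can be read straight off the gate list:

```python
# Hedged sketch of a structural pass (our format, not QaCoCo's API):
# one "condition" per control qubit of each controlled gate, and one
# "decision" per controlled gate (the gate either fires or it does not).

def extract_conditions_decisions(gates):
    """gates: list of (name, control_qubits, target_qubits) tuples."""
    conditions, decisions = [], []
    for i, (name, controls, targets) in enumerate(gates):
        if controls:                      # only controlled gates branch
            decisions.append(i)           # decision: does gate i fire?
            conditions.extend((i, q) for q in controls)
    return conditions, decisions

# Hadamard, CNOT, Toffoli: two decisions, three conditions in total.
circuit = [("h", [], [0]), ("cx", [0], [1]), ("ccx", [0, 1], [2])]
conds, decs = extract_conditions_decisions(circuit)
# conds == [(1, 0), (2, 0), (2, 1)], decs == [1, 2]
```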

If this is right

  • Path coverage will remain low for any circuit containing multi-controlled gates unless test generation explicitly targets the combinatorial paths they create.
  • Probabilistic coverage supplies a numerical confidence value that can be reported alongside binary coverage percentages for each criterion.
  • Test adequacy for quantum programs cannot be judged by structural coverage numbers alone because those numbers do not reliably predict fault detection power.
  • Developers may need to combine coverage criteria with mutation testing or other oracles to obtain a more trustworthy assessment of test quality.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Automated test generators for quantum circuits could be guided by path-coverage targets or by prioritizing high-probability execution paths to raise overall adequacy.
  • The same coverage machinery could be applied to hybrid quantum-classical programs by treating the classical control flow separately from the quantum circuit portion.
  • Similar adaptations of coverage criteria might prove useful for other quantum programming models such as quantum walks or measurement-based computation.
  • Quantum software teams should treat coverage metrics as one indicator among several rather than as a sufficient stopping criterion for testing.

Load-bearing premise

That classical structural coverage criteria can be meaningfully adapted to assess test adequacy in quantum circuits despite their probabilistic behavior and the observed weak link to fault detection.

What would settle it

A new experiment on a comparable set of quantum circuits that finds a strong positive correlation between the proposed coverage scores and the rate of detected faults in mutation testing would falsify the weak-correlation result.

Figures

Figures reproduced from arXiv:2604.26609 by Daniel Fortunato, José Campos, Rui Abreu.

Figure 1. Swap test [5, 9] written in Qiskit [1].
Figure 2. Original, transpiled, and instrumented circuit of the Swap test written in Qiskit in Figure ….
Figure 3. Transpilation and instrumentation of the Qiskit ….
Figure 4. Execution of the instrumented circuit in Figure ….
read the original abstract

Coverage criteria play a central role in assessing test adequacy in classical software, yet their effectiveness for quantum programs remains poorly understood and largely unexplored. In this paper, we propose six quantum-tailored criteria - condition, decision, and path coverage, and their probabilistic variants - adapted from their classical counterparts. We present QaCoCo, a tool that computes these criteria for circuit-based quantum programs. We empirically evaluate these criteria on a large and diverse set of 540 circuits and analyze the coverage achieved. Our results show that while circuits frequently achieve high condition and decision coverage (97.56% and 97.63%, on average), path coverage remains limited (71.84%), particularly in the presence of multi-controlled gates, which induce extreme path explosion and coverage imbalance. Moreover, to account for the probabilistic nature of quantum circuits, we introduce probabilistic coverage, which augments structural coverage with a confidence measure (88.87%, 88.65%, and 37.18% for condition, decision, and path coverage, respectively, on average). Finally, through mutation testing, we find weak or no correlation between fault detection and structural coverage, consistent with observations in classical computing.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes six coverage criteria for circuit-based quantum programs—condition, decision, and path coverage together with probabilistic variants—adapted from classical structural testing. It presents the QaCoCo tool to compute these metrics and reports results from an empirical study on 540 circuits: average condition and decision coverage reach 97.56% and 97.63%, path coverage is 71.84% (limited by multi-controlled gates), probabilistic coverage yields lower confidence values (88.87%, 88.65%, 37.18%), and mutation testing shows weak or no correlation between coverage and fault detection.

Significance. A large-scale empirical evaluation on 540 circuits together with an implemented tool constitutes a concrete contribution to quantum software testing. If the syntactic adaptations can be shown to relate to quantum correctness (via state evolution or measurement statistics), the criteria and the observed path-explosion phenomenon would offer practical guidance for test adequacy. The reported weak correlation with mutation-based fault detection is itself a useful negative result that aligns with classical findings and underscores the need for quantum-specific adequacy measures.

major comments (3)
  1. [§3] The formal definitions of condition, decision, and path coverage (and their probabilistic extensions) are not provided in sufficient detail. It is unclear how control dependencies or paths are extracted from the circuit DAG, how multi-controlled gates are enumerated as classical branches, and how the probabilistic confidence measure is computed from Born-rule probabilities or the density operator. These omissions are load-bearing because all reported averages and the central claim of 'quantum-tailored' criteria rest on the correctness of these definitions.
  2. [§5, experimental setup] The paper does not specify the circuit selection criteria, the exact mutation operators applied, or the precise method used to determine whether a mutant is killed. Without these, the averages (e.g., 97.56% condition coverage) and the 'weak or no correlation' conclusion cannot be reproduced or assessed for bias, undermining the empirical support for the six criteria.
  3. [§3–4] The adaptation presupposes a classical control-flow model on the circuit syntax, yet no explicit mapping is given from the coverage predicates to the quantum semantics (superposition, entanglement, or measurement outcomes). The introduction of probabilistic coverage is noted but lacks a derivation showing that the confidence values measure test adequacy for the actual unitary evolution rather than merely the syntactic structure; this gap directly affects the relevance of the reported percentages to quantum program correctness.
minor comments (2)
  1. [Abstract, §5] The abstract and §5 could more clearly separate the structural coverage percentages from the probabilistic confidence values to avoid conflating the two families of metrics.
  2. Table or figure captions reporting the 540-circuit averages should include the standard deviation or range to convey variability across circuit families.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback on our manuscript. The comments highlight important areas for improving clarity, reproducibility, and the linkage to quantum semantics. We address each major comment below and will make corresponding revisions to strengthen the paper.

read point-by-point responses
  1. Referee: [§3] §3: The formal definitions of condition, decision, and path coverage (and their probabilistic extensions) are not provided in sufficient detail. It is unclear how control dependencies or paths are extracted from the circuit DAG, how multi-controlled gates are enumerated as classical branches, and how the probabilistic confidence measure is computed from Born-rule probabilities or the density operator. These omissions are load-bearing because all reported averages and the central claim of 'quantum-tailored' criteria rest on the correctness of these definitions.

    Authors: We agree that §3 would benefit from greater formal detail to ensure the definitions are unambiguous and the results reproducible. In the revised manuscript, we will augment the section with: (1) an explicit algorithm (in pseudocode) for constructing the control-flow graph from the circuit DAG by traversing gates and recording control-qubit dependencies; (2) a precise enumeration rule for multi-controlled gates, treating each as a multi-way branch over the 2^k control combinations while using memoization to mitigate path explosion; and (3) the probabilistic confidence formula, defined as the minimum (or product) of the Born-rule probabilities of the exercised conditions/decisions/paths, obtained by simulating the circuit to obtain the final state vector or density operator and computing measurement probabilities. These additions will directly support the reported averages without changing the underlying approach. revision: yes
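The confidence formula the response sketches — the product (or minimum) of Born-rule probabilities of the exercised conditions — can be prototyped directly on a state vector. The helper names and little-endian qubit indexing below are our assumptions, not the authors' implementation:

```python
from math import prod

# Prototype of the confidence measure described in the response (our
# helper names; little-endian qubit indexing assumed): the Born-rule
# probability that each exercised control qubit reads |1>, aggregated
# by product or by minimum.

def prob_control_one(state, qubit):
    """P(qubit measures |1>) from a full state-vector amplitude list."""
    return sum(abs(a) ** 2 for i, a in enumerate(state) if (i >> qubit) & 1)

def confidence(state, control_qubits, aggregate=prod):
    return aggregate(prob_control_one(state, q) for q in control_qubits)

# Bell state (|00> + |11>)/sqrt(2): each qubit is |1> with p = 0.5, so
# product-confidence over two controls is 0.25 and min-confidence is 0.5.
bell = [2 ** -0.5, 0.0, 0.0, 2 ** -0.5]
```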

  2. Referee: [§5] §5 (experimental setup): The paper does not specify the circuit selection criteria, the exact mutation operators applied, or the precise method used to determine whether a mutant is killed. Without these, the averages (e.g., 97.56% condition coverage) and the 'weak or no correlation' conclusion cannot be reproduced or assessed for bias, undermining the empirical support for the six criteria.

    Authors: We acknowledge that the experimental setup in §5 lacks the necessary specificity for full reproducibility. We will expand this section to include: circuit selection criteria (540 circuits drawn from Qiskit’s algorithm library—Grover, QFT, QAOA, VQE variants—plus randomly generated circuits with 2–25 qubits and depths 5–60, stratified by size and gate types to ensure diversity); the five mutation operators (single-qubit gate substitution, control-qubit addition/removal, rotation-angle perturbation, two-qubit gate swap, and measurement-basis flip); and the mutant-killing criterion (a mutant is killed if the total-variation distance between its output probability distribution and the original exceeds 0.1, or if a Kolmogorov–Smirnov test on 1024-shot histograms rejects equality at p < 0.05). These details will allow independent verification of the coverage figures and the correlation analysis. revision: yes
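The mutant-killing rule quoted above — total-variation distance over shot histograms with a 0.1 threshold — is straightforward to state in code. This is a sketch of that criterion, not the authors' implementation:

```python
# Sketch of the mutant-killing rule described in the response: a mutant
# counts as killed when the total-variation distance (TVD) between its
# output histogram and the original's exceeds a threshold (0.1 here).

def total_variation_distance(counts_a, counts_b):
    """TVD between two shot-count histograms over bitstring outcomes."""
    shots_a, shots_b = sum(counts_a.values()), sum(counts_b.values())
    outcomes = set(counts_a) | set(counts_b)
    return 0.5 * sum(abs(counts_a.get(o, 0) / shots_a -
                         counts_b.get(o, 0) / shots_b) for o in outcomes)

def is_killed(original_counts, mutant_counts, threshold=0.1):
    return total_variation_distance(original_counts, mutant_counts) > threshold

# A mutant that shifts half the probability mass from "11" to "10"
# sits at TVD 0.5, well past the 0.1 threshold.
orig_counts = {"00": 512, "11": 512}
mutant_counts = {"00": 512, "10": 512}
```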

  3. Referee: [§3–4] §3–4: The adaptation presupposes a classical control-flow model on the circuit syntax, yet no explicit mapping is given from the coverage predicates to the quantum semantics (superposition, entanglement, or measurement outcomes). The introduction of probabilistic coverage is noted but lacks a derivation showing that the confidence values measure test adequacy for the actual unitary evolution rather than merely the syntactic structure; this gap directly affects the relevance of the reported percentages to quantum program correctness.

    Authors: We partially agree that an explicit semantic bridge strengthens the presentation. The criteria are intentionally syntactic—adapting classical structural coverage to the gate sequence and control dependencies that define execution order in circuit-based programs—yet the probabilistic variants are grounded in quantum mechanics: confidence is computed from Born-rule probabilities of the measurement outcomes that result from the unitary evolution. In the revision we will insert a short derivation in §4 showing that exercising a control condition or path corresponds to ensuring that the test suite samples subspaces whose amplitudes affect distinguishable final measurement statistics, thereby linking syntactic coverage to the observable behavior under superposition and entanglement. We do not claim these criteria constitute a complete semantic adequacy measure (the mutation results already indicate their limitations), but they supply practical, quantum-aware guidance. This addition addresses the relevance concern while preserving the paper’s core contribution. revision: partial
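The "short derivation" promised here can be stated compactly. The notation below is our reconstruction under stated assumptions (a projector onto the |1⟩ subspace of the control qubit, applied to the state just before the gate), not the paper's own formulation:

```latex
% Reconstruction (our notation, not the paper's): probability that
% control qubit c reads |1> when gate g is reached, with U_{<g} the
% unitary prefix of the circuit before g.
P(c{=}1 \mid \psi_0) = \bigl\lVert \Pi_c \, U_{<g}\, \lvert \psi_0 \rangle \bigr\rVert^2,
\qquad \Pi_c = I \otimes \lvert 1\rangle\langle 1\rvert_c \otimes I .
% Product-form confidence over the conditions (g, c) exercised on path p:
\mathrm{conf}(p) = \prod_{(g,c)\,\in\, p} P(c{=}1 \mid \psi_0).
```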

Circularity Check

0 steps flagged

No significant circularity; empirical tool-based evaluation is self-contained

full rationale

The paper defines six coverage criteria by direct adaptation from classical software testing, implements them in the QaCoCo tool, and reports measured averages over 540 circuits plus mutation-testing results. No equations, predictions, or first-principles derivations are present that could reduce to fitted parameters, self-definitions, or self-citation chains. All reported quantities (e.g., 97.56% condition coverage) are computed outputs from the tool on external circuit benchmarks, not inputs renamed as results. The analysis therefore contains no load-bearing circular steps and rests on independent empirical measurement.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The work rests primarily on the domain assumption that structural coverage metrics transfer usefully to quantum circuits; no free parameters or invented entities are described in the abstract.

axioms (1)
  • domain assumption: Coverage criteria developed for classical programs can be meaningfully adapted to the structure and probabilistic behavior of quantum circuits.
    The six proposed criteria and the QaCoCo tool are built directly on this premise.

pith-pipeline@v0.9.0 · 5506 in / 1445 out tokens · 85229 ms · 2026-05-07T13:10:57.524372+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

74 extracted references · 52 canonical work pages · 1 internal anchor

  1. [1]

    Gadi Aleksandrowicz, Thomas Alexander, Panagiotis Barkoutsos, Luciano Bello, Yael Ben-Haim, David Bucher, Francisco Jose Cabrera-Hernández, Jorge Carballo-Franquis, Adrian Chen, Chun-Fu Chen, Jerry M. Chow, Antonio D. Córcoles-Gonzales, Abigail J. Cross, Andrew Cross, Juan Cruz-Benito, Chris Culver, Salvador De La Puente González, Enrique De La Torre,...

  2. [2]

    Shaukat Ali, Paolo Arcaini, Xinyi Wang, and Tao Yue. 2021. Assessing the Effectiveness of Input and Output Coverage Criteria for Testing Quantum Programs. In 2021 14th IEEE Conference on Software Testing, Verification and Validation (ICST). 13–23. doi:10.1109/ICST49551.2021.00014

  3. [3]

    Juan Altmayer Pizzorno and Emery D. Berger. 2025. CoverUp: Effective High Coverage Test Generation for Python. Proc. ACM Softw. Eng. 2, FSE, Article FSE128 (June 2025), 23 pages. doi:10.1145/3729398

  4. [4]

    Boran Apak, Medina Bandic, Aritra Sarkar, and Sebastian Feld. 2024. KetGPT – Dataset Augmentation of Quantum Circuits Using Transformers. In Computational Science – ICCS 2024, Leonardo Franco, Clélia de Mulatier, Maciej Paszynski, Valeria V. Krzhizhanovskaya, Jack J. Dongarra, and Peter M. A. Sloot (Eds.). Springer Nature Switzerland, Cham, 235–251. do...

  5. [5]

    Adriano Barenco, André Berthiaume, David Deutsch, Artur Ekert, Richard Jozsa, and Chiara Macchiavello. 1997. Stabilization of Quantum Computations by Symmetrization. SIAM J. Comput. 26, 5 (1997), 1541–1557. doi:10.1137/S0097539796302452

  6. [6]

    Ned Batchelder. [n. d.]. coverage.py: Python code coverage tool. https://coverage.readthedocs.io/en/7.11.3/ Accessed: 2025-11-18

  7. [7]

    Hamoud S. Bin-Obaid and Theodore B. Trafalis. 2020. Fairness in Resource Allocation: Foundation and Applications. In Network Algorithms, Data Mining, and Applications, Ilya Bychkov, Valery A. Kalyagin, Panos M. Pardalos, and Oleg Prokopyev (Eds.). Springer International Publishing, Cham, 3–18. doi:10.1007/978-3-030-37157-9_1

  8. [8]

    Jean-Luc Brylinski and Ranee Brylinski. 2002. Universal quantum gates. In Mathematics of Quantum Computation. Chapman and Hall/CRC, 117–134.

  9. [9]

    Harry Buhrman, Richard Cleve, John Watrous, and Ronald de Wolf. 2001. Quantum Fingerprinting. Phys. Rev. Lett. 87 (Sep 2001), 167902. Issue 16. doi:10.1103/PhysRevLett.87.167902

  10. [10]

    José Campos, Rui Abreu, Gordon Fraser, and Marcelo d’Amorim. 2013. Entropy-based test generation for improved fault localization. In 2013 28th IEEE/ACM International Conference on Automated Software Engineering (ASE). 257–267. doi:10.1109/ASE.2013.6693085

  11. [11]

    José Campos, Andrea Arcuri, Gordon Fraser, and Rui Abreu. 2014. Continuous test generation: enhancing continuous integration with automated test generation. In Proceedings of the 29th ACM/IEEE International Conference on Automated Software Engineering (Vasteras, Sweden) (ASE ’14). Association for Computing Machinery, New York, NY, USA, 55–66. doi:10.1145/2...

  12. [12]

    José Campos, Yan Ge, Nasser Albunian, Gordon Fraser, Marcelo Eler, and Andrea Arcuri. 2018. An empirical evaluation of evolutionary algorithms for unit test suite generation. Information and Software Technology 104 (2018), 207–235. doi:10.1016/j.infsof.2018.08.010

  13. [13]

    José Campos and André Souto. 2021. QBugs: A Collection of Reproducible Bugs in Quantum Algorithms and a Supporting Infrastructure to Enable Controlled Quantum Software Testing and Debugging Experiments. In 2021 IEEE/ACM 2nd International Workshop on Quantum Software Engineering (Q-SE). 28–32. doi:10.1109/Q-SE52541.2021.00013

  14. [14]

    Kean Chen, Wang Fang, Ji Guan, Xin Hong, Mingyu Huang, Junyi Liu, Qisheng Wang, and Mingsheng Ying. 2022. VeriQBench: A Benchmark for Multiple Types of Quantum Circuits. arXiv:2206.10880 [quant-ph] https://arxiv.org/abs/2206.10880

  15. [15]

    Andrew W Cross, Lev S Bishop, John A Smolin, and Jay M Gambetta. 2017. Open Quantum Assembly Language. arXiv preprint arXiv:1707.03429 (2017). https://arxiv.org/abs/1707.03429

  16. [16]

    Felipe Ferreira and José Campos. 2025. An exploratory study on the usage of quantum programming languages. Science of Computer Programming 240 (2025), 103217. doi:10.1016/j.scico.2024.103217

  17. [17]

    Daniel Fortunato, José Campos, and Rui Abreu. 2022. Mutation Testing of Quantum Programs: A Case Study With Qiskit. IEEE Transactions on Quantum Engineering 3 (2022), 1–17. doi:10.1109/TQE.2022.3195061

  18. [18]

    Daniel Fortunato, José Campos, and Rui Abreu. 2024. Gate Branch Coverage: A Metric for Quantum Software Testing. In Proceedings of the 1st ACM International Workshop on Quantum Software Engineering: The Next Evolution (QSE-NE 2024). Association for Computing Machinery, New York, NY, USA, 15–18. doi:10.1145/3663531.3664753

  19. [19]

    Gordon Fraser and José Miguel Rojas. 2019. Software Testing. In Handbook of Software Engineering, Sungdeok Cha, Richard N. Taylor, and Kyochul Kang (Eds.). Springer International Publishing, Cham, 123–192. doi:10.1007/978-3-030-00262-6_4

  20. [20]

    Google. 2019. Cirq - An open source framework for programming quantum computers. https://quantumai.google/cirq. [Online; accessed March-2022]

  21. [21]

    Lov K. Grover. 1996. A fast quantum mechanical algorithm for database search. In Proceedings of the Twenty-Eighth Annual ACM Symposium on Theory of Computing (Philadelphia, Pennsylvania, USA) (STOC ’96). Association for Computing Machinery, New York, NY, USA, 212–219. doi:10.1145/237814.237866

  22. [22]

    Marc R. Hoffmann, Evgeny Mandrikov, and Alexandre Godin. [n. d.]. JaCoCo Java Code Coverage Library. https://www.jacoco.org/jacoco/ Accessed: 2025-11-18

  23. [23]

    Patrick Hopf, Nils Quetschlich, Laura Schulz, and Robert Wille. 2025. Improving Figures of Merit for Quantum Circuit Compilation. In 2025 Design, Automation & Test in Europe Conference (DATE). 1–7. doi:10.23919/DATE64628.2025.10992761

  24. [24]

    Laura Inozemtseva and Reid Holmes. 2014. Coverage is not strongly correlated with test suite effectiveness. In Proceedings of the 36th International Conference on Software Engineering (Hyderabad, India) (ICSE 2014). Association for Computing Machinery, New York, NY, USA, 435–445. doi:10.1145/2568225.2568271

  25. [25]

    R. Jain, D. Chiu, and W. Hawe. 1984. A Quantitative Measure Of Fairness And Discrimination For Resource Allocation In Shared Computer Systems. doi:10.48550/arXiv.cs/9809099 arXiv:cs/9809099

  26. [26]

    Yue Jia and Mark Harman. 2011. An Analysis and Survey of the Development of Mutation Testing. IEEE Transactions on Software Engineering 37, 5 (2011), 649–678. doi:10.1109/TSE.2010.62

  27. [27]

    Tiancheng Jin, Shangzhou Xia, and Jianjun Zhao. 2025. NovaQ: Improving Quantum Program Testing through Diversity-Guided Test Case Generation. arXiv:2509.04763 [cs.SE] https://arxiv.org/abs/2509.04763

  28. [28]

    Sabre Kais. 2014. Introduction to Quantum Information and Computation for Chemistry. In Quantum Information and Computation for Chemistry. John Wiley & Sons, Ltd, 1–38. doi:10.1002/9781118742631.ch01

  29. [29]

    Min-Sung Kang, Jino Heo, Seong-Gon Choi, Sung Moon, and Sang-Wook Han. 2019. Implementation of SWAP test for two unknown states in photons via cross-Kerr nonlinearities under decoherence effect. Scientific Reports 9, 1 (April 2019), 6167. doi:10.1038/s41598-019-42662-4

  31. [31]

    Rafaqut Kazmi, Dayang N. A. Jawawi, Radziah Mohamad, and Imran Ghani. 2017. Effective Regression Test Case Selection: A Systematic Literature Review. ACM Comput. Surv. 50, 2, Article 29 (May 2017), 32 pages. doi:10.1145/3057269

  32. [32]

    Ajay Kumar. 2023. Formalization of structural test cases coverage criteria for quantum software testing. International Journal of Theoretical Physics 62, 3 (2023), 49.

  33. [33]

    Ang Li, Samuel Stein, Sriram Krishnamoorthy, and James Ang. 2023. QASMBench: A Low-Level Quantum Benchmark Suite for NISQ Evaluation and Simulation. ACM Transactions on Quantum Computing 4, 2 (Feb. 2023), 10:1–10:26. doi:10.1145/3550488

  34. [34]

    Peiyi Li, Ji Liu, Yangjia Li, and Huiyang Zhou. 2022. Exploiting Quantum Assertions for Error Mitigation and Quantum Program Debugging. In 2022 IEEE 40th International Conference on Computer Design (ICCD). 124–131. doi:10.1109/ICCD56317.2022.00028

  35. [35]

    Yuechen Li, Minqi Shao, Jianjun Zhao, and Qichen Wang. 2026. A Methodological Analysis of Empirical Studies in Quantum Software Testing. arXiv:2601.08367 [quant-ph] https://arxiv.org/abs/2601.08367

  36. [36]

    Ji Liu and Huiyang Zhou. 2021. Systematic Approaches for Precise and Approximate Quantum State Runtime Assertion. In 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA). 179–193. doi:10.1109/HPCA51647.2021.00025

  37. [37]

    Ning Ma, Jianjun Zhao, Foutse Khomh, Shaukat Ali, and Heng Li. 2025. QMon: Monitoring the Execution of Quantum Circuits with Mid-Circuit Measurement and Reset. doi:10.48550/arXiv.2512.13422 arXiv:2512.13422 [cs]

  38. [38]

    T.J. McCabe. 1976. A Complexity Measure. IEEE Transactions on Software Engineering SE-2, 4 (1976), 308–320. doi:10.1109/TSE.1976.233837

  39. [39]

    Eñaut Mendiluze, Shaukat Ali, Paolo Arcaini, and Tao Yue. 2021. Muskit: A Mutation Analysis Tool for Quantum Software Testing. In 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE). 1266–1270. doi:10.1109/ASE51524.2021.9678563

  40. [40]

    Eñaut Mendiluze Usandizaga, Shaukat Ali, Tao Yue, and Paolo Arcaini. 2025. Quantum circuit mutants: Empirical analysis and recommendations. Empirical Software Engineering 30, 4 (April 2025), 35 pages. doi:10.1007/s10664-025-10643-z

  41. [41]

    Microsoft. 2021. Azure Quantum documentation. https://docs.microsoft.com/en-us/azure/quantum/?view=qsharp-preview. [Online; accessed December-2021]

  42. [42]

    Andriy Miranskyy, José Campos, Anila Mjeda, Lei Zhang, and Ignacio García Rodríguez de Guzmán. 2025. On the Feasibility of Quantum Unit Testing. arXiv:2507.17235 [cs.SE] https://arxiv.org/abs/2507.17235

  43. [43]

    Glenford J Myers, Corey Sandler, and Tom Badgett. 2011. The Art of Software Testing. John Wiley & Sons.

  44. [44]

    Krzysztof Nowicki, Aleksander Malinowski, and Marcin Sikorski. 2016. More Just Measure of Fairness for Sharing Network Resources. In Computer Networks, Piotr Gaj, Andrzej Kwiecień, and Piotr Stera (Eds.). Springer International Publishing, Cham, 52–58. doi:10.1007/978-3-319-39207-3_5

  45. [45]

    David Paterson, Jose Campos, Rui Abreu, Gregory M. Kapfhammer, Gordon Fraser, and Phil McMinn. 2019. An Empirical Study on the Use of Defect Prediction for Test Case Prioritization. In 2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST). 346–357. doi:10.1109/ICST.2019.00041

  46. [46]

    Rui Pereira, Marco Couto, Francisco Ribeiro, Rui Rua, Jácome Cunha, João Paulo Fernandes, and João Saraiva. 2021. Ranking programming languages by energy efficiency. Science of Computer Programming 205 (2021), 102609. doi:10.1016/j.scico.2021.102609

  47. [47]

    Goran Petrović, Marko Ivanković, Gordon Fraser, and René Just. 2022. Practical Mutation Testing at Scale: A View from Google. IEEE Transactions on Software Engineering 48, 10 (Oct. 2022), 3900–3912. doi:10.1109/TSE.2021.3107634

  48. [48]

    John Preskill. 2018. Quantum Computing in the NISQ era and beyond. Quantum 2 (Aug. 2018), 79. doi:10.22331/q-2018-08-06-79

  49. [49]

    QASMBench. 2026. QASMBench issues #9. https://github.com/pnnl/QASMBench/issues/9

  50. [50]

    Qiskit. 2026. Qiskit Transpiler homepage. https://quantum.cloud.ibm.com/docs/en/api/qiskit/2.1/transpiler

  51. [51]

    Qiskit. 2026. Set of quantum hardware primitive gates. https://github.com/Qiskit/qiskit/blob/main/qiskit/qasm/libs/qelib1.inc#L4-L17

  52. [52]

    Nils Quetschlich, Lukas Burgholzer, and Robert Wille. 2023. Compiler Optimization for Quantum Computing Using Reinforcement Learning. In 2023 60th ACM/IEEE Design Automation Conference (DAC). 1–6. doi:10.1109/DAC56929.2023.10248002

  53. [53]

    Nils Quetschlich, Lukas Burgholzer, and Robert Wille. 2023. MQT Bench: Benchmarking Software and Design Automation Tools for Quantum Computing. Quantum 7 (July 2023), 1062. doi:10.22331/q-2023-07-20-1062

  54. [54]

    José Miguel Rojas, José Campos, Mattia Vivanti, Gordon Fraser, and Andrea Arcuri. 2015. Combining Multiple Coverage Criteria in Search-Based Unit Test Generation. In Search-Based Software Engineering, Márcio Barros and Yvan Labiche (Eds.). Springer International Publishing, Cham, 93–108.

  55. [55]

    Raul Santelices, Pavan Kumar Chittimalli, Taweesup Apiwattanapong, Alessandro Orso, and Mary Jean Harrold. 2008. Test-Suite Augmentation for Evolving Software. In 2008 23rd IEEE/ACM International Conference on Automated Software Engineering. 218–227. doi:10.1109/ASE.2008.32

  56. [56]

    Muhammad Shahid and Suhaimi Ibrahim. 2011. An evaluation of test coverage tools in software testing. In 2011 International Conference on Telecommunication Technology and Applications, Proc. of CSIT, Vol. 5.

  57. [57]

    Muhammad Shahid, Suhaimi Ibrahim, and Mohd Naz’ri Mahrin. 2011. A study on test coverage in software testing. Advanced Informatics School (AIS), Universiti Teknologi Malaysia, International Campus, Jalan Semarak, Kuala Lumpur, Malaysia 1 (2011).

  58. [58]

    Minqi Shao and Jianjun Zhao. 2026. Assessing Superposition-Targeted Coverage Criteria for Quantum Neural Networks. doi:10.48550/arXiv.2411.02450 arXiv:2411.02450 [quant-ph]

  59. [59]

    Peter W. Shor. 1999. Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer. SIAM Rev. 41, 2 (1999), 303–332. doi:10.1137/S0036144598347011

  60. [60]

    Praveen Ranjan Srivastava. 2008. Test case prioritization. Journal of Theoretical & Applied Information Technology 4, 3 (2008).

  61. [61]

    Andrew Steane. 1998. Quantum computing. Reports on Progress in Physics 61, 2 (Feb. 1998), 117. doi:10.1088/0034-4885/61/2/002

  62. [62]

    Wei Tang, Teague Tomesh, Martin Suchara, Jeffrey Larson, and Margaret Martonosi. 2021. CutQC: using small Quantum computers for large Quantum circuit evaluations. In Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (Virtual, USA) (ASPLOS ’21). Association for Computing Machinery, Ne...

  63. [63]

    Veri-Q. 2026. Veri-Q issues #5. https://github.com/Veri-Q/Benchmark/issues/5

  64. [64]

    Veri-Q. 2026. Veri-Q issues #6. https://github.com/Veri-Q/Benchmark/issues/6

  65. [65]

    Xinyi Wang, Paolo Arcaini, Tao Yue, and Shaukat Ali. 2021. Quito: a Coverage-Guided Test Generator for Quantum Programs. In 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE). 1237–1241. doi:10.1109/ASE51524.2021.9678798

  66. [66]

    Xinyi Wang, Tongxuan Yu, Paolo Arcaini, Tao Yue, and Shaukat Ali. 2022. Mutation-based test generation for quantum programs with multi-objective search. In Proceedings of the Genetic and Evolutionary Computation Conference (Boston, Massachusetts) (GECCO ’22). Association for Computing Machinery, New York, NY, USA, 1345–1353. doi:10.1145/3512290.3528869

  67. [67]

    C. Wohlin, P. Runeson, M. Höst, M.C. Ohlsson, B. Regnell, and A. Wesslén. 2012. Experimentation in Software Engineering. Springer Berlin Heidelberg.

  68. [68]

    Shangzhou Xia, Jianjun Zhao, Fuyuan Zhang, and Xiaoyu Guo. 2025. Quantum Concolic Testing. Proc. ACM Softw. Eng. 2, ISSTA (June 2025), ISSTA051:1146–ISSTA051:1166. doi:10.1145/3728926

  69. [69]

    Zhenyu Yang, James Z. Fan, Andrew H. Proppe, F. Pelayo García de Arquer, David Rossouw, Oleksandr Voznyy, Xinzheng Lan, Min Liu, Grant Walters, Rafael Quintero-Bermudez, Bin Sun, Sjoerd Hoogland, Gianluigi A. Botton, Shana O. Kelley, and Edward H. Sargent. 2017. Mixed-quantum-dot solar cells. Nature Communications 8, 1 (Nov. 2017), 1325. doi:10.1038/s4146...

  70. [70]

    Noson S. Yanofsky and Mirco A. Mannucci. 2008. Quantum Computing for Computer Scientists. Cambridge University Press.

  71. [71]

    Jiaming Ye, Shangzhou Xia, Fuyuan Zhang, Paolo Arcaini, Lei Ma, Jianjun Zhao, and Fuyuki Ishikawa. 2023. QuraTest: Integrating Quantum Specific Features in Quantum Program Testing. In 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE). 1149–1161. doi:10.1109/ASE56229.2023.00196

  72. [72]

    R.K. Yin. 2009. Case Study Research: Design and Methods. SAGE Publications.

  73. [73]

    Pengzhan Zhao, Zhongtao Miao, Shuhan Lan, and Jianjun Zhao. 2023. Bugs4Q: A benchmark of existing bugs to enable controlled testing and debugging studies for quantum programs. Journal of Systems and Software 205 (2023), 111805. doi:10.1016/j.jss.2023.111805

  74. [74]

    Hong Zhu, Patrick A. V. Hall, and John H. R. May. 1997. Software unit test coverage and adequacy. ACM Comput. Surv. 29, 4 (Dec. 1997), 366–427. doi:10.1145/267580.267590