arxiv: 2604.08367 · v2 · submitted 2026-04-09 · 🪐 quant-ph

Recognition: 2 theorem links

· Lean Theorem

Per-Shot Evaluation of QAOA on Max-Cut: A Black-Box Implementation Comparison with Goemans-Williamson

Evgenii Dolzhkov , Franz G. Fuchs , Dirk Oliver Theis

Authors on Pith no claims yet

Pith reviewed 2026-05-10 18:03 UTC · model grok-4.3

classification 🪐 quant-ph

keywords QAOAMax-CutGoemans-Williamsonblack-box optimizationquantum computingcombinatorial optimizationper-shot evaluation

0 comments

The pith

A per-shot statistical framework allows probabilistic comparisons of black-box QAOA to the Goemans-Williamson algorithm on Max-Cut.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper conducts an empirical study of QAOA applied to the Max-Cut problem, using the Goemans-Williamson algorithm as the classical reference point. It deliberately evaluates QAOA implementations in their default, untuned state to mirror how typical users would apply them without expert intervention. Graphs for testing come from established generation models that reflect real structures instead of contrived cases. The core analytical tool is a per-shot framework that records how solution quality evolves with each additional circuit execution. This setup supports probabilistic judgments about when QAOA starts to outperform the classical method's expected value or lower bound.

Core claim

The authors establish that a per-shot statistical framework can track the quality of QAOA solutions as a function of the number of circuit executions. This framework supports probabilistic comparisons with the Goemans-Williamson algorithm by checking how often and after how many shots QAOA exceeds the classical expectation and lower bound on graphs from standard generation models. The evaluation uses black-box QAOA with default parameters to reflect realistic deployment.

What carries the argument

The per-shot statistical framework, which tracks QAOA output quality versus the number of circuit executions to enable probabilistic comparisons with classical baselines.

If this is right

QAOA can be compared to classical solvers under conditions that match non-expert usage.
Performance assessments become probabilistic, indicating the likelihood of surpassing GW baselines.
Results highlight current limitations of default QAOA for practical Max-Cut solving.
The framework can be used to evaluate other quantum optimization approaches in similar settings.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Applying this per-shot method to other problems like TSP or graph coloring could reveal similar patterns in quantum-classical trade-offs.
Hardware with faster circuit execution might make QAOA competitive sooner if the framework is adopted for decision-making.
Future QAOA variants could incorporate automatic parameter adjustment to improve default performance without user tuning.

Load-bearing premise

QAOA is treated as a black-box using only default parameter settings with no manual fine-tuning or instance-specific optimization.

What would settle it

Demonstrating that default-parameter QAOA exceeds the Goemans-Williamson lower bound in a high percentage of shots on the benchmark graphs after few executions would contradict the reported limitations.

read the original abstract

The Quantum Approximate Optimization Algorithm (QAOA) has emerged as a promising approach for addressing combinatorial optimization problems on near-term quantum hardware. In this work, we conduct an empirical evaluation of QAOA on the Max-Cut problem, using the Goemans-Williamson (GW) algorithm as a classical baseline for comparison. Unlike many prior studies, our methodology treats QAOA implementations as black-box optimizers, relying solely on default parameter settings without manual fine-tuning. We evaluate specific off-the-shelf QAOA implementations under default settings, not the algorithmic potential of QAOA with optimized parameters. This reflects a more realistic use case for end users who may lack the resources or expertise for instance-specific optimization. To facilitate fair and informative evaluation, we construct benchmark instances using well-known graph generation models that emulate practical graph structures, avoiding synthetic constructions tailored to either quantum or classical algorithms. A central component of our analysis is a per-shot statistical framework, which tracks the quality of QAOA outputs as a function of the number of circuit executions. This enables probabilistic comparisons with the GW algorithm by examining when and how frequently QAOA surpasses classical performance baselines such as the GW expectation and lower bound. Our results provide insight into the practical applicability of QAOA for Max-Cut and highlight its current limitations, offering a framework that can guide the assessment and development of future QAOA implementations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The per-shot black-box QAOA vs GW comparison on practical Max-Cut graphs is a useful empirical step, though unspecified defaults limit its impact.

read the letter

The main thing here is a per-shot statistical framework for comparing default QAOA runs on Max-Cut against the Goemans-Williamson baseline. They generate graphs from standard models that aim to look like real structures rather than hand-crafted test cases, then track how often QAOA outputs beat the classical expectation or lower bound as the number of shots grows. This setup lets them make probabilistic statements about practical performance without assuming anyone will tune parameters per instance. That focus on black-box, off-the-shelf use is the clearest novelty relative to earlier QAOA papers that usually optimize or use synthetic graphs. They get credit for keeping the evaluation honest about what an end user without tuning resources would actually see. The methodology outline is coherent and avoids circular claims or invented entities, grounding everything in external classical results and graph generators. The soft spot is exactly the one the stress-test flags: the paper never lists the concrete defaults. No values appear for layer depth p, the classical optimizer, initial parameter distribution, or convergence rules. QAOA cut-value distributions are known to shift with those choices, so the reported frequencies of surpassing GW thresholds could be tied to whatever specific implementation they picked. Without that information, the claim that the results reflect a realistic use case is hard to assess or reproduce. This is a moderate rather than minor issue because it sits at the center of their argument. The work is aimed at quantum optimization researchers who care about near-term benchmarking rather than theoretical bounds. A reader interested in practical evaluation frameworks would get value from the per-shot idea and graph construction approach. It deserves peer review because the core empirical question is worth referee time to push for full parameter disclosure and perhaps additional checks on robustness across a few more defaults or graph families.

Referee Report

1 major / 1 minor

Summary. The manuscript presents an empirical per-shot statistical evaluation of QAOA applied to Max-Cut, treating specific off-the-shelf implementations as black-box optimizers under default (untuned) parameter settings. It compares the frequency with which QAOA shot outputs exceed the Goemans-Williamson expectation and lower bound, using graphs generated from standard models, and introduces a framework that tracks performance as a function of circuit executions.

Significance. If the experimental protocol is fully specified, the per-shot framework and black-box evaluation strategy would provide a useful, realistic benchmark for assessing QAOA applicability on near-term hardware without expert tuning; the explicit use of well-known graph generators and probabilistic comparison to a classical baseline are strengths that could guide practical development.

major comments (1)

[Methodology / Experimental Setup] The central claim that the per-shot framework enables reproducible probabilistic comparisons under a 'realistic use case' for black-box QAOA rests on the use of fixed, default settings. However, the manuscript provides no explicit values or references for QAOA depth p, the classical optimizer, initial-parameter distribution, or convergence criteria (see abstract and the methodology description of 'specific off-the-shelf QAOA implementations under default settings'). Because QAOA output distributions are known to be sensitive to these choices, the reported frequencies of surpassing GW thresholds cannot be verified or generalized without this information.

minor comments (1)

[Abstract] The abstract refers to 'well-known graph generation models' without naming them (e.g., Erdős–Rényi, Barabási–Albert, or others); explicit citation or description in the main text would improve reproducibility.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. The major comment concerns the lack of explicit details on default QAOA parameters, which we address below by committing to revisions that enhance reproducibility while preserving the black-box evaluation focus.

read point-by-point responses

Referee: The central claim that the per-shot framework enables reproducible probabilistic comparisons under a 'realistic use case' for black-box QAOA rests on the use of fixed, default settings. However, the manuscript provides no explicit values or references for QAOA depth p, the classical optimizer, initial-parameter distribution, or convergence criteria (see abstract and the methodology description of 'specific off-the-shelf QAOA implementations under default settings'). Because QAOA output distributions are known to be sensitive to these choices, the reported frequencies of surpassing GW thresholds cannot be verified or generalized without this information.

Authors: We agree that the current manuscript does not provide explicit numerical values or library references for parameters such as QAOA depth p, the classical optimizer (e.g., its name and settings), initial parameter distribution, or convergence criteria. Our methodology intentionally evaluates specific off-the-shelf implementations in their out-of-the-box default configurations to reflect realistic end-user scenarios without expert tuning. However, this does limit independent verification. In the revised version, we will add a dedicated subsection (and table) that fully specifies the exact defaults used, including the software library/version, p value, optimizer details, initialization method, and stopping criteria, with appropriate citations. This addition will enable reproducibility of the reported per-shot frequencies without changing the black-box, untuned nature of the study or the core claims. revision: yes

Circularity Check

0 steps flagged

No circularity: purely empirical black-box comparison

full rationale

The paper conducts an empirical evaluation of off-the-shelf QAOA implementations on Max-Cut graphs, comparing per-shot output quality against the Goemans-Williamson algorithm using external graph generators and classical baselines. No derivations, equations, fitted parameters, or predictions are present that could reduce to inputs by construction. The per-shot statistical framework is a direct counting procedure on observed cut values; it does not rely on self-citations, ansatzes, or uniqueness theorems. All performance claims are grounded in reproducible runs against independently implemented GW and standard graph models, satisfying the criteria for a self-contained, non-circular empirical study.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on two domain assumptions about realism of default settings and graph models; no free parameters or invented entities are introduced.

axioms (2)

domain assumption Default QAOA parameter settings without fine-tuning represent realistic end-user scenarios.
Explicitly stated as the basis for treating implementations as black-box optimizers.
domain assumption Well-known graph generation models emulate practical graph structures.
Used to construct benchmark instances that avoid synthetic constructions tailored to either quantum or classical algorithms.

pith-pipeline@v0.9.0 · 5556 in / 1424 out tokens · 72884 ms · 2026-05-10T18:03:37.276440+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

A central component of our analysis is a per-shot statistical framework, which tracks the quality of QAOA outputs as a function of the number of circuit executions... P100−ϵ(s) = (100−ϵ)-th percentile of {C(s)max across all R runs}
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

our methodology treats QAOA implementations as black-box optimizers, relying solely on default parameter settings without manual fine-tuning

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

26 extracted references · 25 canonical work pages · 3 internal anchors

[1]

and Williamson, David P

Goemans, M.X., Williamson, D.P.: Improved approximation algorithms for maximum cut and satisﬁability problems using semideﬁnite pro- gramming. Journal of the ACM (JACM) 42(6), 1115–1145 (1995) https://doi.org/10.1145/227683.227684

work page doi:10.1145/227683.227684 1995
[2]

SIAM Journal on Discrete Mathematics 15(1), 58–72 (2001) https://doi.org/10.1137/S089548010037971 3

Alon, N., Sudakov, B., Zwick, U.: Constructing worst case instanc es for semidef- inite programming based approximation algorithms. SIAM Journal on Discrete Mathematics 15(1), 58–72 (2001) https://doi.org/10.1137/S089548010037971 3

work page doi:10.1137/s089548010037971 2001
[3]

In: Goldwasser, M.H., Johnson, D.S., McGeoch, C.C

Johnson, D.S.: A theoretician’s guide to the experimental analysis of algorithms. In: Goldwasser, M.H., Johnson, D.S., McGeoch, C.C. (eds.) Data Stru ctures, Near Neighbor Searches, and Methodology: Fifth and Sixth DIMACS Imple mentation Challenges. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, vol. 59, pp. 215–250. American Math...

work page doi:10.1090/dimacs/059/11 2002
[4]

arXiv preprint arXiv:2206.0920 4 (2022) https://doi.org/10.48550/arXiv.2206.09204

Hsieh, J.-T., Kothari, P.K.: Approximating max-cut on bounded deg ree graphs: Tighter analysis of the fkl algorithm. arXiv preprint arXiv:2206.0920 4 (2022) https://doi.org/10.48550/arXiv.2206.09204

work page doi:10.48550/arxiv.2206.09204 2022
[5]

INFORMS Journal on Co mputing 30(3), 608–624 (2018) https://doi.org/10.1287/ijoc.2017.0798

Dunning, I., Gupta, S., Silberholz, J.: What works best when? a sys tematic eval- uation of heuristics for max-cut and qubo. INFORMS Journal on Co mputing 30(3), 608–624 (2018) https://doi.org/10.1287/ijoc.2017.0798

work page doi:10.1287/ijoc.2017.0798 2018
[6]

Quantum Infor mation Processing 20(9) (2021) https://doi.org/10.1007/s11128-021-03232-8

Herrman, R., Treﬀert, L., Ostrowski, J., Lotshaw, P., Humble, T., Siopsis, G.: Impact of graph structures for qaoa on maxcut. Quantum Infor mation Processing 20(9) (2021) https://doi.org/10.1007/s11128-021-03232-8

work page doi:10.1007/s11128-021-03232-8 2021
[7]

A Quantum Approximate Optimization Algorithm

Farhi, E., Goldstone, J., Gutmann, S.: A quantum approximate optimization algorithm. arXiv preprint arXiv:1411.4028 (2014) https://doi.org/10.48550/arXiv.1411.4028

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1411.4028 2014
[8]

Performance of the quantum approximate optimization algorithm on the maximum cut problem,

Crooks, G.E.: Performance of the quantum approximate optimiza tion algo- rithm on the maximum cut problem. arXiv preprint arXiv:1811.08419 (2 018) https://doi.org/10.48550/arXiv.1811.08419

work page doi:10.48550/arxiv.1811.08419
[9]

Farhi and A

Farhi, E., Harrow, A.W.: Quantum supremacy through the quantu m approximate optimization algorithm. arXiv preprint arXiv:1602.07674 (2016) https://doi.org/10.48550/arXiv.1602.07674

work page doi:10.48550/arxiv.1602.07674 2016
[10]

Columb ia University, New York, NY (2018)

Hadﬁeld, S.: Quantum Algorithms for Scientiﬁc Computing. Columb ia University, New York, NY (2018). https://doi.org/10.48550/arXiv.1805.03265

work page doi:10.48550/arxiv.1805.03265 2018
[11]

Evidence of Scaling Advantage for the Quantum Approximate Optimization Algorithm on a Classically Intractable Problem

Shaydulin, R., Li, C., Chakrabarti, S., DeCross, M., Herman, D., Ku mar, 17 N., Larson, J., Lykov, D., Minssen, P., Sun, Y., et al. : Evidence of scal- ing advantage for the quantum approximate optimization algorithm o n a classically intractable problem. Science Advances 10(22), 6761 (2024) https://doi.org/10.1126/sciadv.adm6761

work page doi:10.1126/sciadv.adm6761 2024
[12]

Towards a universal qaoa protocol: Evidence of a scaling advantage in solving some combinatorial optimization problems,

Montanez-Barrera, J.A., Michielsen, K.: Towards a universal QA OA protocol: Evidence of a scaling advantage in solving some combinatorial optimiza tion problems (2024). https://doi.org/10.48550/arXiv.2405.09169

work page doi:10.48550/arxiv.2405.09169 2024
[13]

Physical Review A 103(4), 042612 (2021) https://doi.org/10.1103/PhysRevA.103.042612

Wurtz, J., Love, P.: Maxcut quantum approximate optimization a lgorithm per- formance guarantees for p > 1. Physical Review A 103(4), 042612 (2021) https://doi.org/10.1103/PhysRevA.103.042612

work page doi:10.1103/physreva.103.042612 2021
[14]

Scientiﬁc reports 9(1), 6903 (2019) https://doi.org/10.1038/s41598-019-43176-9

Guerreschi, G.G., Matsuura, A.Y.: Qaoa for max-cut requires hu ndreds of qubits for quantum speed-up. Scientiﬁc reports 9(1), 6903 (2019) https://doi.org/10.1038/s41598-019-43176-9

work page doi:10.1038/s41598-019-43176-9 2019
[15]

Presentstatusand future challenges of non-interferometric tests of collapse models.Nature Phys., 18(3):243–250, 2022

Harrigan, M.P., Sung, K.J., Neeley, M., Satzinger, K.J., Arute, F., A rya, K., Atalaya, J., Bardin, J.C., Barends, R., Boixo, S., et al. : Quantum approxi- mate optimization of non-planar graph problems on a planar superco nducting processor. Nature Physics 17(3), 332–336 (2021) https://doi.org/10.1038/s41567- 020-01105-y

work page doi:10.1038/s41567- 2021
[16]

arXiv preprint arXiv:2407.06421 (2024) https://doi.org/10.48550/arXiv.2407.06421

Perez-Ramirez, D.F.: Variational quantum algorithms for com- binatorial optimization. arXiv preprint arXiv:2407.06421 (2024) https://doi.org/10.48550/arXiv.2407.06421

work page doi:10.48550/arxiv.2407.06421 2024
[17]

Nakanishi, Kosuke Mitarai, Ryosuke Imai, Shiro Tamiya, Takahiro Yamamoto, Tennin Yan, Toru Kawakubo, Yuya O

Marwaha, K.: Local classical max-cut algorithm outperforms p = 2 qaoa on high- girth regular graphs. Quantum 5, 437 (2021) https://doi.org/10.22331/q-2021- 04-20-437

work page doi:10.22331/q-2021- 2021
[19]

Accelerating MPI allreduce communication with efficient gpu-based, compression schemes on modern GPU clusters,

Keller, C.M., Eidenbenz, S., B¨ artschi, A., O’Malley, D., Golden, J., Mis ra, S.: Hierarchical multigrid ansatz for variational quantum algorithms. I n: ISC High Performance 2024 Research Paper Proceedings (39th Internat ional Conference), pp. 1–11 (2024). https://doi.org/10.23919/ISC.2024.10528934

work page doi:10.23919/isc.2024.10528934 2024
[20]

arXiv pre print arXiv:2306.00494 (2023) https://doi.org/10.48550/arXiv.2306.004 94

Ponce, M., Herrman, R., Lotshaw, P.C., Powers, S., Siopsis, G., Hu mble, T., Ostrowski, J.: Graph decomposition techniques for solving combin atorial optimization problems with variational quantum algorithms. arXiv pre print arXiv:2306.00494 (2023) https://doi.org/10.48550/arXiv.2306.004 94

work page doi:10.48550/arxiv.2306.004 2023
[21]

Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks

Bucher, D., Kraus, N., Blenninger, J., Lachner, M., Stein, J., Linn hoﬀ-Popien, C.: Towards robust benchmarking of quantum optimization algorithm s. arXiv 18 preprint arXiv:2405.07624 (2024) https://doi.org/10.48550/arXiv .2405.07624

work page internal anchor Pith review doi:10.48550/arxiv 2024
[22]

Benchmarking the quantum approximate optimization algorithm,

Willsch, M., Willsch, D., Jin, F., De Raedt, H., Michielsen, K.: Benchmark ing the quantum approximate optimization algorithm. Quantum Information Processing 19(7), 197 (2020) https://doi.org/10.1007/s11128-020-02692-8

work page doi:10.1007/s11128-020-02692-8 2020
[23]

arXiv preprint arXiv:2410.1562 6 (2024) https://doi.org/10.48550/arXiv.2410.15626

Patwardhan, I., Akkapelli, A.: Hybrid quantum-hpc solutions for max-cut: Bridg- ing classical and quantum algorithms. arXiv preprint arXiv:2410.1562 6 (2024) https://doi.org/10.48550/arXiv.2410.15626

work page doi:10.48550/arxiv.2410.15626 2024
[24]

Quantum Computing in the NISQ era and beyond.Quantum, 2:79, August 2018

Preskill, J.: Quantum computing in the nisq era and beyond. Quant um 2, 79 (2018) https://doi.org/10.22331/q-2018-08-06-79

work page internal anchor Pith review doi:10.22331/q-2018-08-06-79 2018
[25]

In: 2019 Tenth International Gr een and Sustainable Computing Conference (IGSC), pp

Shaydulin, R., Alexeev, Y.: Evaluating quantum approximate opti- mization algorithm: A case study. In: 2019 Tenth International Gr een and Sustainable Computing Conference (IGSC), pp. 1–6 (2019). https://doi.org/10.1109/IGSC48788.2019.8957201

work page doi:10.1109/igsc48788.2019.8957201 2019
[26]

Khot, S., Kindler, G., Mossel, E., O’Donnell, R.: Optimal inapproximab ility results for max-cut and other 2-variable csps? SIAM Journal on C omputing 37(1), 319–357 (2007) https://doi.org/10.1137/S0097539705447372

work page doi:10.1137/s0097539705447372 2007
[27]

Version 2.0.0

Fuchs, F.G.: QAOA: A Modular Python Library for the Quantum Approximate Optimization Algorithm. Version 2.0.0. https://github.com/OpenQuantumComputing/QAOA 19