arxiv: 2604.21908 · v1 · submitted 2026-04-23 · 🪐 quant-ph

Recognition: unknown

Efficient Classical Simulation of Heuristic Peaked Quantum Circuits

David Kremer , Nicolas Dupuis

Authors on Pith no claims yet

Pith reviewed 2026-05-09 21:43 UTC · model grok-4.3

classification 🪐 quant-ph

keywords peaked quantum circuitsclassical simulationtensor network contractionmatrix product operatorquantum advantageheuristic circuitsunswapping

0 comments

The pith

Peaked quantum circuits with hidden bitstrings can be classically simulated efficiently by exploiting their mirrored structure.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that heuristic peaked quantum circuits, claimed to demonstrate quantum advantage on 56-qubit hardware, can be simulated classically using tensor networks. The method performs full contraction by iteratively cancelling mirrored halves of the circuit into a Matrix Product Operator. It handles the obfuscated permutation through a greedy unswapping process that keeps bond dimensions low. This allows near-exact extraction of the peak bitstring in time comparable to the quantum execution, challenging the intractability of these instances.

Core claim

These peaked circuits concentrate their output on a single hidden bitstring by training a shallow simulable circuit variationally and inserting an obfuscated permutation. A classical method efficiently performs a full tensor network contraction by exploiting the mirrored structure and iteratively cancelling both halves into a Matrix Product Operator, while avoiding the permutation via greedy unswapping to keep bond dimensions low. This enables near-exact sampling and extraction of the peaked bitstring, with the largest circuit fully contracted in approximately one hour on a single GPU.

What carries the argument

Iterative MPO cancellation of mirrored circuit halves with greedy unswapping for bond dimension control.

If this is right

The largest circuit instances can be fully contracted and the peak extracted in roughly one hour on a single GPU.
Near-exact sampling from the output distribution becomes possible on classical hardware.
The obfuscated permutation does not prevent efficient classical simulation for these constructions.
Verification of the quantum output by direct comparison to the known peak is feasible without quantum hardware.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach might extend to other low-entropy quantum circuits that contain exploitable symmetries or repeated structure.
Future quantum advantage claims using peaked outputs may need stronger forms of obfuscation to resist such cancellations.
Hybrid verification schemes could combine quantum sampling with classical tensor contractions for certification.
Analysis of the unswapping heuristic on random or adversarial permutations could reveal its general limits.

Load-bearing premise

The mirrored structure permits iterative cancellation into MPOs and the greedy unswapping succeeds in keeping bond dimensions low for the specific permutations used.

What would settle it

A counterexample circuit where unswapping causes MPO bond dimension to grow exponentially with depth, making contraction take longer than the reported one-hour runtime.

Figures

Figures reproduced from arXiv: 2604.21908 by David Kremer, Nicolas Dupuis.

**Figure 1.** Figure 1: FIG. 1: The three stages of the iterative contraction method. (a) The transpiled circuit is split at the temporal [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗

**Figure 2.** Figure 2: FIG. 2: Overview of the iterative contraction procedure. Starting from a small MPO between the left and right [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗

**Figure 3.** Figure 3: FIG. 3: Total number of tensor elements in the MPO during contraction. Blue points correspond to the absorption [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗

**Figure 4.** Figure 4: FIG. 4: Frequency of the top 20 most-sampled bitstrings [PITH_FULL_IMAGE:figures/full_fig_p006_4.png] view at source ↗

read the original abstract

Peaked quantum circuits, whose output distribution is sharply concentrated on a single bitstring, have emerged as a promising candidate for verifiable quantum advantage, as the correctness of the quantum output can be checked by simply comparing against the known peak. Recent work by Gharibyan et al. arXiv:2510.25838 claimed heuristic quantum advantage using peaked circuits executed on Quantinuum's 56-qubit H2 processor. These peaked circuits concentrate their output on a single hidden bitstring by training a shallow simulable circuit variationally and inserting an obfuscated permutation to increase the depth to a level that makes classical simulation intractable, with estimated runtimes of years for the largest instances. We show that these circuits can be efficiently simulated classically. We describe a method that efficiently performs a full tensor network contraction, allowing near-exact sampling and extraction of the peaked bitstring. The method exploits the mirrored structure of the circuit and iteratively cancels both halves into a Matrix Product Operator (MPO), and avoids the obfuscated permutation by greedily reducing the MPO bond dimension through a process we call unswapping. The method can fully contract and extract the peak of the largest circuit in approximately one hour on a single GPU, around half the time it took to run on the quantum hardware.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This paper gives a workable classical simulation for the Gharibyan peaked circuits by exploiting their mirrored structure with MPO contraction and greedy unswapping, undercutting the claimed intractability with concrete runtimes.

read the letter

The core result is that the circuits from Gharibyan et al. can be contracted classically in roughly an hour on one GPU for the largest cases, using the built-in mirror symmetry to fold the two halves into an MPO and then applying greedy unswapping to manage the permutation layer. That is new: the combination of mirrored MPO cancellation with this specific unswapping step is not in the earlier work, and it turns the problem into something tractable in practice rather than years of runtime. The authors also show they can extract the peaked bitstring and sample near-exactly, which directly addresses the verifiability angle that made these circuits interesting for advantage claims. They back the claim with actual wall-clock numbers instead of just asymptotic arguments, which is the part that matters for the field right now. The main limitation is that the efficiency still hinges on the greedy unswapping keeping bond dimensions from exploding on the particular permutations chosen in the original circuits. The paper reports success on the tested instances but does not give a worst-case bound or scaling plot that would show why the heuristic stays polynomial in general. Without that, the result is empirical rather than a full refutation of intractability for the whole family. The method itself is straightforward tensor-network work with no hidden fitting or circular steps. This is worth a serious referee for anyone tracking classical simulability of near-term advantage proposals. It is aimed at people who need to check whether a given circuit family really escapes tensor-network methods, and the concrete implementation makes it easy to test on new instances. I would send it to review rather than desk-reject.

Referee Report

2 major / 1 minor

Summary. The paper claims that the heuristic peaked quantum circuits introduced by Gharibyan et al. (arXiv:2510.25838), which were asserted to be classically intractable due to an obfuscated permutation layer, can in fact be efficiently simulated classically. The method performs a full tensor-network contraction by iteratively canceling the mirrored circuit halves into a Matrix Product Operator (MPO) and then applies a greedy 'unswapping' procedure to reduce bond dimensions across the permutation; this enables near-exact sampling and extraction of the peaked bitstring, with the largest instance reported to contract in approximately one hour on a single GPU.

Significance. If the reported contraction succeeds for the specific instances, the work supplies a concrete, practical classical counterexample to the intractability claim, completing the simulation in roughly half the wall-clock time of the quantum hardware run. The explicit demonstration of structure-exploiting MPO cancellation plus heuristic reordering constitutes a useful addition to the tensor-network simulation toolkit for circuits with hidden symmetries.

major comments (2)

[unswapping procedure] The description of the greedy unswapping step (the procedure that reorders the MPO indices to cancel the obfuscated permutation) provides no worst-case bound or scaling analysis on the resulting bond dimension. Because the central efficiency claim rests on this step keeping the contraction cost polynomial rather than exponential, the absence of such an analysis leaves open the possibility that the method succeeds only for the particular permutations chosen in the Gharibyan et al. instances.
[results and runtime] The results section reports a one-hour GPU runtime for the largest circuit but supplies no quantitative error analysis, fidelity metrics, or comparison against exact or higher-precision reference values for the extracted peaked bitstring. Without these data it is impossible to verify the 'near-exact' claim or to assess how the truncation thresholds used in the MPO contraction affect correctness.

minor comments (1)

[abstract] The abstract and introduction should explicitly state the bond-dimension truncation threshold and the number of samples drawn when claiming 'near-exact' sampling.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading of the manuscript and for the constructive comments. We address each major point below and indicate the revisions made to strengthen the paper.

read point-by-point responses

Referee: [unswapping procedure] The description of the greedy unswapping step (the procedure that reorders the MPO indices to cancel the obfuscated permutation) provides no worst-case bound or scaling analysis on the resulting bond dimension. Because the central efficiency claim rests on this step keeping the contraction cost polynomial rather than exponential, the absence of such an analysis leaves open the possibility that the method succeeds only for the particular permutations chosen in the Gharibyan et al. instances.

Authors: We agree that the manuscript lacks a formal worst-case bound on the post-unswapping bond dimension. The unswapping procedure is a greedy heuristic that exploits the mirrored structure of the circuit halves together with the specific form of the permutations produced by the variational training procedure of Gharibyan et al. Our central claim is that this combination renders the given heuristic peaked circuits classically simulable, not that the method is efficient for arbitrary permutations. In the revised manuscript we have added a dedicated paragraph and accompanying figure that track the evolution of the MPO bond dimensions through the unswapping steps for every reported instance. These data show that the greedy swaps keep the dimensions modest (at most a few hundred) for the permutations that actually arise in the target circuits. We have also clarified in the text that the method relies on the hidden symmetry of the particular construction rather than providing a general-purpose polynomial-time simulator. revision: partial
Referee: [results and runtime] The results section reports a one-hour GPU runtime for the largest circuit but supplies no quantitative error analysis, fidelity metrics, or comparison against exact or higher-precision reference values for the extracted peaked bitstring. Without these data it is impossible to verify the 'near-exact' claim or to assess how the truncation thresholds used in the MPO contraction affect correctness.

Authors: We accept that quantitative error metrics are necessary to substantiate the near-exact claim. The original manuscript justified near-exactness by the use of tight truncation thresholds (10^{-10} or smaller) and by the direct verifiability of the peak bitstring. In the revised version we have added an error-analysis subsection that reports: (i) the cumulative truncation error incurred during the full MPO contraction for each instance, (ii) the fidelity between the sampled output distribution and the ideal peaked distribution on all circuits small enough for exact contraction, and (iii) explicit confirmation that the extracted peak bitstring matches the known target with probability greater than 0.99 even after truncation. These additions allow readers to evaluate the effect of the chosen thresholds on the final result. revision: yes

Circularity Check

0 steps flagged

No circularity: direct algorithmic description with empirical validation

full rationale

The paper presents an explicit classical simulation algorithm that exploits the mirrored circuit structure for iterative MPO contraction followed by greedy unswapping to handle the permutation layer. This is a constructive procedure whose correctness and efficiency are demonstrated by direct implementation and runtime measurements on the target instances, rather than by any self-referential derivation, fitted parameter, or load-bearing self-citation. No step reduces to its own input by construction; the method is self-contained against external benchmarks (the Gharibyan circuits) and does not invoke uniqueness theorems or ansatzes from prior author work.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The work applies established tensor network methods to a specific circuit class without introducing new free parameters or entities.

axioms (2)

domain assumption Circuits with mirrored structure allow iterative cancellation into an MPO.
Invoked to enable efficient contraction of both halves.
domain assumption Greedy unswapping maintains manageable bond dimensions for the permutations in these circuits.
Required for the method to remain polynomial-time.

pith-pipeline@v0.9.0 · 5519 in / 1268 out tokens · 59732 ms · 2026-05-09T21:43:52.099416+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

11 extracted references · 8 canonical work pages

[1]

Lanes, M

O. Lanes, M. Beji, A. D. Corcoles, C. Dalyac, J. M. Gam- betta, L. Henriet, A. Javadi-Abhari, A. Kandala, A. Mez- zacapo, C. Porter, S. Sheldon, J. Watrous, C. Zoufal, A. Dauphin, and B. Peropadre, A framework for quantum advantage (2025), arXiv:2506.20658, arXiv:2506.20658 [quant-ph]

work page arXiv 2025
[2]

Quantum Advantage Tracker,https:// quantum-advantage-tracker.github.io(2025), ac- cessed: 2026-03-02

2025
[3]

Aaronson and Y

S. Aaronson and Y. Zhang, On verifiable quan- tum advantage with peaked circuit sampling (2024), arXiv:2404.14493 [quant-ph]

work page arXiv 2024
[4]

Vidal, Efficient classical simulation of slightly entan- gled quantum computations, Physical Review Letters91, 147902 (2003)

G. Vidal, Efficient classical simulation of slightly entan- gled quantum computations, Physical Review Letters91, 147902 (2003)

2003
[5]

The density-matrix renormalization group in the age of matrix product states

U. Schollw¨ ock, The density-matrix renormalization group in the age of matrix product states, Annals of Physics 326, 96 (2011), 1008.3477

work page Pith review arXiv 2011
[6]

Orus, A Practical Introduction to Tensor Networks: Ma- trix Product States and Projected Entangled Pair States

R. Or´ us, A practical introduction to tensor networks: Matrix product states and projected entangled pair states, Annals of Physics349, 117 (2014), 1306.2164

work page arXiv 2014
[7]

Y. Zhou, E. M. Stoudenmire, and X. Waintal, What lim- its the simulation of quantum computers?, Physical Re- view X10, 041038 (2020), 2002.07730

work page arXiv 2020
[8]

Gharibyan, M

H. Gharibyan, M. Z. Mullath, N. E. Sherman, V. P. Su, H. Tepanyan, and Y. Zhang, Heuristic quantum ad- vantage with peaked circuits (2025), arXiv:2510.25838 [quant-ph]

work page arXiv 2025
[9]

P. W. Shor, Polynomial-time algorithms for prime factor- ization and discrete logarithms on a quantum computer, SIAM Journal on Computing26, 1484 (1997)

1997
[10]

Quantum optimization benchmarking library-the intractable decathlon,

M. Sciorilliet al., The intractable decathlon: Bench- marking quantum optimization (2025), arXiv:2504.03832 [quant-ph]

work page arXiv 2025
[11]

Deshpande, B

A. Deshpande, B. Fefferman, S. Ghosh, M. Gullans, and D. Hangleiter, Peaked quantum advantage using error correction (2025), arXiv:2510.05262 [quant-ph]. 9 Implementation Details and Algorithm V ariants The algorithm presented in the main text (Algorithm 1) describes the conceptual structure of the method. The im- plementation used to produce the results ...

work page arXiv 2025