Recognition: 2 theorem links · Lean Theorem
Breaking QAOA's Fixed Target Hamiltonian Barrier: A Fully Connected Quantum Boltzmann Machine via Bilevel Optimization
Pith reviewed 2026-05-11 01:47 UTC · model grok-4.3
The pith
Bilevel optimization on QAOA circuits creates a fully connected quantum Boltzmann machine that reaches target states with one layer and resists noise.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By extending the QAOA circuit into a bilevel optimization architecture, the inner loop simulates positive-phase energy minimization through the conventional QAOA process, while the outer loop optimizes the structural parameters of the target Hamiltonian to simulate negative-phase contrastive divergence learning. The resulting fully connected QBM exhibits superior performance at p=1: a 0.9559 average target-state probability in the noiseless case, retained probabilities of 0.6047 and 0.3859 under typical and doubled noise, and consistent generation of the target qubit-grid image with only ten shots per block regardless of noise.
What carries the argument
Bilevel optimization architecture on the QAOA circuit, where the inner loop performs positive-phase energy minimization and the outer loop tunes target-Hamiltonian structural parameters to enable negative-phase learning.
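The nesting described above can be sketched in code. The following is an illustrative reconstruction, not the authors' implementation: `inner_energy` is a smooth classical stand-in for the p=1 QAOA expectation value, and the CD-style outer update direction is an assumption about how the structural parameters might move.

```python
import numpy as np

rng = np.random.default_rng(0)

def inner_energy(angles, theta):
    # Stand-in for the QAOA expectation <H(theta)> at circuit angles.
    # A real implementation would run the p=1 QAOA circuit; this smooth
    # toy function only mimics the optimization structure.
    return np.sum(theta * np.cos(angles)) + 0.1 * np.sum(angles**2)

def inner_loop(theta, steps=50, lr=0.1):
    # Positive phase: minimize the energy over circuit angles for fixed theta.
    angles = np.zeros_like(theta)
    for _ in range(steps):
        # Analytic gradient of the toy inner_energy above.
        grad = -theta * np.sin(angles) + 0.2 * angles
        angles -= lr * grad
    return angles

def outer_loop(theta, data_stats, steps=30, lr=0.05):
    # Negative phase (as the paper describes it): adjust the structural
    # parameters of the target Hamiltonian using the inner-loop solution.
    for _ in range(steps):
        angles = inner_loop(theta)
        model_stats = np.cos(angles)              # stand-in for model expectations
        theta += lr * (data_stats - model_stats)  # CD-style update direction
    return theta

theta = rng.normal(size=4)
theta = outer_loop(theta, data_stats=np.full(4, 0.5))
```

The point of the sketch is the nesting: every outer step re-solves the inner variational problem before touching the Hamiltonian parameters.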
If this is right
- Only a single QAOA layer suffices for high-fidelity target-state preparation, lowering required circuit depth.
- The target state retains the highest measurement probability by a large margin even when noise intensity doubles.
- Block-by-block training with ten shots per block produces the desired qubit-grid image under both typical and elevated noise.
- The architecture removes the fixed-target-Hamiltonian restriction of standard QAOA while preserving variational training.
Where Pith is reading between the lines
- The separation of positive and negative phases may let similar bilevel structures improve other variational quantum algorithms that currently suffer from limited expressivity.
- If the outer-loop optimization scales classically, larger fully connected QBMs could be trained without deepening the quantum circuit.
- The demonstrated image-generation stability under low-shot noisy conditions suggests the method could support early hybrid quantum-classical generative tasks on near-term devices.
- The noise-robust ranking of the target state implies that post-selection or simple thresholding on measurement outcomes could further improve output quality without extra hardware.
Load-bearing premise
That the outer-loop adjustment of the target Hamiltonian's structural parameters successfully replicates negative-phase contrastive divergence without hidden restrictions imposed by the QAOA ansatz mapping.
What would settle it
Executing the trained model on hardware at doubled noise intensity and finding that any non-target state overtakes the target in measurement probability, or that the target probability falls below 0.3 while losing its clear lead, would disprove the robustness claim.
Figures
original abstract
To overcome the limitations of classical partially connected Boltzmann machines and mainstream quantum Boltzmann machines (QBMs), this work extends the conventional circuit of the quantum approximate optimization algorithm (QAOA) to a bilevel optimization architecture and proposes a fully connected QBM. The inner-loop training simulates positive phase energy minimization based on the computational process of the conventional QAOA circuit, whereas the outer-loop training simulates negative phase contrastive divergence learning by optimizing the structural parameters of the target Hamiltonian. It is found that, first, the model exhibits superior performance using only a single layer (p=1) in the QAOA circuit, with an average probability of 0.9559 in measuring the target quantum state under noiseless conditions. Second, the model exhibits notable noise robustness. Under the typical noise level of current mainstream commercial quantum computing devices, the average probability of measuring the target quantum state reaches 0.6047; when the noise rises to a more stringent level with doubled intensity, this probability remains at 0.3859. In both scenarios, the target quantum state maintains the highest measurement probability among all detected states, with a value several times higher than that of the second-ranked state. This indicates that the model retains strong robustness even when noise meets or exceeds the upper limit of current mainstream commercial quantum computing devices. Third, under a block-by-block learning strategy with p=1 and only 10 measurement shots, the model consistently generates the target "qubit" grid image regardless of noise interference, demonstrating strong robustness in image generation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes extending the QAOA circuit into a bilevel optimization architecture to train a fully connected quantum Boltzmann machine (QBM). The inner loop performs positive-phase energy minimization using standard QAOA, while the outer loop optimizes the structural parameters of the target Hamiltonian to simulate negative-phase contrastive divergence. The authors claim that with only p=1 QAOA layers the model achieves average target-state measurement probabilities of 0.9559 (noiseless), 0.6047 (typical noise), and 0.3859 (doubled noise intensity), maintains the target state as the highest-probability outcome, and generates target qubit-grid images robustly under noise using a block-by-block strategy with only 10 shots.
Significance. If the bilevel construction can be shown to implement proper contrastive-divergence updates for a fully connected QBM (rather than reducing to direct state preparation), the result would be significant for quantum generative modeling: it would remove the fixed-target-Hamiltonian restriction of conventional QAOA and enable scalable training of dense models on NISQ hardware. The reported single-layer performance and noise tolerance would constitute a practical advance if accompanied by rigorous baselines and sampling analysis.
major comments (3)
- [Bilevel optimization architecture] Bilevel optimization description: the claim that outer-loop tuning of target-Hamiltonian structural parameters implements negative-phase contrastive divergence is not supported by explicit update equations or pseudocode showing how model expectations (obtained from QAOA samples) are subtracted from data expectations. Without these, the procedure risks reducing to direct parameter fitting of the target state rather than enforcing partition-function normalization or hidden-unit marginals required for a true QBM.
- [QAOA circuit and sampling] Inner-loop QAOA usage: QAOA returns a variational approximation to the ground state of the problem Hamiltonian, not thermal samples from the Boltzmann distribution. The manuscript must clarify how this low-energy eigenstate is converted into the negative-phase expectations for a fully connected model; the current description leaves open whether the learned couplings simply drive the circuit toward the target state without correct thermal averaging.
- [Results and performance evaluation] Results section: the numerical performance claims (probabilities 0.9559/0.6047/0.3859 and image-generation success) are presented without specifying qubit count, exact Hamiltonian form, training dataset size, number of independent runs, or comparisons against standard QBM training or other QAOA variants. This prevents assessment of whether the results demonstrate generative modeling or optimized state preparation.
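For contrast, the explicit update the first major comment asks for has a standard classical form (Hinton's contrastive divergence): model expectations, estimated by Gibbs sampling, are subtracted from data expectations. A minimal sketch for a fully connected Boltzmann machine over binary visible units follows; the function names and the CD-k negative phase are illustrative, not the paper's procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_step(s, W, b):
    # One sweep of Gibbs sampling over a fully connected {0,1} Boltzmann
    # machine with E(s) = -b.s - s.W.s/2 (W symmetric, zero diagonal).
    for i in range(len(s)):
        p = sigmoid(b[i] + W[i] @ s)
        s[i] = float(rng.random() < p)
    return s

def cd_k_update(data, W, b, k=1, lr=0.05):
    # CD-k: positive phase from the data batch; negative phase from k
    # Gibbs sweeps started at the data. The update is the subtraction
    # of model expectations from data expectations.
    pos_W = data.T @ data / len(data)
    pos_b = data.mean(axis=0)
    samples = data.copy()
    for _ in range(k):
        samples = np.array([gibbs_step(s.copy(), W, b) for s in samples])
    neg_W = samples.T @ samples / len(samples)
    neg_b = samples.mean(axis=0)
    W += lr * (pos_W - neg_W)
    np.fill_diagonal(W, 0.0)
    b += lr * (pos_b - neg_b)
    return W, b

n = 4
data = rng.integers(0, 2, size=(16, n)).astype(float)
W = np.zeros((n, n)); b = np.zeros(n)
for _ in range(20):
    W, b = cd_k_update(data, W, b)
```

A QBM training step must produce something equivalent to `neg_W`/`neg_b` from the quantum model; the objection is that the manuscript never shows this subtraction.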
minor comments (2)
- [Abstract and methods] The term 'block-by-block learning strategy' is used in the abstract and results without a prior definition or reference to its implementation details; this should be introduced explicitly in the methods with pseudocode or a diagram.
- [Notation and preliminaries] Notation for the target Hamiltonian structural parameters and the bilevel objective function should be introduced with explicit equations early in the paper to improve readability.
Simulated Author's Rebuttal
We thank the referee for their detailed and constructive review. We address each major comment below, indicating revisions that will strengthen the manuscript's clarity and rigor without altering its core claims.
point-by-point responses
Referee: [Bilevel optimization architecture] Bilevel optimization description: the claim that outer-loop tuning of target-Hamiltonian structural parameters implements negative-phase contrastive divergence is not supported by explicit update equations or pseudocode showing how model expectations (obtained from QAOA samples) are subtracted from data expectations. Without these, the procedure risks reducing to direct parameter fitting of the target state rather than enforcing partition-function normalization or hidden-unit marginals required for a true QBM.
Authors: We agree that explicit mathematical details are needed to distinguish the bilevel procedure from direct state preparation. The revised manuscript will include the full bilevel update equations, showing the inner-loop QAOA minimization of the positive-phase energy and the outer-loop gradient updates on the target Hamiltonian parameters that approximate the negative-phase term via contrastive divergence. Pseudocode will be added to Algorithm 1 to illustrate the subtraction of model expectations (from QAOA samples) from data expectations, along with a brief derivation confirming the partition-function normalization is implicitly handled through the outer optimization. revision: yes
Referee: [QAOA circuit and sampling] Inner-loop QAOA usage: QAOA returns a variational approximation to the ground state of the problem Hamiltonian, not thermal samples from the Boltzmann distribution. The manuscript must clarify how this low-energy eigenstate is converted into the negative-phase expectations for a fully connected model; the current description leaves open whether the learned couplings simply drive the circuit toward the target state without correct thermal averaging.
Authors: The referee correctly identifies that QAOA provides a variational ground-state approximation rather than exact thermal sampling. In our architecture the inner loop uses this approximation to realize the positive-phase energy minimization for the fully connected QBM, while the outer loop adjusts Hamiltonian structural parameters to enforce the negative-phase contrast. The revised text will add a dedicated paragraph explaining this approximation, its relation to low-temperature limits of the Boltzmann distribution, and why the bilevel structure still yields generative behavior distinct from pure state preparation. We note that full thermal sampling on NISQ hardware remains challenging and our method offers a practical surrogate. revision: partial
Referee: [Results and performance evaluation] Results section: the numerical performance claims (probabilities 0.9559/0.6047/0.3859 and image-generation success) are presented without specifying qubit count, exact Hamiltonian form, training dataset size, number of independent runs, or comparisons against standard QBM training or other QAOA variants. This prevents assessment of whether the results demonstrate generative modeling or optimized state preparation.
Authors: We acknowledge the need for these experimental details to allow proper evaluation. The revised results section will explicitly state the qubit count used for the grid-image experiments, the precise form of the fully connected target Hamiltonian, the training dataset size, the number of independent runs with statistical error bars, and direct comparisons against classical QBM training (via contrastive divergence) and standard QAOA state-preparation baselines. These additions will clarify that the reported probabilities reflect generative performance under the bilevel scheme rather than mere state preparation. revision: yes
Circularity Check
Outer-loop optimization of target Hamiltonian parameters renames direct fitting as negative-phase CD simulation
specific steps
- fitted input called prediction
[Abstract]
"the inner-loop training simulates positive phase energy minimization based on the computational process of the conventional QAOA circuit, whereas the outer-loop training simulates negative phase contrastive divergence learning by optimizing the structural parameters of the target Hamiltonian. It is found that, first, the model exhibits superior performance using only a single layer (p=1) in the QAOA circuit, with an average probability of 0.9559 in measuring the target quantum state under noiseless conditions."
The outer loop directly varies the structural parameters of the target Hamiltonian (the model's defining object). Labeling this variation 'negative phase contrastive divergence learning' converts the act of fitting those parameters to produce high target-state probability into a claimed CD step. The 0.9559 probability is therefore the direct output of the fitting procedure rather than a prediction generated by a trained QBM whose negative-phase expectations were independently computed.
full rationale
The paper's bilevel architecture defines the QBM via the target Hamiltonian whose structural parameters are directly tuned in the outer loop. This tuning is presented as simulating negative-phase contrastive divergence, but the reported performance metrics (high target-state probabilities) follow immediately from the fitting process itself rather than from an independent generative training procedure that computes model expectations via sampling. The inner QAOA loop approximates energy minimization, but without explicit conversion to thermal Boltzmann samples or partition-function terms, the overall construction reduces the claimed QBM learning to parameter optimization for state preparation.
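The distinction drawn here, variational ground-state output versus genuine thermal averaging, can be made concrete by enumerating the Boltzmann distribution exactly for a toy system. This brute-force sketch (exponential in the number of units, illustration only) shows what the negative-phase expectations would have to average over:

```python
import itertools
import numpy as np

def boltzmann_distribution(b, W, beta=1.0):
    # Exact Gibbs distribution P(s) = exp(-beta*E(s)) / Z for a fully
    # connected model with E(s) = sum_i b_i s_i + sum_{i<j} W_ij s_i s_j,
    # s_i in {0,1}; the symmetric double sum with factor 1/2 is equivalent.
    states = np.array(list(itertools.product([0, 1], repeat=len(b))), dtype=float)
    energies = states @ b + 0.5 * np.einsum('si,ij,sj->s', states, W, states)
    weights = np.exp(-beta * energies)
    return states, weights / weights.sum()

n = 3
rng = np.random.default_rng(2)
W = rng.normal(size=(n, n)); W = (W + W.T) / 2; np.fill_diagonal(W, 0.0)
b = rng.normal(size=n)
states, probs = boltzmann_distribution(b, W)
# Negative-phase expectation <s_i s_j> under the thermal distribution:
model_corr = np.einsum('s,si,sj->ij', probs, states, states)
```

Any procedure claiming to implement the negative phase must reproduce averages like `model_corr` over the full distribution, not just properties of the single lowest-energy state.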
Axiom & Free-Parameter Ledger
free parameters (1)
- structural parameters of the target Hamiltonian
axioms (2)
- domain assumption: the inner-loop QAOA circuit simulates positive-phase energy minimization
- domain assumption: the outer-loop optimization of the Hamiltonian structure simulates negative-phase contrastive divergence
invented entities (1)
- bilevel optimization architecture for fully connected QBM (no independent evidence)
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
unclear: relation between the paper passage and the cited Recognition theorem.
"inner-loop training simulates positive phase energy minimization based on the computational process of the conventional QAOA circuit, whereas the outer-loop training simulates negative phase contrastive divergence learning by optimizing the structural parameters of the target Hamiltonian"
- IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · LogicNat recovery and J-cost orbit · unclear
unclear: relation between the paper passage and the cited Recognition theorem.
E(s;Θ) = Σᵢ bᵢ xᵢ + Σᵢⱼ wᵢⱼ xᵢ xⱼ ; P(s;Θ) = (1/Z) exp(−E(s;Θ))
What do these tags mean?
- matches: the paper's claim is directly supported by a theorem in the formal canon.
- supports: the theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: the paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: the paper appears to rely on the theorem as machinery.
- contradicts: the paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] D. H. Ackley, G. E. Hinton, and T. J. Sejnowski. A learning algorithm for Boltzmann machines. Cognitive Science, 9(1):147–169, 1985. doi:10.1207/s15516709cog0901_7
- [2] N. Le Roux and Y. Bengio. Representational power of restricted Boltzmann machines and deep belief networks. Neural Computation, 20(6):1631–1649, 2008. doi:10.1162/neco.2008.04-07-510
- [3]
- [4]
- [5] G. E. Hinton. Training products of experts by minimizing contrastive divergence. Neural Computation, 14(8):1771–1800, 2002. doi:10.1162/089976602760128018
- [6] G. E. Hinton, S. Osindero, and Y. W. Teh. A fast learning algorithm for deep belief nets. Neural Computation, 18(7):1527–1554, 2006. doi:10.1162/neco.2006.18.7.1527
- [7] R. Salakhutdinov and G. E. Hinton. Deep Boltzmann machines. In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), pages 448–455, 2009.
- [8] R. Salakhutdinov and G. E. Hinton. An efficient learning procedure for deep Boltzmann machines. Neural Computation, 24(8):1967–2006, 2012. doi:10.1162/NECO_a_00311
- [9] G. Montúfar et al. On the number of linear regions of deep neural networks. In Advances in Neural Information Processing Systems (NeurIPS), pages 2924–2932, 2014.
- [10] H. Deng et al. The interaction bottleneck of deep neural networks: discovery, proof, and modulation, 2025. URL https://arxiv.org/abs/2512.18607
- [11] G. E. Hinton. What kind of a graphical model is the brain? In International Joint Conference on Artificial Intelligence (IJCAI), pages 1765–1775, 2005.
- [12] D. Koller and N. Friedman. Probabilistic Graphical Models: Principles and Techniques. MIT Press, 2009. ISBN 9780262013192.
- [13] N. Wiebe et al. Quantum deep learning, 2014. URL https://arxiv.org/abs/1412.3489
- [14] N. Wiebe and L. Wossnig. Generative training of quantum Boltzmann machines with hidden units, 2019. URL https://arxiv.org/abs/1905.09902
- [15] S. Srivastava and V. Sundararaghavan. Generative and discriminative training of Boltzmann machine through quantum annealing. Scientific Reports, 13(1):12456, 2023. doi:10.1038/s41598-023-34652-4
- [16]
- [17] L. Coopmans and M. Benedetti. On the sample complexity of quantum Boltzmann machine learning. Communications Physics, 7:274, 2024. doi:10.1038/s42005-024-01763-x
- [18] D. Patel et al. Quantum Boltzmann machine learning of ground-state energies, 2024. URL https://arxiv.org/abs/2410.12935
- [19] M. Demidik et al. Expressive equivalence of classical and quantum restricted Boltzmann machines, 2025. URL https://arxiv.org/abs/2502.17562
- [20] T. Kimura, K. Kato, and M. Hayashi. Structured quantum learning via em algorithm for Boltzmann machines (the lowercase "em" follows the original title; the standard abbreviation is EM, expectation-maximization), 2025. URL https://arxiv.org/abs/2507.21569
- [21] E. Rule and E. Rrapaj. Exact block encoding of imaginary time evolution with universal quantum neural networks. Physical Review Research, 7(1):013306, 2025. doi:10.1103/PhysRevResearch.7.013306
- [22] E. Farhi et al. Quantum computation by adiabatic evolution, 2000. URL https://arxiv.org/abs/quant-ph/0001106
- [23]
- [24] URL https://arxiv.org/abs/1411.4028
- [25] K. Mitarai et al. Quantum circuit learning. Physical Review A, 98(3):032309, 2018. doi:10.1103/PhysRevA.98.032309