pith. machine review for the scientific record.

arxiv: 2604.10607 · v1 · submitted 2026-04-12 · 🪐 quant-ph · cs.LG · hep-th


Adaptive H-EFT-VA: A Provably Safe Trajectory Through the Trainability-Expressibility Landscape of Variational Quantum Algorithms


Pith reviewed 2026-05-10 15:50 UTC · model grok-4.3

classification 🪐 quant-ph cs.LG hep-th

keywords variational quantum algorithms · barren plateaus · trainability-expressibility tradeoff · adaptive ansatz · hierarchical EFT · gradient variance · quantum optimization

The pith

Adaptive expansion doubles fidelity in VQA without losing trainability

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Adaptive H-EFT-VA to fix a key limitation of the earlier H-EFT-VA method. The static version confines the variational ansatz to a polynomial-sized subspace, leaving a gap for target states far from the initial reference state. The adaptive version grows the reachable space gradually under a controlled schedule. Theorem 1 shows that gradient variance stays Omega(1/poly(N)) as long as the expansion width satisfies sigma(t) <= 0.5/sqrt(LN). Benchmarks on problems up to 14 qubits confirm the approach doubles fidelity relative to the static baseline while keeping gradients trainable.

Core claim

Adaptive H-EFT-VA navigates the trainability-expressibility landscape of variational quantum algorithms by expanding the reachable Hilbert space along a trajectory controlled by sigma(t) <= 0.5/sqrt(LN). This bound, established in Theorem 1 and supported by the Safe Expansion Corollary and Monotone Growth Lemma, guarantees gradient variance remains Omega(1/poly(N)) with no discontinuous jumps, enabling higher expressibility without sacrificing trainability.
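The safety condition is easy to evaluate in practice. A minimal sketch, assuming only the paper's bound sigma(t) <= 0.5/sqrt(LN); the linear-ramp-with-clamp schedule shape is a hypothetical illustration suggested by the lambda sweep described in Figure 14, not necessarily the authors' exact protocol:

```python
import math

def sigma_crit(L: int, N: int) -> float:
    """Safety threshold from Theorem 1: sigma(t) must satisfy
    sigma(t) <= 0.5 / sqrt(L * N)."""
    return 0.5 / math.sqrt(L * N)

def clamped_schedule(t: int, sigma0: float, lam: float, L: int, N: int) -> float:
    """Hypothetical linear expansion ramp, clamped at the critical width."""
    return min(sigma0 + lam * t, sigma_crit(L, N))

# For N = L = 8 the clamp sits at 0.5 / sqrt(64) = 0.0625, matching the
# sigma_crit(8, 8) annotation quoted from Figure 14.
assert abs(sigma_crit(8, 8) - 0.0625) < 1e-12
```

Any lambda > 0 eventually drives such a ramp into the clamp, which is consistent with the reported insensitivity of final energy to lambda.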

What carries the argument

The time-dependent expansion schedule sigma(t) bounded by 0.5/sqrt(LN), which governs gradual closure of the reference-state gap created by the hierarchical EFT UV-cutoff ansatz.

Load-bearing premise

Gradual expansion can close the reference-state gap without introducing new trainability issues or violating the polynomial subspace restriction on actual quantum hardware.

What would settle it

A concrete simulation or hardware run in which gradient variance decays faster than any inverse polynomial in N (i.e., falls below the Omega(1/poly(N)) guarantee) even though sigma(t) never exceeds 0.5/sqrt(LN) would disprove Theorem 1.
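Such a disconfirming run could be flagged automatically. A minimal sketch, where a concrete variance floor c/N^k stands in for the unspecified constants hidden in Omega(1/poly(N)); c and k are hypothetical calibration values, not from the paper:

```python
import math

def violates_theorem1(grad_vars, sigmas, L, N, c=1e-3, k=2):
    """Flag a trajectory whose gradient variance undercuts a calibrated
    polynomial floor c / N**k even though sigma(t) never exceeded the
    Theorem 1 bound 0.5 / sqrt(L * N)."""
    bound = 0.5 / math.sqrt(L * N)
    if any(s > bound + 1e-12 for s in sigmas):
        return False  # schedule left the safe region: Theorem 1 makes no claim
    floor = c / N ** k
    return any(v < floor for v in grad_vars)

# A compliant trajectory (variance well above the floor) is not a violation.
assert not violates_theorem1([0.52, 0.55, 0.60], [0.01, 0.02, 0.03], L=8, N=8)
```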

Figures

Figures reproduced from arXiv: 2604.10607 by Eyad I. B. Hamid.

Figure 1
Figure 1. Directly validates Theorem 1 and Corollary 1. In Phase I (left panel, "AT1: GV Scaling Phase I (Static Init)", gradient variance ∥∇C∥² vs. number of qubits N for L = 2, 4, 6, 8, 10, 12, 14), ⟨∥∇C∥²⟩ ≥ 5.2×10⁻¹ at all tested… view at source ↗
Figure 2
Figure 2. (image only; no extracted caption) view at source ↗
Figure 3
Figure 3. Provides a real-time confirmation of Corollary 1. Phase I (t ≈ 0–100) exhibits the characteristic damped oscillatory convergence of the Adam optimizer navigating a non-flat loss landscape: the gradient norm decays from ∥∇C∥ ≈ 8 at t = 0 through a series of successively smaller oscillations, reaching a local minimum near 10⁻² at t ≈ 95. This oscillatory pattern, in contrast to… view at source ↗
Figure 6
Figure 6. (image only; no extracted caption) view at source ↗
Figure 7
Figure 7. (image only; no extracted caption) view at source ↗
Figure 8
Figure 8. Provides the most direct validation of Lemma 1. At t = 0, the EFT initialization at σ₀ = 0.1/(8 × 8) = 1.56 × 10⁻³ produces a highly localized state: 163 out of 2⁸ = 256 basis states have amplitudes above 10⁻⁶, corresponding to a Hamming-weight ceiling of roughly w_max(0) ≈ 5–6 (from the bound… view at source ↗
Figure 11
Figure 11. Assesses A-H-EFT's hardware relevance under depolarizing noise. The results show a clear hierarchy of noise tolerance. At p = 10⁻⁴ (green), the convergence trajectory is indistinguishable from the noiseless case: the EFT-localized Phase I produces short gate sequences with minimal accumulated error, and Phase II perturbations are noise-resilient because they are drawn from a sub-critical distribution th… view at source ↗
Figure 12
Figure 12. Confirms that A-H-EFT's gradient estimates… (caption truncated; overlapping caption from FIG. 11: noise robustness of adaptive training (N = 8, L = 8), energy vs. step under depolarizing noise p ∈ {0, 10⁻⁴, 10⁻³, 10⁻²}; A-H-EFT converges to ⟨H⟩ ≈ −10.0 for p ≤ 10⁻³ (< 1% degradation) and ≈ −7.6 for p = 10⁻²…) view at source ↗
Figure 13
Figure 13. Sweeps δ_switch from 10⁻⁴ to 10⁻². For δ_switch ≤ 5 × 10⁻³, final energy is flat at ⟨H⟩ = −10.097 ± 0.005, variation within one standard error, indistinguishable from noise. The modest degradation… (overlapping panel "AT15: Statistical Significance": mean final energy ⟨H⟩ vs. number of qubits N for A-H-EFT, static H-EFT-VA, and HEA, with p-values for A-H-EFT vs. Static and A-H-EFT vs. HEA) view at source ↗
Figure 14
Figure 14. Sweeps λ from 0.005 to 0.1, an order of magnitude below to 5× above the default of 0.02. Final energy is −10.097 ± 0.006, constant to within noise, across all values. The mechanistic explanation is transparent: the safety clamp at σ_crit(8, 8) = 0.0625 (annotated in the figure) means that any λ > 0 causes σ(t) to reach σ_crit within the 200-step budget, after which all trajectories are identical under the cla… view at source ↗
Figure 16
Figure 16. Confirms that Theorem 1 holds for a qualitatively different Hamiltonian family. The Phase I GV (left) follows clean power-law scaling but at lower absolute values (10⁻³–10⁻⁵) than TFIM, reflecting the larger operator norm ∥H_XXZ∥_op = O(3N) versus ∥H_TFIM∥_op = O(2N). The Phase II GV (right) remains ≥ 10⁻¹ across all depths and system sizes, confirming that σ_crit is Hamiltonian-independent (as required by… view at source ↗
Figure 17
Figure 17. Contains the paper's most dramatic finding. Static H-EFT-VA converges to positive energies for every tested (N, L) combination: ⟨H_XXZ⟩ ≈ +4, +8, +12 at N = 4, 8, 12, above zero, with the wrong sign, indicating the optimizer has settled in an energy maximum rather than a minimum. A-H-EFT achieves ⟨H_XXZ⟩ ≈ −5, −11, −17: a qualitative regime shift from positive to deeply negative, correctly identified ground… view at source ↗
read the original abstract

H-EFT-VA established a physics-informed solution to the Barren Plateau (BP) problem via a hierarchical EFT UV-cutoff, guaranteeing gradient variance in Omega(1/poly(N)). However, localization restricts the ansatz to a polynomial subspace, creating a reference-state gap for states distant from |0>^N. We introduce Adaptive H-EFT-VA (A-H-EFT) to navigate the trainability-expressibility tradeoff by expanding the reachable Hilbert space along a safe trajectory. Gradient variance is maintained in Omega(1/poly(N)) if sigma(t) <= 0.5/sqrt(LN) (Theorem 1). A Safe Expansion Corollary and Monotone Growth Lemma confirm expansion without discontinuous jumps. Benchmarking across 16 experiments (up to N=14) shows A-H-EFT achieves fidelity F=0.54, doubling static H-EFT-VA (F=0.27) and outperforming HEA (F~0.01), with gradient variance >= 0.5 throughout. For Heisenberg XXZ (Delta_ref=1), A-H-EFT identifies the negative ground state while static methods fail. Results are statistically significant (p < 10^-37). Robustness over three decades of hyperparameters enables deployment without search. This is the first rigorously bounded trajectory through the VQA landscape.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces Adaptive H-EFT-VA (A-H-EFT), an extension of the static H-EFT-VA ansatz that gradually increases the UV-cutoff parameter sigma(t) to expand the reachable subspace and close the reference-state gap. It claims to preserve gradient variance in Omega(1/poly(N)) via Theorem 1 (conditioned on sigma(t) <= 0.5/sqrt(LN)), supported by a Safe Expansion Corollary and Monotone Growth Lemma. It reports benchmark results on up to N=14 qubits across 16 experiments, including Heisenberg XXZ, showing doubled fidelity (F=0.54 vs 0.27) over static H-EFT-VA and outperforming HEA, with gradient variance >=0.5 and statistical significance p<10^-37.

Significance. If the adaptive trajectory rigorously preserves the polynomial gradient variance bound without introducing new trainability issues from the time-dependent ansatz, this would represent a meaningful advance in navigating the trainability-expressibility tradeoff for VQAs, offering a physics-informed, hyperparameter-robust alternative to heuristic ansatz design. The empirical doubling of fidelity on tasks where static methods fail (e.g., identifying the negative ground state of Heisenberg XXZ) and the reported robustness over three decades of hyperparameters are notable strengths; however, the central theoretical claim rests on the transfer of static bounds to the adaptive setting.

major comments (2)
  1. [Theorem 1, Safe Expansion Corollary] Theorem 1 and Safe Expansion Corollary: The Omega(1/poly(N)) lower bound on gradient variance is derived for the static H-EFT-VA with fixed sigma and localization scale. The adaptive schedule varies sigma(t) over time to expand the subspace, but the corollary and Monotone Growth Lemma only assert continuity and absence of discontinuous jumps; they do not explicitly re-derive the variance expression under a time-dependent ansatz. This leaves open whether additional terms arise from changes in effective circuit depth L or the UV-cutoff structure, potentially violating the stated polynomial bound (as the skeptic concern highlights).
  2. [Benchmarking results] Benchmarking section (Heisenberg XXZ and 16 experiments): While the reported fidelities and variance values (>=0.5) are promising and statistically significant, the experiments must confirm that the effective L and localization assumptions remain consistent with the static derivation throughout the trajectory; otherwise the empirical success does not independently validate the theoretical guarantee for the adaptive case.
minor comments (1)
  1. [Abstract] The abstract claims this is 'the first rigorously bounded trajectory'; this should be qualified to specify the scope (e.g., within the H-EFT-VA family) to avoid overstatement.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and for recognizing the potential of Adaptive H-EFT-VA in addressing the trainability-expressibility tradeoff. We address each major comment below with clarifications and proposed revisions.

read point-by-point responses
  1. Referee: Theorem 1 and Safe Expansion Corollary: The Omega(1/poly(N)) lower bound on gradient variance is derived for the static H-EFT-VA with fixed sigma and localization scale. The adaptive schedule varies sigma(t) over time to expand the subspace, but the corollary and Monotone Growth Lemma only assert continuity and absence of discontinuous jumps; they do not explicitly re-derive the variance expression under a time-dependent ansatz. This leaves open whether additional terms arise from changes in effective circuit depth L or the UV-cutoff structure, potentially violating the stated polynomial bound (as the skeptic concern highlights).

    Authors: We appreciate this observation. Theorem 1 is formulated for an ansatz with fixed sigma, and the adaptive procedure holds sigma(t) constant during each optimization interval before incrementing it. The Monotone Growth Lemma guarantees that increments preserve the condition sigma(t) <= 0.5/sqrt(LN) without introducing discontinuities in the ansatz structure. To eliminate any ambiguity regarding time dependence, we will revise the manuscript to include an explicit paragraph invoking Theorem 1 at each fixed-t slice and confirming that no additional variance terms arise from the controlled, monotonic change in sigma(t), as the circuit depth L and UV-cutoff structure remain unchanged within each step. revision: yes

  2. Referee: Benchmarking section (Heisenberg XXZ and 16 experiments): While the reported fidelities and variance values (>=0.5) are promising and statistically significant, the experiments must confirm that the effective L and localization assumptions remain consistent with the static derivation throughout the trajectory; otherwise the empirical success does not independently validate the theoretical guarantee for the adaptive case.

    Authors: We agree that direct verification of the assumptions strengthens the link between theory and numerics. In the revised manuscript we will add a dedicated paragraph (and, if space permits, a supplementary figure) that explicitly tracks the fixed value of L and the time-dependent sigma(t) for all 16 experiments, confirming that sigma(t) satisfies the bound at every step and that the localization assumptions of the static derivation are preserved. This will demonstrate consistency between the adaptive trajectory and the conditions of Theorem 1. revision: yes
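Both rebuttal points can be made concrete in a few lines. The sketch below generates a piecewise-constant schedule (interval length and increment size are illustrative; only the hold-then-increment protocol and the clamp at 0.5/sqrt(LN) are taken from the responses) and then verifies, step by step, that the Theorem 1 condition held:

```python
import math

def piecewise_sigma_schedule(steps, interval, delta, sigma0, L, N):
    """Hold sigma constant over each optimization interval, then increment it
    by delta, never exceeding the Theorem 1 bound (Monotone Growth Lemma)."""
    bound = 0.5 / math.sqrt(L * N)
    return [min(sigma0 + delta * (t // interval), bound) for t in range(steps)]

def bound_held(sched, L, N):
    """Per-step check that sigma(t) <= 0.5 / sqrt(L * N) throughout a run."""
    bound = 0.5 / math.sqrt(L * N)
    return all(s <= bound for s in sched)

sched = piecewise_sigma_schedule(steps=200, interval=25, delta=0.01,
                                 sigma0=0.0016, L=8, N=8)
assert bound_held(sched, L=8, N=8)
# Monotone, with no jump larger than one increment.
assert all(0 <= b - a <= 0.01 + 1e-12 for a, b in zip(sched, sched[1:]))
```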

Circularity Check

0 steps flagged

No significant circularity; theorems and lemmas provide independent extension of prior static bounds.

full rationale

The derivation chain rests on explicitly stated Theorem 1 (variance bound conditioned on sigma(t) <= 0.5/sqrt(LN)), Safe Expansion Corollary, and Monotone Growth Lemma, which are presented as new results for the adaptive trajectory rather than tautological redefinitions of inputs. The static H-EFT-VA guarantee is cited as foundation but the adaptive version adds time-dependent controls and continuity assertions without reducing the polynomial lower bound to a fitted parameter or self-citation that itself assumes the target result. Benchmarking (fidelity doubling, gradient variance >=0.5, statistical significance) supplies an external empirical check on Heisenberg XXZ and other tasks, separate from the theoretical conditions. No load-bearing step collapses by construction to renaming or ansatz smuggling.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The work inherits the hierarchical EFT UV-cutoff and gradient variance guarantee from prior H-EFT-VA and adds new corollaries for adaptation. No new particles or forces are introduced.

free parameters (1)
  • sigma(t)
    Expansion rate parameter whose upper bound is stated as 0.5/sqrt(LN) to preserve gradient variance.
axioms (1)
  • domain assumption Hierarchical EFT UV-cutoff guarantees gradient variance in Omega(1/poly(N))
    Inherited from the base H-EFT-VA framework referenced in the abstract.

pith-pipeline@v0.9.0 · 5549 in / 1324 out tokens · 52361 ms · 2026-05-10T15:50:37.251013+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

20 extracted references · 2 canonical work pages · 2 internal anchors

  1. [1]

establishing 2× fidelity improvement over static H-EFT-VA, qualitative resolution of the reference-state gap on Heisenberg (Δ_ref = 1), and p < 10⁻³⁷ statistical significance (50 seeds, Welch's t-test). II. THEORETICAL FRAMEWORK. A. Assumptions and Setup. We work under the following assumptions, stated explicitly to enable rigorous proof. Assumption 1 (Circu…

  2. [2]

    productive expansion

for σ = σ_crit, rapidly approaching zero for σ ≫ σ_crit. Step 2: Global BP from 2-design. If U(θ) is an ϵ-approximate unitary 2-design, then by Theorem 1 of Ref. [2] (see also Proposition 2 of Ref. [10]): Var[∂_θj C] ≤ B²/2^(N−1) + 4B²ϵ ≤ B²·2^(−(N−1)) (14) for ϵ ≤ 2^(−N) (satisfied for N sufficiently large and σ > σ_crit). This completes Part (b). □ Remark 1 (Empirical calibrat…

  3. [3]

HEA stagnates at ⟨H⟩ ≈ −1 (N = 4) to ≈ 0 (N = 12), unable to make meaningful progress beyond its barren-plateau initialization

(−14 vs. −12), from an identical circuit architecture with zero additional gates. HEA stagnates at ⟨H⟩ ≈ −1 (N = 4) to ≈ 0 (N = 12), unable to make meaningful progress beyond its barren-plateau initialization. The advantage appears within the first 25 steps across all panels, establishing that Phase II expansion provides fast-acting…

  4. [4]

    trainability–expressibility tradeoff window,

are essentially product states, providing zero expressibility in the Haar sense and zero access to entangled ground states. HEA (orange) descends from 0.10 at L = 2 to 0.045 at L = 10, approaching the Haar limit (red dotted, 0.032): maximum expressibility, but at the cost of untrainable gradient landscapes. A-H-EFT Phase II (blue) stabilizes at purity ≈ 0.8…

  5. [5]

By the standard sub-Gaussian tail bound, Pr[|θ_k| > t] ≤ 2e^(−t²/(2σ²))

Sub-Gaussian Concentration. Under Assumption 3, θ_k ∼ N(0, σ²) is sub-Gaussian with parameter σ². By the standard sub-Gaussian tail bound, Pr[|θ_k| > t] ≤ 2e^(−t²/(2σ²)). Setting t = 3σ gives Pr[|θ_k| > 3σ] ≤ 2e^(−9/2) ≈ 0.022. By a union bound over M_tot ≤ 2LN parameters: Pr[max_k |θ_k| > 3σ] ≤ 4LN · e^(−9/2). (A1) For (N, L) = (14, 14), this is 4 × 196 × e^(−9/2) ≈ 4.8, which is not small; i…

  6. [6]

The gradient variance lower bound (Eq. (13)) is derived as follows

Variance Lower Bound: Derivation of κ_lb. The gradient variance lower bound (Eq. (13)) is derived as follows. By the parameter-shift rule and Jensen's inequality: Var[∂_θj C] = (1/4) Var[C(θ + (π/2)e_j) − C(θ − (π/2)e_j)]. (A2) The expectation value C(θ) ranges over [−B, B] on the d_eff-dimensional subspace. The variance of the difference of two bounded random variable…

  7. [7]

    Variational quantum algorithms,

M. Cerezo et al., "Variational quantum algorithms," Nat. Rev. Phys. 3, 625 (2021)

  8. [8]

Barren plateaus in quantum neural network training landscapes,

J. R. McClean, S. Boixo, V. N. Smelyanskiy, R. Babbush, and H. Neven, "Barren plateaus in quantum neural network training landscapes," Nat. Commun. 9, 4812 (2018)

  9. [9]

    Theory of overparametrization in quantum neural networks,

M. Larocca, N. Ju, D. García-Martín, P. J. Coles, and M. Cerezo, "Theory of overparametrization in quantum neural networks," Nat. Comput. Sci. 1, 1 (2022)

  10. [10]

    Noise-induced barren plateaus in variational quantum algorithms,

S. Wang, E. Fontana, M. Cerezo, K. Sharma, A. Sone, L. Cincio, and P. J. Coles, "Noise-induced barren plateaus in variational quantum algorithms," Nat. Commun. 12, 6961 (2021)

  11. [11]

    Layerwise learning for quantum neural networks,

A. Skolik, J. R. McClean, M. Mohseni, P. van der Smagt, and M. Leib, "Layerwise learning for quantum neural networks," Quantum Sci. Technol. 6, 025002 (2021)

  12. [12]

    An initialization strategy for addressing barren plateaus in parametrized quantum circuits,

E. Grant, L. Wossnig, M. Ostaszewski, and M. Benedetti, "An initialization strategy for addressing barren plateaus in parametrized quantum circuits," Quantum 3, 214 (2019)

  13. [13]

From the quantum approximate optimization algorithm to a quantum alternating operator ansatz,

S. Hadfield, Z. Wang, B. O'Gorman, E. G. Rieffel, D. Venturelli, and R. Biswas, "From the quantum approximate optimization algorithm to a quantum alternating operator ansatz," Algorithms 12, 34 (2019)

  14. [14]

    An adaptive variational algorithm for exact molecular simulations on a quantum computer,

H. R. Grimsley, S. E. Economou, E. Barnes, and N. J. Mayhall, "An adaptive variational algorithm for exact molecular simulations on a quantum computer," Nat. Commun. 10, 3007 (2019)

  15. [15]

    H-EFT-VA: An Effective-Field-Theory Variational Ansatz with Provable Barren Plateau Avoidance

E. I. B. Hamid, "H-EFT-VA: An Effective-Field-Theory Variational Ansatz with Provable Barren Plateau Avoidance," arXiv:2601.10479 [quant-ph] (2026), under review

  16. [16]

    Cost function dependent barren plateaus in shallow parametrized quantum circuits,

M. Cerezo, A. Sone, T. Volkoff, L. Cincio, and P. J. Coles, "Cost function dependent barren plateaus in shallow parametrized quantum circuits," Nat. Commun. 12, 1791 (2021)

  17. [17]

Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets,

A. Kandala et al., "Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets," Nature 549, 242 (2017)

  18. [18]

    PennyLane: Automatic differentiation of hybrid quantum-classical computations

V. Bergholm et al., "PennyLane: Automatic differentiation of hybrid quantum-classical computations," arXiv:1811.04968 [quant-ph] (2018)

  19. [19]

Connecting ansatz expressibility to gradient magnitudes and barren plateaus,

Z. Holmes, K. Sharma, M. Cerezo, and P. J. Coles, "Connecting ansatz expressibility to gradient magnitudes and barren plateaus," PRX Quantum 3, 010313 (2022)

  20. [20]

The renormalization group and the ϵ expansion,

K. G. Wilson and J. Kogut, "The renormalization group and the ϵ expansion," Phys. Rep. 12, 75 (1974)