Low-variance estimators overcome the phase-gradient bottleneck in complex-valued neural quantum states

Baigeng Wang; Chenan Wei; Rui Wang; Yi-Ran Xue

arxiv: 2606.13912 · v2 · pith:RNN7JP6Vnew · submitted 2026-06-11 · cond-mat.dis-nn · cond-mat.str-el · cs.LG · physics.comp-ph · quant-ph

Yi-Ran Xue, Rui Wang, Baigeng Wang, Chenan Wei

Pith reviewed 2026-06-27 04:48 UTC · model grok-4.3

classification: cond-mat.dis-nn · cond-mat.str-el · cs.LG · physics.comp-ph · quant-ph

keywords: complex neural quantum states · phase gradient · variational Monte Carlo · low-variance estimators · quantum many-body systems · optimization bottleneck · amplitude-phase separation

The pith

Differentiating the local energy at fixed Monte Carlo samples yields an unbiased low-variance estimator of the phase force for complex neural quantum states.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that the main obstacle to optimizing complex-valued neural quantum states with nontrivial phase structure is high variance in the Monte Carlo estimator of the phase gradient rather than insufficient expressivity of the ansatz. For amplitude-phase separated representations, holding the Monte Carlo samples fixed while differentiating the local energy produces a distinct but unbiased estimator of the identical variational phase force. The construction extends to coupled two-head networks by preserving the amplitude contribution and applying the direct derivative only along the phase path, then combining both via an adaptive minimum-variance mixture. Across flux ladders, chiral chains, two-dimensional cylinders, fermion ladders, shared-weight controls, and a fractional quantum Hall benchmark, the resulting estimators lower phase-gradient variance, reduce seed failures, and convert multi-percent plateaus into sub-percent accuracy.

Core claim

For separated amplitude-phase states, differentiating the local energy at fixed samples gives a different unbiased estimator of the same variational Monte Carlo phase force, without changing the objective. The method extends to coupled two-head networks by keeping the amplitude-gradient contribution and applying the direct derivative only to the phase path, then interpolating between the two estimators with an adaptive minimum-variance mixture during training.

What carries the argument

Direct derivative of the local energy with respect to phase parameters at fixed Monte Carlo samples, serving as an alternative unbiased estimator of the variational phase force.
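
The relation between the two estimators can be sketched as follows. The notation is an assumption made here for illustration and is not taken from the paper: a separated state $\psi_\theta(s) = A(s)\,e^{i\varphi_\theta(s)}$ whose phase parameters $\theta$ do not enter the amplitude $A$, a Hermitian Hamiltonian $H$, the sampling distribution $p(s) \propto A(s)^2$, and the local energy $E_{\mathrm{loc}}(s) = \sum_{s'} H_{ss'}\,\psi_\theta(s')/\psi_\theta(s)$.

```latex
\begin{aligned}
E(\theta) &= \sum_s p(s)\,\operatorname{Re} E_{\mathrm{loc}}(s;\theta),
\qquad p(s) = \frac{A(s)^2}{\sum_{s'} A(s')^2}\quad\text{(independent of } \theta\text{)},\\[4pt]
\partial_\theta E &= \sum_s p(s)\,\operatorname{Re}\,\partial_\theta E_{\mathrm{loc}}(s;\theta)
\qquad\text{(direct form: differentiate at fixed samples)},\\[4pt]
\partial_\theta E &= 2\operatorname{Re}\sum_s p(s)\,O^{*}(s)\,\bigl(E_{\mathrm{loc}}(s;\theta)-E\bigr)
= 2\sum_s p(s)\,\operatorname{Im} E_{\mathrm{loc}}(s;\theta)\,\partial_\theta\varphi_\theta(s)
\qquad\text{(standard form, } O = \partial_\theta \ln\psi_\theta = i\,\partial_\theta\varphi_\theta\text{)}.
\end{aligned}
```

Under these assumptions the direct form follows because $p(s)$ carries no $\theta$ dependence, and the two sums agree because $H$ is Hermitian. The summands differ sample by sample, which is why the two estimators can have different variances while sharing one expectation.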

If this is right

The new estimator reduces variance of the phase gradient in variational Monte Carlo training of complex neural quantum states.
It suppresses optimization failures that depend on random seed in systems with gauge, chiral, or topological phase structure.
Training reaches sub-percent accuracy on benchmarks where the standard estimator plateaus at several percent error.
The adaptive mixture construction applies to both separated and shared-weight network architectures.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the fixed-sample estimator maintains its variance reduction at larger system sizes, it could allow reliable optimization of neural states for phases that were previously inaccessible due to gradient noise.
The mixture approach might be generalized to reduce variance in estimators for other variational parameters beyond phase.
Applying the estimator to models with explicit anyonic or non-Abelian statistics could test whether the variance benefit persists when phase structure is more intricate.

Load-bearing premise

Fixing the Monte Carlo samples while differentiating the local energy with respect to phase parameters keeps the estimator unbiased for the phase force even after the adaptive mixture is introduced and when amplitude and phase share network weights.

What would settle it

A direct numerical check showing that the expectation of the fixed-sample local-energy derivative deviates from the standard phase-force expectation in a coupled two-head network would disprove unbiasedness.
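
For the separated case, a check of this kind can be sketched on a toy problem small enough to enumerate exactly. Everything below is a hypothetical setup chosen for illustration, not the paper's code or benchmarks: a random Hermitian matrix stands in for the Hamiltonian, the amplitude `A` is fixed, and the phase is linear in its parameters, `phi = F @ theta`, so that its derivative is the feature matrix `F`. The function names are likewise invented here.

```python
import numpy as np


def energy(theta, H, A, F):
    """Exact energy <psi|H|psi>/<psi|psi> for psi = A * exp(i * F @ theta)."""
    psi = A * np.exp(1j * (F @ theta))
    return float(np.real(np.vdot(psi, H @ psi)) / np.real(np.vdot(psi, psi)))


def per_sample_phase_gradients(theta, H, A, F):
    """Return p(s) and the per-configuration standard and direct estimators."""
    psi = A * np.exp(1j * (F @ theta))
    # ratio[s, s'] = H[s, s'] * psi[s'] / psi[s]; its row sums are the local energies.
    ratio = H * (psi[None, :] / psi[:, None])
    e_loc = ratio.sum(axis=1)
    p = A**2 / np.sum(A**2)
    # Standard form: 2 * Im(E_loc) * d(phi)/d(theta).
    g_std = 2.0 * np.imag(e_loc)[:, None] * F
    # Direct form: Re of d(E_loc)/d(theta) with the configuration s held fixed.
    g_dir = np.real(1j * (ratio @ F - e_loc[:, None] * F))
    return p, g_std, g_dir


def finite_difference_gradient(theta, H, A, F, eps=1e-6):
    """Central finite differences of the exact energy, used as the reference."""
    grad = np.zeros_like(theta)
    for k in range(theta.size):
        step = np.zeros_like(theta)
        step[k] = eps
        grad[k] = (energy(theta + step, H, A, F)
                   - energy(theta - step, H, A, F)) / (2 * eps)
    return grad


def make_toy_problem(seed=0, dim=6, n_params=3):
    """Hypothetical toy instance: random Hermitian H, positive A, linear phase."""
    rng = np.random.default_rng(seed)
    M = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    H = (M + M.conj().T) / 2
    A = rng.uniform(0.5, 1.5, size=dim)
    F = rng.normal(size=(dim, n_params))
    theta = rng.normal(size=n_params)
    return H, A, F, theta


H, A, F, theta = make_toy_problem()
p, g_std, g_dir = per_sample_phase_gradients(theta, H, A, F)
print("standard mean:", p @ g_std)
print("direct mean:  ", p @ g_dir)
print("finite diff:  ", finite_difference_gradient(theta, H, A, F))
print("variances:    ", p @ (g_std - p @ g_std) ** 2, p @ (g_dir - p @ g_dir) ** 2)
```

Because the expectations are taken exactly over all configurations, any mismatch between the two means would be a genuine bias rather than sampling noise. The sketch covers only the separated case; the coupled two-head case that the premise above is concerned with would need the shared weights and the mixture included.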

Original abstract

Complex neural quantum states are difficult to optimize when their wavefunction phase carries gauge, chiral, fermionic, or topological structure. We show that the major failure mode is not only ansatz expressivity, but the Monte Carlo estimator used to learn this phase. For separated amplitude-phase states, differentiating the local energy at fixed samples gives a different unbiased estimator of the same variational Monte Carlo phase force, without changing the objective. We further extend the construction to coupled two-head networks by keeping the amplitude-gradient contribution and applying the direct derivative only to the phase path. An adaptive minimum-variance mixture interpolates between standard and direct estimators during training. Across flux ladders, chiral chains, two-dimensional flux cylinders, an interacting fermion ladder, shared-network controls, and a fractional quantum Hall benchmark, the resulting estimators reduce phase-gradient variance, suppress seed failures, and often move multi-percent standard-gradient plateaus to sub-percent accuracy.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The fixed-sample phase derivative gives a practical low-variance unbiased estimator for separated amplitude-phase networks and the adaptive mixture helps on coupled ones, with solid empirical gains across the benchmarks.

The core new piece is the direct differentiation of the local energy at fixed Monte Carlo samples to estimate the phase force. For amplitude-phase separated states this is unbiased by construction and avoids the usual high-variance phase gradient. They then keep the standard amplitude term, apply the direct derivative only on the phase path for coupled networks, and mix the two estimators with an adaptive weight that tracks sample variance. That construction is not in the earlier VMC literature they cite.

The experiments are the strongest part. On flux ladders, chiral chains, 2D cylinders, the interacting fermion ladder, shared-network controls, and the fractional quantum Hall state, the new estimators cut phase-gradient variance, reduce seed-to-seed failures, and push several runs from multi-percent plateaus down to sub-percent accuracy. Those are concrete, reproducible improvements on systems that matter for the subfield.

The soft spot is the bias question on coupled networks. Shared weights mean the local energy mixes amplitude and phase contributions, and the adaptive mixing weights are themselves sample-dependent, so it is not obvious that the expectation of the hybrid estimator still equals the true phase force. The abstract asserts it works, and the benchmarks look clean, but an explicit short derivation or a controlled counter-example check would remove the remaining doubt. It is a moderate rather than fatal gap.

This paper is aimed at people already running variational Monte Carlo with complex neural states. Anyone optimizing phase-sensitive ansätze will want to try the estimator. It is worth sending to referees because the empirical evidence is sharp and the fix is cheap to implement; a referee can ask for the bias clarification without derailing the work.

Referee Report

1 major / 0 minor

Summary. The manuscript proposes low-variance Monte Carlo estimators for the phase gradient in complex-valued neural quantum states. For amplitude-phase separated ansatzes, differentiating the local energy at fixed samples yields an alternative unbiased estimator of the variational phase force. The construction is extended to coupled two-head networks by retaining the amplitude-gradient term, applying the direct derivative only along the phase path, and interpolating via an adaptive minimum-variance mixture. Benchmarks on flux ladders, chiral chains, 2D flux cylinders, an interacting fermion ladder, shared-network controls, and a fractional quantum Hall state report reduced phase-gradient variance, fewer optimization failures, and improved accuracy relative to standard estimators.

Significance. If the unbiasedness of the hybrid estimator is rigorously established, the work targets a recognized practical bottleneck in variational Monte Carlo with complex NQS, offering a route to more stable optimization of states with non-trivial phase structure without altering the variational objective. The breadth of numerical tests across distinct physical systems constitutes a concrete strength.

major comments (1)

[Estimator construction for coupled networks] The central claim that the hybrid estimator (amplitude-gradient term retained plus direct phase-path derivative, combined by adaptive mixture) remains exactly unbiased when amplitude and phase paths share network weights is load-bearing. Shared weights couple amplitude and phase contributions inside the local energy, and the adaptive mixing weights are sample-dependent; it is not obvious that the fixed-sample derivative still commutes with the expectation. An explicit derivation (or counter-example) confirming that the expectation equals the true variational phase force under these conditions is required.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading and for identifying the need for a rigorous treatment of unbiasedness in the coupled-network hybrid estimator. We address this point below.

Referee: [Estimator construction for coupled networks] The central claim that the hybrid estimator (amplitude-gradient term retained plus direct phase-path derivative, combined by adaptive mixture) remains exactly unbiased when amplitude and phase paths share network weights is load-bearing. Shared weights couple amplitude and phase contributions inside the local energy, and the adaptive mixing weights are sample-dependent; it is not obvious that the fixed-sample derivative still commutes with the expectation. An explicit derivation (or counter-example) confirming that the expectation equals the true variational phase force under these conditions is required.

Authors: We agree that the sample-dependent adaptive mixing weights introduce a subtlety: because the weights correlate with the per-sample estimators, linearity of expectation alone does not immediately guarantee that the mixture remains exactly unbiased. The manuscript asserts unbiasedness for the separated-amplitude-phase case and extends the construction to coupled networks, but does not supply the explicit derivation requested. In the revised manuscript we will add a dedicated subsection (or appendix) that either (i) derives the conditions under which the hybrid estimator remains exactly unbiased or (ii) clarifies that the estimator is approximately unbiased in practice, with the numerical evidence across multiple systems serving as empirical support. We will also report any additional assumptions required for exact unbiasedness.

revision: yes
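
The subtlety about sample-dependent weights can be illustrated with a deliberately contrived toy that is enumerated exactly. It is not the paper's estimator and says nothing about the size of any bias there. Two estimators share the mean 1.0: a noisy one taking the values 0 or 3, and a constant one. All names and numbers are assumptions made for this sketch. The weight is the textbook minimum-variance weight, computed either from the same batch it is applied to or from an independent batch.

```python
import itertools

import numpy as np

VALUES = np.array([0.0, 3.0])    # possible values of the noisy estimator g1
PROBS = np.array([2 / 3, 1 / 3])  # chosen so that E[g1] = 1.0
G2 = 1.0                          # second estimator: always exactly 1.0


def weight(batch):
    """Minimum-variance weight on g2, estimated from one batch of g1 values."""
    d = batch - G2
    var_d = np.var(d)
    if var_d == 0.0:
        return 0.0  # no spread observed in this batch: fall back to g1 alone
    cov = np.mean((batch - batch.mean()) * (d - d.mean()))
    return float(np.clip(cov / var_d, 0.0, 1.0))


def mixture(batch, alpha):
    """Weighted combination of the batch mean of g1 and the constant g2."""
    return (1.0 - alpha) * batch.mean() + alpha * G2


def expectation(n, independent_weights):
    """Exact expectation of the mixture over all batches of size n."""
    index_sets = list(itertools.product(range(len(VALUES)), repeat=n))
    total = 0.0
    for i in index_sets:
        batch, p_batch = VALUES[list(i)], np.prod(PROBS[list(i)])
        if independent_weights:
            # Weight taken from a second, independent batch.
            for j in index_sets:
                other, p_other = VALUES[list(j)], np.prod(PROBS[list(j)])
                total += p_batch * p_other * mixture(batch, weight(other))
        else:
            # Weight taken from the same batch it is applied to.
            total += p_batch * mixture(batch, weight(batch))
    return total


print(expectation(2, independent_weights=False))  # 7/9: biased away from 1.0
print(expectation(2, independent_weights=True))   # 1.0 up to rounding
```

In this toy the same-batch weight switches to the constant estimator exactly when the batch shows spread, so the weight and the batch mean are correlated and the expectation shifts to 7/9. With the weight drawn from an independent batch, conditioning on the weight restores linearity of expectation and the mixture has mean 1.0.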

Circularity Check

0 steps flagged

No circularity: phase estimator derived directly from local-energy differentiation

The paper presents the low-variance estimator as obtained by differentiating the local energy with respect to phase parameters while holding Monte Carlo samples fixed, yielding an unbiased estimator of the variational phase force for separated amplitude-phase states; the extension to coupled networks retains the amplitude gradient term and mixes via an adaptive minimum-variance combination. No quoted equation or claim reduces this construction to a fitted parameter renamed as a prediction, a self-citation chain, an ansatz smuggled from prior work, or any other enumerated circular pattern. The derivation is therefore self-contained against the stated variational Monte Carlo objective.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The abstract supplies no explicit free parameters beyond the adaptive mixture weight, no new axioms beyond standard VMC sampling assumptions, and no invented entities.

free parameters (1)

adaptive mixture weight
The minimum-variance mixture interpolates between standard and direct estimators; its instantaneous value is presumably chosen or learned during training.
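
How such a weight could be set is not stated in the abstract. As a point of reference only, the textbook minimum-variance combination of two estimators with a common mean can be sketched as below. The function names, the clipping to [0, 1], and the synthetic data are assumptions made for this illustration and should not be read as the paper's rule.

```python
import numpy as np


def min_variance_weight(g_std, g_dir, eps=1e-12):
    """Weight alpha on g_dir minimizing the sample variance of the mixture.

    The mixture is (1 - alpha) * g_std + alpha * g_dir = g_std - alpha * d
    with d = g_std - g_dir, so the optimum is alpha = Cov(g_std, d) / Var(d).
    """
    d = g_std - g_dir
    d_centered = d - d.mean()
    var_d = np.mean(d_centered**2)
    cov = np.mean((g_std - g_std.mean()) * d_centered)
    # Clip so the result interpolates between the two estimators.
    return float(np.clip(cov / (var_d + eps), 0.0, 1.0))


def mix(g_std, g_dir, alpha):
    """Per-sample mixture of the standard and direct estimates."""
    return (1.0 - alpha) * g_std + alpha * g_dir


# Synthetic per-sample estimates with a shared mean of 1.0 (illustrative only).
rng = np.random.default_rng(1)
g_std = 1.0 + 3.0 * rng.normal(size=5000)  # noisier stand-in for the standard estimator
g_dir = 1.0 + 0.5 * rng.normal(size=5000)  # quieter stand-in for the direct estimator

alpha = min_variance_weight(g_std, g_dir)
g_mix = mix(g_std, g_dir, alpha)
print("alpha:", alpha)
print("variances:", np.var(g_std), np.var(g_dir), np.var(g_mix))
```

Because the mixture variance is a convex quadratic in the weight, the clipped optimum is never worse, in sample variance, than either estimator used alone. In practice the weight might be computed per parameter or smoothed across training steps; those are design choices this sketch leaves open.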

axioms (1)

domain assumption: Fixing Monte Carlo samples while differentiating the local energy produces an unbiased estimator of the phase force
This is the load-bearing step that allows the new estimator to be substituted without altering the variational objective.

[1] Carleo, G., Troyer, M.: Solving the quantum many-body problem with artificial neural networks. Science 355(6325), 602–606 (2017) https://doi.org/10.1126/science.aag2302

[2] Gao, X., Duan, L.-M.: Efficient representation of quantum many-body states with deep neural networks. Nature Communications 8, 662 (2017) https://doi.org/10.1038/s41467-017-00705-2

[3] Choo, K., Carleo, G., Regnault, N., Neupert, T.: Symmetries and many-body excitations with neural-network quantum states. Physical Review Letters 121, 167204 (2018) https://doi.org/10.1103/PhysRevLett.121.167204

[4] Sharir, O., Levine, Y., Wies, N., Carleo, G., Shashua, A.: Deep autoregressive models for the efficient variational simulation of many-body quantum systems. Physical Review Letters 124, 020503 (2020) https://doi.org/10.1103/PhysRevLett.124.020503

[5] Hibat-Allah, M., Ganahl, M., Hayward, L.E., Melko, R.G., Carrasquilla, J.: Recurrent neural network wave functions. Physical Review Research 2, 023358 (2020) https://doi.org/10.1103/PhysRevResearch.2.023358

[6] Nomura, Y., Imada, M.: Dirac-type nodal spin liquid revealed by refined quantum many-body solver using neural-network wave function, correlation ratio, and level spectroscopy. Physical Review X 11, 031034 (2021) https://doi.org/10.1103/PhysRevX.11.031034

[7] Vicentini, F., Hofmann, D., Szabó, A., Wu, D., Roth, C., Giuliani, C., Pescia, G., Nys, J., Vargas-Calderón, V., Astrakhantsev, N., Carleo, G.: NetKet 3: Machine learning toolbox for many-body quantum systems. SciPost Physics Codebases, 7 (2022) https://doi.org/10.21468/SciPostPhysCodeb.7

[8] Medvidović, M., Robledo Moreno, J.: Neural-network quantum states for many-body physics. The European Physical Journal Plus 139, 631 (2024)

[9] Lange, H., Walle, A., Abedinnia, A., Bohrdt, A.: From architectures to applications: A review of neural quantum states. Quantum Science and Technology 9(4), 040501 (2024) https://doi.org/10.1088/2058-9565/ad7168

[10] Han, J., Zhang, L., E, W.: Solving many-electron Schrödinger equation using deep neural networks. Journal of Computational Physics 399, 108929 (2019) https://doi.org/10.1016/j.jcp.2019.108929

[11] Pfau, D., Spencer, J.S., Matthews, A.G.D.G., Foulkes, W.M.C.: Ab initio solution of the many-electron Schrödinger equation with deep neural networks. Physical Review Research 2, 033429 (2020) https://doi.org/10.1103/PhysRevResearch.2.033429

[12] Hermann, J., Schätzle, Z., Noé, F.: Deep-neural-network solution of the electronic Schrödinger equation. Nature Chemistry 12, 891–897 (2020) https://doi.org/10.1038/s41557-020-0544-y

[13] Li, X., Li, Z., Chen, J.: Ab initio calculation of real solids via neural network ansatz. Nature Communications 13, 7895 (2022) https://doi.org/10.1038/s41467-022-35627-1

[14] Scherbela, M., Reisenhofer, R., Gerard, L., Marquetand, P., Grohs, P.: Solving the electronic Schrödinger equation for multiple nuclear geometries with weight-sharing deep neural networks. Nature Computational Science 2(5), 331–341 (2022) https://doi.org/10.1038/s43588-022-00228-x

[15] Li, R., Ye, H., Jiang, D., Wen, X., Wang, C., Li, Z., Li, X., He, D., Chen, J., Ren, W., Wang, L.: A computational framework for neural network-based variational Monte Carlo with forward Laplacian. Nature Machine Intelligence 6(2), 209–219 (2024) https://doi.org/10.1038/s42256-024-00794-x

[16] Li, Z., Lu, Z., Li, R., Wen, X., Li, X., Wang, L., Chen, J., Ren, W.: Spin-symmetry-enforced solution of the many-body Schrödinger equation with a deep neural network. Nature Computational Science 4(12), 910–919 (2024) https://doi.org/10.1038/s43588-024-00730-4

[17] Gerard, L., Scherbela, M., Sutterud, H., Foulkes, W.M.C., Grohs, P.: Transferable neural wavefunctions for solids. Nature Computational Science 5(12), 1147–1157 (2025) https://doi.org/10.1038/s43588-025-00872-z

[18] Tang, Z., Chen, H., Li, Y., Qian, Y., Wang, Y., Fu, W., Li, J., Si, C., Duan, W., Chen, J., Xu, Y.: Deep-learning electronic structure calculations. Nature Computational Science 5(12), 1133–1146 (2025) https://doi.org/10.1038/s43588-025-00932-4

[19] Luo, D., Dai, D.D., Fu, L.: Pairing-based graph neural network for simulating quantum materials. arXiv preprint arXiv:2311.02143 (2023) arXiv:2311.02143 [cond-mat.str-el]

[20] Teng, Y., Dai, D.D., Fu, L.: Solving the fractional quantum Hall problem with self-attention neural networks. Physical Review B 111, 205117 (2025) https://doi.org/10.1103/PhysRevB.111.205117

[21] Qian, Y., Zhao, T., Zhang, J., Xiang, T., Li, X., Chen, J.: Describing Landau level mixing in fractional quantum Hall states with deep learning. Physical Review Letters 134, 176503 (2025) https://doi.org/10.1103/PhysRevLett.134.176503

[22] Zaklama, T., Guerci, D., Fu, L.: Attention-based foundation model for quantum states. arXiv:2512.11962 (2025)

[23] Zaklama, T., Geier, M., Fu, L.: Large electron model: A universal ground state predictor. arXiv preprint arXiv:2603.02346 (2026) arXiv:2603.02346 [cond-mat.str-el]

[24] Nazaryan, K., Fu, L.: QERNEL: A scalable large electron model. arXiv preprint arXiv:2604.26018 (2026) arXiv:2604.26018 [cond-mat.str-el]

[25] Troyer, M., Wiese, U.-J.: Computational complexity and fundamental limitations to fermionic quantum Monte Carlo simulations. Physical Review Letters 94, 170201 (2005) https://doi.org/10.1103/PhysRevLett.94.170201

[26] Marshall, W.: Antiferromagnetism. Proceedings of the Royal Society of London. Series A 232, 48–68 (1955) https://doi.org/10.1098/rspa.1955.0200

[27] Westerhout, T., Astrakhantsev, N., Tikhonov, K.S., Katsnelson, M.I., Bagrov, A.A.: Generalization properties of neural network approximations to frustrated magnet ground states. Nature Communications 11, 1593 (2020) https://doi.org/10.1038/s41467-020-15402-w

[28] Szabó, A., Castelnovo, C.: Neural network wave functions and the sign problem. Physical Review Research 2, 033075 (2020) https://doi.org/10.1103/PhysRevResearch.2.033075

[29] Bukov, M., Schmitt, M., Dupont, M.: Learning the ground state of a non-stoquastic quantum Hamiltonian in a rugged neural network landscape. SciPost Physics 10, 147 (2021) https://doi.org/10.21468/SciPostPhys.10.6.147

[30] Chen, A., Choo, K., Astrakhantsev, N., Neupert, T.: Neural network evolution strategy for solving quantum sign structures. Physical Review Research 4, 022026 (2022) https://doi.org/10.1103/PhysRevResearch.4.L022026

[31] Orignac, E., Giamarchi, T.: Meissner effect in a bosonic ladder. Physical Review B 64, 144515 (2001) https://doi.org/10.1103/PhysRevB.64.144515

[32] Atala, M., Aidelsburger, M., Lohse, M., Barreiro, J.T., Paredes, B., Bloch, I.: Observation of chiral currents with ultracold atoms in bosonic ladders. Nature Physics 10, 588–593 (2014) https://doi.org/10.1038/nphys2998

[33] Hügel, D., Paredes, B.: Chiral ladders and the edges of quantum Hall insulators. Physical Review A 89, 023619 (2014) https://doi.org/10.1103/PhysRevA.89.023619

[34] Ledinauskas, E., Anisimovas, E.: Universal performance gap of neural quantum states applied to the Hofstadter–Bose–Hubbard model. SciPost Physics 18, 011 (2025) https://doi.org/10.21468/SciPostPhys.18.1.011

[35] Döschl, F., Palm, F.A., Lange, H., Grusdt, F., Bohrdt, A.: Neural network quantum states for the interacting Hofstadter model with higher local occupations and long-range interactions. Physical Review B 111, 045408 (2025) https://doi.org/10.1103/PhysRevB.111.045408

[36] Wei, C., Mkhitaryan, V.V., Sedrakyan, T.A.: Unveiling chiral states in the XXZ chain: Finite-size scaling probing symmetry-enriched c = 1 conformal field theories. Journal of High Energy Physics 2024(6), 125 (2024) https://doi.org/10.1007/JHEP06(2024)125

[37] Schmitt, M., Heyl, M.: Quantum many-body dynamics in two dimensions with artificial neural networks. Physical Review Letters 125, 100503 (2020) https://doi.org/10.1103/PhysRevLett.125.100503

[38] Chen, A., Heyl, M.: Empowering deep neural quantum states through efficient optimization. Nature Physics 20, 1476–1481 (2024) https://doi.org/10.1038/s41567-024-02566-1

[39] Zhang, Y.-H., Di Ventra, M.: Transformer quantum state: A multipurpose model for quantum many-body problems. Physical Review B 107, 075147 (2023) https://doi.org/10.1103/PhysRevB.107.075147

[40] Ou, X., Huang, T., Ozoliņš, V.: Improving neural network performance for solving quantum sign structure. Physical Review B 112, 165122 (2025) https://doi.org/10.1103/fqxr-r8vw arXiv:2510.02051

[41] Misery, A., Gravina, L., Santini, A., Vicentini, F.: Looking elsewhere: improving variational Monte Carlo gradients by importance sampling. arXiv preprint arXiv:2507.05352 (2025) https://doi.org/10.48550/arXiv.2507.05352 arXiv:2507.05352 [quant-ph]

[42] Sorella, S.: Green function Monte Carlo with stochastic reconfiguration. Physical Review Letters 80, 4558–4561 (1998) https://doi.org/10.1103/PhysRevLett.80.4558

[43] Becca, F., Sorella, S.: Quantum Monte Carlo Approaches for Correlated Systems. Cambridge University Press, Cambridge (2017). https://doi.org/10.1017/9781316417041

[44] Stokes, J., Izaac, J., Killoran, N., Carleo, G.: Quantum natural gradient. Quantum 4, 269 (2020) https://doi.org/10.22331/q-2020-05-25-269

[45] Williams, R.J.: Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning 8, 229–256 (1992) https://doi.org/10.1007/BF00992696

[46] Mohamed, S., Rosca, M., Figurnov, M., Mnih, A.: Monte Carlo gradient estimation in machine learning. Journal of Machine Learning Research 21(132), 1–62 (2020)

[47] Kingma, D.P., Welling, M.: Auto-encoding variational Bayes. In: International Conference on Learning Representations (ICLR) (2014). arXiv:1312.6114

[48] Assaraf, R., Caffarel, M.: Zero-variance principle for Monte Carlo algorithms. Physical Review Letters 83, 4682–4685 (1999) https://doi.org/10.1103/PhysRevLett.83.4682

[49] White, S.R.: Density matrix formulation for quantum renormalization groups. Physical Review Letters 69, 2863–2866 (1992) https://doi.org/10.1103/PhysRevLett.69.2863

[50] Schollwöck, U.: The density-matrix renormalization group in the age of matrix product states. Annals of Physics 326, 96–192 (2011) https://doi.org/10.1016/j.aop.2010.09.012

[51] Owen, A.B.: Monte Carlo Theory, Methods and Examples. Stanford University (2013). Available at https://artowen.su.domains/mc/

[52] Greensmith, E., Bartlett, P.L., Baxter, J.: Variance reduction techniques for gradient estimates in reinforcement learning. Journal of Machine Learning Research 5, 1471–1530 (2004)

[53] Tucker, G., Mnih, A., Maddison, C.J., Lawson, J., Sohl-Dickstein, J.: REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models. In: Advances in Neural Information Processing Systems 30 (NeurIPS), pp. 2627–2636 (2017)

[54] Grathwohl, W., Choi, D., Wu, Y., Roeder, G., Duvenaud, D.: Backpropagation through the void: Optimizing control variates for black-box gradient estimation. In: International Conference on Learning Representations (ICLR) (2018). arXiv:1711.00123

[55] Ranganath, R., Gerrish, S., Blei, D.M.: Black box variational inference. In: Proceedings of the 17th International Conference on Artificial Intelligence and Statistics (AISTATS). Proceedings of Machine Learning Research, vol. 33, pp. 814–822 (2014)

[56] Bottou, L., Curtis, F.E., Nocedal, J.: Optimization methods for large-scale machine learning. SIAM Review 60(2), 223–311 (2018) https://doi.org/10.1137/16M1080173

[57] Lee, J.D., Simchowitz, M., Jordan, M.I., Recht, B.: Gradient descent only converges to minimizers. In: 29th Annual Conference on Learning Theory (COLT). Proceedings of Machine Learning Research, vol. 49, pp. 1246–1257 (2016)

[58] Absil, P.-A., Mahony, R., Andrews, B.: Convergence of the iterates of descent methods for analytic cost functions. SIAM Journal on Optimization 16(2), 531–547 (2005) https://doi.org/10.1137/040605266

[59] Helmke, U., Moore, J.B.: Optimization and Dynamical Systems. Communications and Control Engineering. Springer, London (1994)

Extended Data Fig. 1 Generality to interacting fermions. A 100-site spinless-fermion two-leg flux ladder (nearest-neighbour interaction V = 2, Φ = 0.5π, ten seeds), Jordan–Wigner mapped to spins: relative-error training curves, tai...