Low-variance estimators overcome the phase-gradient bottleneck in complex-valued neural quantum states
Pith reviewed 2026-06-27 04:48 UTC · model grok-4.3
The pith
Differentiating the local energy at fixed Monte Carlo samples yields an unbiased low-variance estimator of the phase force for complex neural quantum states.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For separated amplitude-phase states, differentiating the local energy at fixed samples gives a different unbiased estimator of the same variational Monte Carlo phase force, without changing the objective. The method extends to coupled two-head networks by keeping the amplitude-gradient contribution and applying the direct derivative only to the phase path, then interpolating between the two estimators with an adaptive minimum-variance mixture during training.
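The equality of the two estimators can be written out explicitly. The derivation below is a sketch under assumptions that the summary above does not spell out: the symbols $A$, $\varphi_\theta$, $p$ and $E_{\mathrm{loc}}$ are notation introduced here for illustration, the amplitude $A(s) > 0$ is taken to be independent of the phase parameters $\theta$, and $H$ is Hermitian.

```latex
% Assumed notation (introduced for illustration, not taken from the paper):
\psi_\theta(s) = A(s)\, e^{i\varphi_\theta(s)}, \qquad
p(s) = \frac{A(s)^2}{\sum_{s'} A(s')^2}, \qquad
E_{\mathrm{loc}}(s;\theta) = \sum_{s'} H_{ss'}\, \frac{\psi_\theta(s')}{\psi_\theta(s)} .

% Objective: the sampling distribution p does not depend on \theta.
E(\theta) = \sum_s p(s)\, \operatorname{Re} E_{\mathrm{loc}}(s;\theta) .

% Standard (score-function) form, using \partial_\theta \ln \psi_\theta = i\, \partial_\theta \varphi_\theta :
\partial_\theta E = 2\, \mathbb{E}_{s \sim p}\!\left[ \operatorname{Im} E_{\mathrm{loc}}(s;\theta)\, \partial_\theta \varphi_\theta(s) \right] .

% Direct form: differentiate E_loc at fixed samples s.
\partial_\theta E = \mathbb{E}_{s \sim p}\!\left[ \operatorname{Re}\, \partial_\theta E_{\mathrm{loc}}(s;\theta) \right],
\qquad
\partial_\theta E_{\mathrm{loc}}(s;\theta)
  = i \sum_{s'} H_{ss'}\, \frac{\psi_\theta(s')}{\psi_\theta(s)}
    \left[ \partial_\theta \varphi_\theta(s') - \partial_\theta \varphi_\theta(s) \right] .
```

Both expressions have the same expectation under $p$ but differ sample by sample, which is why their variances can differ.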
What carries the argument
Direct derivative of the local energy with respect to phase parameters at fixed Monte Carlo samples, serving as an alternative unbiased estimator of the variational phase force.
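A small numerical check can make this concrete. The sketch below is not the paper's code: it assumes a toy Hilbert space of dimension `n`, a random Hermitian `H`, a fixed positive amplitude `A`, and a hypothetical linear phase model `phi(s) = F[s] @ theta`. It evaluates the exact expectation of the standard and the direct per-sample terms and compares both with a finite-difference gradient of the energy.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 3  # toy Hilbert-space dimension and number of phase parameters (assumed)

# Hypothetical toy problem: random Hermitian H, fixed amplitude A, linear phase model.
M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H = (M + M.conj().T) / 2
A = rng.uniform(0.5, 1.5, size=n)
F = rng.normal(size=(n, k))          # phi_theta(s) = F[s] @ theta
theta = rng.normal(size=k)
p = A**2 / np.sum(A**2)              # sampling distribution, independent of theta


def local_energy(theta):
    psi = A * np.exp(1j * (F @ theta))
    return (H @ psi) / psi


def energy(theta):
    return float(np.sum(p * local_energy(theta).real))


def standard_terms(theta):
    # Per-sample score-function terms: 2 * Im(E_loc(s)) * d phi(s) / d theta.
    return 2.0 * local_energy(theta).imag[:, None] * F


def direct_terms(theta):
    # Per-sample terms: Re d E_loc(s) / d theta with the sample s held fixed.
    psi = A * np.exp(1j * (F @ theta))
    ratio = H * psi[None, :] / psi[:, None]   # H[s, t] * psi(t) / psi(s)
    dphi = F[None, :, :] - F[:, None, :]      # d(phi(t) - phi(s)) / d theta
    return np.einsum("st,stk->sk", 1j * ratio, dphi).real


# Exact expectations over p (no sampling noise), plus a finite-difference reference.
g_standard = p @ standard_terms(theta)
g_direct = p @ direct_terms(theta)
eps = 1e-6
g_fd = np.array([
    (energy(theta + eps * e) - energy(theta - eps * e)) / (2 * eps)
    for e in np.eye(k)
])
print(g_standard, g_direct, g_fd)
```

The check uses exact sums over all configurations, so it tests only that the two expectations agree. It says nothing about which estimator has lower variance for a given Hamiltonian.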
If this is right
- The new estimator reduces variance of the phase gradient in variational Monte Carlo training of complex neural quantum states.
- It suppresses seed-dependent optimization failures in systems with gauge, chiral, or topological phase structure.
- Training reaches sub-percent accuracy on benchmarks where the standard estimator plateaus at several percent error.
- The adaptive mixture construction applies to both separated and shared-weight network architectures.
Where Pith is reading between the lines
- If the fixed-sample estimator maintains its variance reduction at larger system sizes, it could allow reliable optimization of neural states for phases that were previously inaccessible due to gradient noise.
- The mixture approach might be generalized to reduce variance in estimators for other variational parameters beyond phase.
- Applying the estimator to models with explicit anyonic or non-Abelian statistics could test whether the variance benefit persists when phase structure is more intricate.
Load-bearing premise
Fixing the Monte Carlo samples while differentiating the local energy with respect to phase parameters keeps the estimator unbiased for the phase force even after the adaptive mixture is introduced and when amplitude and phase share network weights.
What would settle it
A direct numerical check showing that the expectation of the fixed-sample local-energy derivative deviates from the standard phase-force expectation in a coupled two-head network would disprove unbiasedness.
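One way such a check could look is sketched below. It rests on assumptions not found in the source: a toy model in which log-amplitude and phase share the same weights `w` through hypothetical linear features `G` and `F`, and a reading of the hybrid estimator as "score-function term for the amplitude path plus fixed-sample derivative for the phase path". The expectation is computed exactly by summing over all configurations and compared with a finite-difference gradient.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 6, 3  # toy Hilbert-space dimension and number of shared weights (assumed)

M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H = (M + M.conj().T) / 2
G = 0.3 * rng.normal(size=(n, k))    # hypothetical log-amplitude features
F = rng.normal(size=(n, k))          # hypothetical phase features
w = rng.normal(size=k)               # weights shared by both heads


def wavefunction(w):
    # Coupled ansatz: the same w enters the amplitude and the phase.
    return np.exp(G @ w + 1j * (F @ w))


def energy(w):
    psi = wavefunction(w)
    return float((psi.conj() @ H @ psi).real / (psi.conj() @ psi).real)


def hybrid_expectation(w):
    psi = wavefunction(w)
    p = np.abs(psi) ** 2 / np.sum(np.abs(psi) ** 2)
    ratio = H * psi[None, :] / psi[:, None]   # H[s, t] * psi(t) / psi(s)
    e_loc = ratio.sum(axis=1)
    e_mean = np.sum(p * e_loc.real)
    # Amplitude path: standard score-function term, kept as is.
    amp = 2.0 * (e_loc.real - e_mean)[:, None] * G
    # Phase path: derivative of E_loc through the phase only, samples held fixed.
    dphi = F[None, :, :] - F[:, None, :]
    phase = np.einsum("st,stk->sk", 1j * ratio, dphi).real
    return p @ (amp + phase)


eps = 1e-6
g_fd = np.array([
    (energy(w + eps * e) - energy(w - eps * e)) / (2 * eps)
    for e in np.eye(k)
])
print(hybrid_expectation(w), g_fd)
```

Agreement here only covers the hybrid estimator at fixed parameters in this toy setting. It does not address the adaptive, sample-dependent mixing weight, which is the part of the premise the referee comment questions.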
Original abstract
Complex neural quantum states are difficult to optimize when their wavefunction phase carries gauge, chiral, fermionic, or topological structure. We show that the major failure mode is not only ansatz expressivity, but the Monte Carlo estimator used to learn this phase. For separated amplitude-phase states, differentiating the local energy at fixed samples gives a different unbiased estimator of the same variational Monte Carlo phase force, without changing the objective. We further extend the construction to coupled two-head networks by keeping the amplitude-gradient contribution and applying the direct derivative only to the phase path. An adaptive minimum-variance mixture interpolates between standard and direct estimators during training. Across flux ladders, chiral chains, two-dimensional flux cylinders, an interacting fermion ladder, shared-network controls, and a fractional quantum Hall benchmark, the resulting estimators reduce phase-gradient variance, suppress seed failures, and often move multi-percent standard-gradient plateaus to sub-percent accuracy.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes low-variance Monte Carlo estimators for the phase gradient in complex-valued neural quantum states. For amplitude-phase separated ansatzes, differentiating the local energy at fixed samples yields an alternative unbiased estimator of the variational phase force. The construction is extended to coupled two-head networks by retaining the amplitude-gradient term, applying the direct derivative only along the phase path, and interpolating via an adaptive minimum-variance mixture. Benchmarks on flux ladders, chiral chains, 2D flux cylinders, an interacting fermion ladder, shared-network controls, and a fractional quantum Hall state report reduced phase-gradient variance, fewer optimization failures, and improved accuracy relative to standard estimators.
Significance. If the unbiasedness of the hybrid estimator is rigorously established, the work targets a recognized practical bottleneck in variational Monte Carlo with complex NQS, offering a route to more stable optimization of states with non-trivial phase structure without altering the variational objective. The breadth of numerical tests across distinct physical systems constitutes a concrete strength.
major comments (1)
- [Estimator construction for coupled networks] The central claim that the hybrid estimator (amplitude-gradient term retained plus direct phase-path derivative, combined by adaptive mixture) remains exactly unbiased when amplitude and phase paths share network weights is load-bearing. Shared weights couple amplitude and phase contributions inside the local energy, and the adaptive mixing weights are sample-dependent; it is not obvious that the fixed-sample derivative still commutes with the expectation. An explicit derivation (or counter-example) confirming that the expectation equals the true variational phase force under these conditions is required.
Simulated Author's Rebuttal
We thank the referee for their careful reading and for identifying the need for a rigorous treatment of unbiasedness in the coupled-network hybrid estimator. We address this point below.
Point-by-point responses
Referee: [Estimator construction for coupled networks] The central claim that the hybrid estimator (amplitude-gradient term retained plus direct phase-path derivative, combined by adaptive mixture) remains exactly unbiased when amplitude and phase paths share network weights is load-bearing. Shared weights couple amplitude and phase contributions inside the local energy, and the adaptive mixing weights are sample-dependent; it is not obvious that the fixed-sample derivative still commutes with the expectation. An explicit derivation (or counter-example) confirming that the expectation equals the true variational phase force under these conditions is required.
Authors: We agree that the sample-dependent adaptive mixing weights introduce a subtlety: because the weights correlate with the per-sample estimators, linearity of expectation alone does not immediately guarantee that the mixture remains exactly unbiased. The manuscript asserts unbiasedness for the separated amplitude-phase case and extends the construction to coupled networks, but does not supply the explicit derivation requested. In the revised manuscript we will add a dedicated subsection (or appendix) that either (i) derives the conditions under which the hybrid estimator remains exactly unbiased or (ii) clarifies that the estimator is approximately unbiased in practice, with the numerical evidence across multiple systems serving as empirical support. We will also report any additional assumptions required for exact unbiasedness.
Revision: yes
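The subtlety about sample-dependent weights can be illustrated with a generic minimum-variance combination of two unbiased estimators. The sketch below is not the paper's adaptive rule, which the text above does not specify. It assumes a single scalar gradient component, synthetic Gaussian per-sample terms with a made-up common mean `true_force`, a hypothetical helper `min_variance_weight`, and a weight clipped to [0, 1]. Choosing the weight on a separate pilot batch is one standard way to keep it independent of the samples it multiplies.

```python
import numpy as np


def min_variance_weight(direct, standard):
    """Return lam in [0, 1] minimising Var[lam * direct + (1 - lam) * standard]."""
    diff = direct - standard
    var_diff = np.var(diff)
    if var_diff == 0.0:
        return 0.0  # the two estimators coincide, so any weight gives the same result
    # Var[standard + lam * diff] is quadratic in lam with minimum at -Cov/Var.
    cov = np.mean((standard - standard.mean()) * (diff - diff.mean()))
    return float(np.clip(-cov / var_diff, 0.0, 1.0))


rng = np.random.default_rng(0)
true_force = 0.7  # made-up common mean of both estimators


def draw_batch(size):
    # Synthetic per-sample terms: same mean, different noise levels (assumed).
    standard = true_force + rng.normal(scale=3.0, size=size)
    direct = true_force + rng.normal(scale=0.5, size=size)
    return direct, standard


# The weight comes from a pilot batch and the estimate from a fresh batch, so lam
# is independent of the samples it multiplies and linearity of expectation applies.
pilot_direct, pilot_standard = draw_batch(2000)
lam = min_variance_weight(pilot_direct, pilot_standard)

direct, standard = draw_batch(2000)
mixed = lam * direct + (1.0 - lam) * standard
estimate = mixed.mean()
print(lam, estimate)
```

If the weight were instead computed from the same batch it multiplies, it would be correlated with the per-sample terms, and the argument for exact unbiasedness would need the extra derivation the authors promise.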
Circularity Check
No circularity: phase estimator derived directly from local-energy differentiation
Full rationale
The paper presents the low-variance estimator as obtained by differentiating the local energy with respect to phase parameters while holding Monte Carlo samples fixed, yielding an unbiased estimator of the variational phase force for separated amplitude-phase states; the extension to coupled networks retains the amplitude gradient term and mixes via an adaptive minimum-variance combination. No quoted equation or claim reduces this construction to a fitted parameter renamed as a prediction, a self-citation chain, an ansatz smuggled from prior work, or any other enumerated circular pattern. The derivation is therefore self-contained against the stated variational Monte Carlo objective.
Axiom & Free-Parameter Ledger
free parameters (1)
- adaptive mixture weight
axioms (1)
- domain assumption: Fixing Monte Carlo samples while differentiating the local energy produces an unbiased estimator of the phase force
Extended Data Fig. 1. Generality to interacting fermions. A 100-site spinless-fermion two-leg flux ladder (nearest-neighbour interaction V = 2, Φ = 0.5π, ten seeds), Jordan–Wigner mapped to spins: relative-error training curves, tai...