Enhancing Neural-Network Variational Monte Carlo through Basis Transformation
Pith reviewed 2026-05-10 07:36 UTC · model grok-4.3
The pith
A learnable Gaussian basis with one locality parameter α makes neural-network variational Monte Carlo represent the ground state more accurately without enlarging the network.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By writing the many-body wave function in a Gaussian basis controlled by a single learnable locality parameter α, the authors reshape the target ground state so that standard neural-network ansatzes achieve lower variational energies on the three-dimensional homogeneous electron gas. The transformation preserves the variational upper bound while introducing negligible overhead, and it yields a sharper estimate of the density at which the Fermi liquid gives way to the Wigner crystal when message-passing networks are employed.
What carries the argument
A Gaussian basis transformation controlled by one learnable locality parameter α, which rescales the spatial extent of the orbitals and thereby alters the representation of the ground-state wave function.
If this is right
- The variational energy is lowered for both FermiNet and message-passing neural networks on the three-dimensional homogeneous electron gas.
- Message-passing networks locate the Fermi-liquid to Wigner-crystal transition density more precisely when the basis parameter is optimized.
- The method adds only one extra variational parameter and combines directly with any existing neural-network architecture.
- Accuracy improvements can be obtained by making the target state easier to represent rather than by increasing the complexity of the neural ansatz.
Where Pith is reading between the lines
- The same single-parameter basis change could be tested on molecular systems to check whether optimal α tracks physical length scales such as bond distances.
- Joint optimization of α together with neural-network weights may produce larger gains than optimizing either alone.
- Analogous locality parameters might be introduced in other basis sets, such as plane waves or Slater-type orbitals, for continuous-space variational calculations.
Load-bearing premise
The optimized Gaussian basis must still span the exact ground state and must not introduce any uncontrolled bias into the variational energy estimate.
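The premise can be illustrated on a much simpler problem than the electron gas. The sketch below is a toy, not the paper's construction: it assumes a 1D harmonic oscillator (H = -½ d²/dx² + ½ x²) and a Gaussian trial state ψ(x) = exp(-αx²), for which the local energy is known in closed form. The function names (`local_energy`, `vmc_energy`, `exact_energy`) are hypothetical. It shows that every value of the locality parameter gives an energy at or above the exact ground-state value of 0.5, with equality at α = 0.5.

```python
import math
import random

def local_energy(x, alpha):
    """Local energy of psi(x) = exp(-alpha * x**2) for H = -1/2 d^2/dx^2 + 1/2 x^2."""
    return alpha + x * x * (0.5 - 2.0 * alpha * alpha)

def vmc_energy(alpha, n_samples=20000, seed=0):
    """Monte Carlo estimate of <H>, sampling x from |psi|^2 (a normal density)."""
    rng = random.Random(seed)
    sigma = math.sqrt(1.0 / (4.0 * alpha))  # |psi|^2 is proportional to exp(-2 alpha x^2)
    total = 0.0
    for _ in range(n_samples):
        total += local_energy(rng.gauss(0.0, sigma), alpha)
    return total / n_samples

def exact_energy(alpha):
    """Closed-form expectation value: alpha/2 + 1/(8 alpha)."""
    return 0.5 * alpha + 1.0 / (8.0 * alpha)

EXACT_GROUND_STATE = 0.5  # exact ground-state energy of the toy Hamiltonian

# Scan a few locality parameters; each estimate should sit at or above 0.5
# up to Monte Carlo noise, with the minimum at alpha = 0.5.
energies = {a: vmc_energy(a) for a in (0.2, 0.35, 0.5, 0.8, 1.2)}

if __name__ == "__main__":
    for a, e in energies.items():
        print(f"alpha={a:4.2f}  E_vmc={e:.4f}  E_exact={exact_energy(a):.4f}")
```

In this toy the bound holds for any α because the trial state stays normalizable and the Hamiltonian is unchanged; whether the analogous statement carries over to the transformed many-electron ansatz is exactly what the premise above asks.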
What would settle it
An independent variational Monte Carlo run on the three-dimensional homogeneous electron gas at fixed density that finds the energy with optimized α to be higher than the energy obtained with the identical neural network in the original basis.
Original abstract
Neural-network variational Monte Carlo (NNVMC) has emerged as a powerful tool for solving quantum many-body problems, yet systematic pathways for improving its accuracy remain largely heuristic. Here, we introduce a physically motivated basis transformation for NNVMC that enhances variational expressivity without increasing the complexity of the neural-network ansatz itself. By formulating the many-body wave function in a Gaussian basis, we introduce a single learnable locality parameter, $\alpha$, that reshapes the target ground state into a more learnable representation. This approach introduces minimal computational overhead and can be readily combined with existing neural-network architectures. Using the three-dimensional homogeneous electron gas as a benchmark, we show that the optimized basis transformation consistently lowers the variational energy for both FermiNet and message-passing neural-network architectures. Notably, for the latter, it enables a more precise determination of the Fermi liquid to Wigner crystal phase transition. More broadly, our results highlight basis transformation as a new route to improving NNVMC in continuous space, showing that accuracy can be enhanced not only by refining the ansatz but also by making the target ground state easier to represent.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a basis transformation for neural-network variational Monte Carlo (NNVMC) in which the many-body wave function is expressed in a Gaussian basis controlled by a single learnable locality parameter α. This transformation is applied to the 3D homogeneous electron gas and is shown to lower variational energies for both FermiNet and message-passing neural-network ansatzes while improving resolution of the Fermi-liquid to Wigner-crystal transition, all without increasing the complexity of the neural network itself.
Significance. If the reported energy reductions and improved phase-transition resolution hold under the stated conditions, the work supplies a lightweight, physically motivated route to enhancing NNVMC expressivity by reshaping the target state rather than enlarging the ansatz. The approach preserves the variational upper bound because α is optimized jointly inside the Monte Carlo loop, introduces only one extra parameter, and is compatible with existing architectures. These features constitute a genuine addition to the toolkit for continuous-space quantum many-body calculations.
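To make "optimized jointly inside the Monte Carlo loop" concrete, here is a minimal sketch under strong assumptions: a 1D harmonic oscillator with trial state ψ(x) = exp(-αx²), no neural network, and plain gradient descent on α alone. It uses the standard VMC gradient estimator for real wave functions, dE/dα = 2 Cov(E_L, ∂ln ψ/∂α). The function names, learning rate, and sample counts are illustrative choices, not values from the manuscript.

```python
import math
import random

def local_energy(x, alpha):
    """Local energy of psi(x) = exp(-alpha * x**2) for H = -1/2 d^2/dx^2 + 1/2 x^2."""
    return alpha + x * x * (0.5 - 2.0 * alpha * alpha)

def energy_and_gradient(alpha, rng, n_samples=2000):
    """Estimate E and dE/dalpha from fresh samples of the current |psi|^2."""
    sigma = math.sqrt(1.0 / (4.0 * alpha))
    xs = [rng.gauss(0.0, sigma) for _ in range(n_samples)]
    e_loc = [local_energy(x, alpha) for x in xs]
    dlog = [-x * x for x in xs]  # d ln(psi) / d alpha
    e_mean = sum(e_loc) / n_samples
    d_mean = sum(dlog) / n_samples
    cov = sum(e * d for e, d in zip(e_loc, dlog)) / n_samples - e_mean * d_mean
    return e_mean, 2.0 * cov

def optimize_alpha(alpha=1.5, steps=150, learning_rate=0.1, seed=0):
    """Gradient descent on alpha; samples are redrawn at every step."""
    rng = random.Random(seed)
    history = []
    for _ in range(steps):
        energy, grad = energy_and_gradient(alpha, rng)
        history.append(energy)
        alpha = max(alpha - learning_rate * grad, 1e-3)  # keep psi normalizable
    return alpha, history

if __name__ == "__main__":
    alpha_opt, history = optimize_alpha()
    print(f"alpha -> {alpha_opt:.4f}, energy {history[0]:.4f} -> {history[-1]:.4f}")
```

Because each step samples from the wave function at the current α, every recorded energy is an ordinary variational estimate at that α; in a full calculation the network weights would be updated in the same loop.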
major comments (2)
- [§2.3, Eq. (12)] The chain-rule expression for the kinetic-energy operator after the Gaussian transformation is given, but the manuscript does not explicitly verify that the resulting local energy remains real-valued and that the Monte Carlo estimator remains unbiased when α is updated on the fly; a short numerical check or analytic argument confirming this would strengthen the central claim.
- [§4.1, Table 1] The energy differences with and without the basis transformation are reported to several decimal places, yet no breakdown is provided of the statistical uncertainty arising from finite Monte Carlo sampling versus the uncertainty in the optimized α itself; this makes it hard to judge whether the quoted improvements exceed the combined error bars.
minor comments (3)
- [Abstract] The abstract states that energies are lowered and the phase transition is better resolved but supplies no numerical values or error bars; a single sentence quantifying the typical energy gain (e.g., “∼0.001 Ha per electron”) would improve readability.
- [Figure 3] Caption: the color scale for the order-parameter plot is not labeled with units or a numerical range, making it difficult to compare the sharpness of the transition with and without the transformation.
- [§3.2] The optimization schedule for α (learning rate, update frequency, initialization) is described only qualitatively; a concise table or paragraph listing the hyper-parameters used for all reported runs would aid reproducibility.
Simulated Author's Rebuttal
We thank the referee for the positive assessment and the recommendation for minor revision. The comments identify useful opportunities to strengthen the presentation of the kinetic-energy implementation and the error analysis. We address each point below and have incorporated revisions to the manuscript.
- Referee: [§2.3, Eq. (12)] The chain-rule expression for the kinetic-energy operator after the Gaussian transformation is given, but the manuscript does not explicitly verify that the resulting local energy remains real-valued and that the Monte Carlo estimator remains unbiased when α is updated on the fly; a short numerical check or analytic argument confirming this would strengthen the central claim.
  Authors: We thank the referee for highlighting this point. The Gaussian transformation is a real multiplicative prefactor, and the neural-network ansatz for the homogeneous electron gas is real-valued by construction. Application of the chain rule therefore yields a strictly real local energy. When α is optimized jointly with the network parameters inside the Monte Carlo loop, the estimator remains unbiased because each sample is drawn from the instantaneous wave function and the variational principle applies at every step, exactly as in standard VMC with variational parameters. To make this explicit we have added a short analytic paragraph in the revised §2.3 proving that the imaginary part vanishes identically, together with a brief numerical check (now included in the main text) confirming that the local-energy variance is consistent with real sampling and that the energy estimator converges without detectable bias.
  Revision: yes
- Referee: [§4.1, Table 1] The energy differences with and without the basis transformation are reported to several decimal places, yet no breakdown is provided of the statistical uncertainty arising from finite Monte Carlo sampling versus the uncertainty in the optimized α itself; this makes it hard to judge whether the quoted improvements exceed the combined error bars.
  Authors: We agree that separating the two sources of uncertainty improves interpretability. The original Table 1 reported Monte Carlo statistical errors obtained via blocking analysis, but did not propagate the uncertainty arising from the finite optimization of α. In the revised manuscript we have updated Table 1 to display combined error bars: the Monte Carlo error is retained, while the uncertainty in α is estimated from the curvature of the energy surface at the optimum and added in quadrature. A new paragraph in §4.1 now describes the full error-propagation procedure. With these changes the reported energy reductions are shown to lie outside the combined uncertainties.
  Revision: yes
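The error treatment described in the second response can be sketched as follows. This is an illustration under assumptions, not the authors' procedure: the blocking routine is a simplified variant that reports the largest standard error seen across block sizes, the α-induced uncertainty is taken as a given input rather than derived from the energy curvature, and all function names and the 2σ threshold are hypothetical.

```python
import math
import random

def standard_error(values):
    """Naive standard error of the mean, assuming independent samples."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return math.sqrt(var / n)

def blocking_error(samples, min_blocks=32):
    """Largest standard error found while repeatedly averaging neighbouring pairs."""
    data = list(samples)
    best = standard_error(data)
    while len(data) // 2 >= min_blocks:
        data = [0.5 * (data[i] + data[i + 1]) for i in range(0, len(data) - 1, 2)]
        best = max(best, standard_error(data))
    return best

def combine_in_quadrature(sigma_mc, sigma_alpha):
    """Combined error bar from two independent uncertainty sources."""
    return math.hypot(sigma_mc, sigma_alpha)

def exceeds_error_bar(delta_e, sigma_with, sigma_without, n_sigma=2.0):
    """True if an energy difference is larger than n_sigma combined errors."""
    return abs(delta_e) > n_sigma * math.hypot(sigma_with, sigma_without)

def correlated_series(n, rho=0.9, seed=1):
    """Synthetic autocorrelated data standing in for a local-energy trace."""
    rng = random.Random(seed)
    series, value = [], 0.0
    for _ in range(n):
        value = rho * value + rng.gauss(0.0, 1.0)
        series.append(value)
    return series

if __name__ == "__main__":
    trace = correlated_series(8192)
    print("naive error   :", standard_error(trace))
    print("blocking error:", blocking_error(trace))
```

On autocorrelated samples the naive standard error is too small, which is why a blocked estimate is the relevant Monte Carlo input before any α-related term is added in quadrature.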
Circularity Check
No significant circularity detected
The paper's central construction introduces an explicit Gaussian basis transformation with one additional variational parameter α that is optimized jointly with the neural-network weights inside the standard NNVMC loop. The energy lowering follows directly from the enlarged variational manifold while the physical Hamiltonian and the variational upper-bound property remain unchanged; the coordinate change and chain-rule kinetic-energy evaluation are given explicitly. No load-bearing self-citation, self-definitional reparameterization, or fitted quantity renamed as a prediction appears in the derivation. The benchmark results on the 3D HEG are therefore independent evidence rather than a tautology.
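The chain-rule kinetic-energy evaluation mentioned here can be checked numerically on a one-dimensional stand-in. The sketch below does not reproduce the paper's Eq. (12), which is not shown on this page; it assumes a wave function ψ(x) = g(x) f(x) with a real Gaussian prefactor g(x) = exp(-αx²) and a hypothetical smooth real factor f(x) = 1 + 0.3 cos x in place of a neural network. It compares the product-rule expression for -½ ψ''/ψ against a central finite difference.

```python
import math

def stand_in_network(x):
    """Hypothetical smooth, real-valued factor standing in for a neural ansatz."""
    return 1.0 + 0.3 * math.cos(x)

def stand_in_d1(x):
    """First derivative of the stand-in factor."""
    return -0.3 * math.sin(x)

def stand_in_d2(x):
    """Second derivative of the stand-in factor."""
    return -0.3 * math.cos(x)

def psi(x, alpha):
    """Gaussian prefactor times the stand-in factor."""
    return math.exp(-alpha * x * x) * stand_in_network(x)

def kinetic_chain_rule(x, alpha):
    """-1/2 psi''/psi from the product rule psi'' = g''f + 2 g'f' + g f''."""
    f = stand_in_network(x)
    g_ratio_d1 = -2.0 * alpha * x                       # g'/g
    g_ratio_d2 = 4.0 * alpha ** 2 * x * x - 2.0 * alpha  # g''/g
    return -0.5 * (g_ratio_d2
                   + 2.0 * g_ratio_d1 * stand_in_d1(x) / f
                   + stand_in_d2(x) / f)

def kinetic_finite_difference(x, alpha, h=1e-3):
    """Same quantity from a central second difference of psi."""
    second = (psi(x + h, alpha) - 2.0 * psi(x, alpha) + psi(x - h, alpha)) / (h * h)
    return -0.5 * second / psi(x, alpha)

if __name__ == "__main__":
    for x in (-1.5, -0.3, 0.0, 0.7, 2.0):
        print(x, kinetic_chain_rule(x, 0.4), kinetic_finite_difference(x, 0.4))
```

Every term in the product-rule expression is real when both factors are real, so the kinetic contribution to the local energy is real in this toy; the same kind of spot check is what the first referee comment asks for in the full setting.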
Axiom & Free-Parameter Ledger
free parameters (1)
- α (the learnable locality parameter of the Gaussian basis)
axioms (1)
- Domain assumption: the transformed wave function remains a valid variational ansatz whose energy is an upper bound to the true ground-state energy.