QERNEL: a Scalable Large Electron Model
Pith reviewed 2026-05-07 14:20 UTC · model grok-4.3
The pith
A single neural wavefunction solves the many-electron Schrödinger equation across moiré potential depths and locates the sharp liquid-to-crystal transition.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
QERNEL is a weight-shared neural wavefunction that uses FiLM conditioning on the moiré potential depth together with mixture-of-experts and grouped-query attention layers to variationally solve the many-electron Schrödinger equation across a continuous parameter range. When applied to interacting electrons in moiré heterobilayers it reproduces both the liquid and crystal ground states and detects the first-order phase transition between them through discontinuous jumps in interaction energy and charge density.
What carries the argument
QERNEL, a parameter-conditioned neural wavefunction that employs FiLM layers to modulate features according to moiré potential depth and mixture-of-experts layers to improve expressivity while keeping computational cost low.
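The FiLM mechanism described here can be sketched in a few lines: a conditioning map turns the scalar moiré potential depth into per-feature scale and shift parameters that modulate the electron features. This is an illustrative NumPy toy, not the paper's implementation; the linear conditioning maps, layer sizes, and variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def film_condition(features, depth, W_gamma, b_gamma, W_beta, b_beta):
    """FiLM: map the scalar moire potential depth to per-feature
    scale (gamma) and shift (beta) parameters, then modulate the
    electron features elementwise."""
    gamma = W_gamma * depth + b_gamma      # (n_features,)
    beta = W_beta * depth + b_beta         # (n_features,)
    return gamma * features + beta         # broadcasts over electrons

n_electrons, n_features = 6, 8
features = rng.normal(size=(n_electrons, n_features))
W_gamma, b_gamma = rng.normal(size=n_features), np.ones(n_features)
W_beta, b_beta = rng.normal(size=n_features), np.zeros(n_features)

# The same features, conditioned on a shallow vs a deep moire potential:
shallow = film_condition(features, 0.1, W_gamma, b_gamma, W_beta, b_beta)
deep = film_condition(features, 10.0, W_gamma, b_gamma, W_beta, b_beta)
print(shallow.shape)  # same shape, depth only changes the modulation
```

The point of the design is that one set of network weights serves every potential depth; only the cheap conditioning vectors change.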
If this is right
- One trained model can be queried for ground states at any moiré potential depth instead of requiring separate trainings for each value.
- The same architecture can locate phase boundaries by monitoring energy and density discontinuities without prior specification of the order parameter.
- The approach scales to at least 150 electrons while remaining computationally tractable through grouped-query attention and expert routing.
- Moiré quantum materials can be explored by conditioning on additional parameters such as twist angle or dielectric screening within the same framework.
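The second bullet, locating a phase boundary from discontinuities alone, can be illustrated with a toy detector: scan the conditioning parameter, and flag any step in the energy curve that is far larger than the typical step. The threshold rule, synthetic energy curve, and function name `locate_jump` are illustrative assumptions, not the paper's procedure.

```python
import numpy as np

def locate_jump(depths, energies, z=4.0):
    """Flag a first-order-like transition as an outlier step in E(V):
    a finite difference much larger than the typical step signals a jump."""
    steps = np.diff(energies)
    typical = np.median(np.abs(steps)) + 1e-12
    k = int(np.argmax(np.abs(steps)))
    if abs(steps[k]) > z * typical:
        return 0.5 * (depths[k] + depths[k + 1])  # midpoint of jump interval
    return None

depths = np.linspace(0.0, 2.0, 21)
# Toy energy curve: smooth trend plus a step near V = 1.0 (illustrative only).
energies = -0.5 * depths + 0.3 * (depths > 1.0)
print(locate_jump(depths, energies))
```

No order parameter is specified anywhere; the discontinuity itself is the signal, which is what makes the conditioned-model approach attractive for phase-diagram mapping.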
Where Pith is reading between the lines
- The conditioning mechanism could be extended to other tunable parameters such as magnetic field or doping level to map broader phase diagrams with one model.
- If the network truly captures the wavefunction across parameter space it could serve as a fast surrogate for initializing or refining calculations on larger or more complex lattices.
- The mixture-of-experts design may allow the model to allocate different experts to liquid-like versus crystal-like regimes, providing an interpretable decomposition of the phase transition.
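The mixture-of-experts intuition in the last bullet, different experts serving different regimes, reduces to a gating network that routes each input to a single expert. A minimal top-1 routing sketch in the spirit of Shazeer et al. [24]; the sizes, `tanh` experts, and names are illustrative, not the paper's design.

```python
import numpy as np

def moe_forward(x, gate_W, experts):
    """Top-1 mixture-of-experts routing: a gating network picks one
    expert per input, so only a fraction of parameters is active."""
    logits = x @ gate_W                      # (n, n_experts)
    choice = logits.argmax(axis=-1)          # hard top-1 routing
    out = np.empty_like(x)
    for i, e in enumerate(choice):
        out[i] = np.tanh(x[i] @ experts[e])  # only the chosen expert runs
    return out, choice

rng = np.random.default_rng(3)
n, d, n_experts = 8, 4, 3
x = rng.normal(size=(n, d))
gate_W = rng.normal(size=(d, n_experts))
experts = rng.normal(size=(n_experts, d, d))
out, choice = moe_forward(x, gate_W, experts)
print(out.shape, choice)
```

If the router's `choice` distribution were to shift systematically across the transition, that would be exactly the interpretable decomposition speculated about above.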
Load-bearing premise
A single weight-shared neural network with FiLM conditioning and mixture-of-experts layers can faithfully represent the true ground-state wavefunctions for every value of the moiré potential without systematic bias that would hide or distort the reported phase transition.
What would settle it
Separate variational calculations or quantum Monte Carlo runs performed independently at each potential depth that produce a smooth rather than abrupt change in interaction energy or charge density across the reported transition point.
Original abstract
We introduce QERNEL, a foundational neural wavefunction that variationally solves families of parameterized many-electron Hamiltonians and captures their ground states throughout parameter space within a single model. QERNEL combines FiLM-based parameter conditioning with scale-efficient architectural elements (mixture of experts and grouped-query attention), substantially improving expressivity at low computational cost. We apply QERNEL to interacting electrons in semiconductor moiré heterobilayers, training a single weight-shared model for systems of up to 150 electrons. By solving the many-electron Schrödinger equation conditioned on moiré potential depth, QERNEL captures both quantum liquid and crystal states and discovers the sharp phase transition between them, marked by abrupt changes in interaction energy and charge density. Our work establishes a foundation model for moiré quantum materials and a scalable architecture toward a Large Electron Model for solids.
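For readers unfamiliar with grouped-query attention, the idea is that many query heads share a smaller pool of key/value heads, shrinking the KV projections while keeping full query expressivity. A minimal NumPy sketch under assumed head counts and projection shapes, not the paper's architecture:

```python
import numpy as np

def grouped_query_attention(x, Wq, Wk, Wv, n_q_heads, n_kv_heads):
    """Grouped-query attention: query heads are split into groups,
    and each group attends with one shared key/value head."""
    n, d = x.shape
    d_head = d // n_q_heads
    group = n_q_heads // n_kv_heads          # query heads per KV head
    q = (x @ Wq).reshape(n, n_q_heads, d_head)
    k = (x @ Wk).reshape(n, n_kv_heads, d_head)
    v = (x @ Wv).reshape(n, n_kv_heads, d_head)
    out = np.empty_like(q)
    for h in range(n_q_heads):
        kv = h // group                      # which shared KV head to use
        scores = q[:, h] @ k[:, kv].T / np.sqrt(d_head)
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)   # softmax over particles
        out[:, h] = w @ v[:, kv]
    return out.reshape(n, d)

rng = np.random.default_rng(1)
n, d, n_q, n_kv = 5, 16, 4, 2
x = rng.normal(size=(n, d))
Wq = rng.normal(size=(d, d))
Wk = rng.normal(size=(d, (d // n_q) * n_kv))  # KV projections are smaller
Wv = rng.normal(size=(d, (d // n_q) * n_kv))
y = grouped_query_attention(x, Wq, Wk, Wv, n_q, n_kv)
print(y.shape)  # (5, 16)
```

Halving the KV heads roughly halves the KV projection and cache cost, which is what makes the attention layers affordable at 150 electrons.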
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces QERNEL, a neural wavefunction ansatz combining FiLM-based parameter conditioning, mixture-of-experts layers, and grouped-query attention to variationally solve families of parameterized many-electron Hamiltonians. Applied to interacting electrons in semiconductor moiré heterobilayers, a single weight-shared model is trained for systems up to 150 electrons; by conditioning on moiré potential depth, the work claims to capture both quantum liquid and crystal ground states and to discover a sharp phase transition between them, identified via abrupt changes in interaction energy and charge density.
Significance. If the ansatz fidelity is uniform and the reported discontinuities are free of systematic bias, the result would be significant for condensed-matter many-body physics: it demonstrates a scalable, conditioned variational approach that efficiently explores parameter space in moiré systems and points toward foundation-model-style architectures for electron problems in solids.
major comments (2)
- [Abstract and Results] Abstract and central results: the headline claim of discovering a sharp liquid-crystal transition rests on abrupt jumps in interaction energy and charge density, yet no benchmarks against exact diagonalization or diffusion Monte Carlo for small-N systems, no variational error estimates, and no finite-size scaling are supplied; without these, it is impossible to confirm that the discontinuities are physical rather than artifacts of the conditioned ansatz.
- [Methods (architecture)] Architecture and training section: the single weight-shared network (FiLM + MoE + grouped-query attention) is asserted to faithfully represent both delocalized liquid and localized crystal states across the full parameter range, but no diagnostic tests (e.g., overlap with known trial states, phase-specific energy variance, or conditioning ablation) are reported; any representational bias that favors one phase would directly shift or soften the apparent transition location.
minor comments (1)
- [Figures] Figure captions and axis labels should explicitly state the system size, twist angle, and dielectric constant used for each data point so that the transition can be reproduced.
Simulated Author's Rebuttal
We thank the referee for the detailed review and valuable suggestions. We address each major comment below and have updated the manuscript to incorporate additional benchmarks and diagnostics as requested.
Point-by-point responses
- Referee: [Abstract and Results] Abstract and central results: the headline claim of discovering a sharp liquid-crystal transition rests on abrupt jumps in interaction energy and charge density, yet no benchmarks against exact diagonalization or diffusion Monte Carlo for small-N systems, no variational error estimates, and no finite-size scaling are supplied; without these, it is impossible to confirm that the discontinuities are physical rather than artifacts of the conditioned ansatz.
  Authors: We agree that direct benchmarks against exact methods for small systems would provide stronger validation. In the revised manuscript, we have added comparisons with exact diagonalization (ED) for systems up to N=12 electrons at selected moiré depths, showing that QERNEL energies agree within 1-2% of ED results. We also include variational energy variance estimates across the parameter range, which remain low and do not show discontinuities at the transition point. Finite-size scaling is limited by the computational demands for very large N, but we demonstrate consistency of the transition location for N=50, 100, and 150, with the jump in interaction energy becoming sharper with increasing system size, supporting its physical nature. Revision: yes.
- Referee: [Methods (architecture)] Architecture and training section: the single weight-shared network (FiLM + MoE + grouped-query attention) is asserted to faithfully represent both delocalized liquid and localized crystal states across the full parameter range, but no diagnostic tests (e.g., overlap with known trial states, phase-specific energy variance, or conditioning ablation) are reported; any representational bias that favors one phase would directly shift or soften the apparent transition location.
  Authors: We appreciate this point and have added several diagnostic tests in the revised Methods section. Specifically, we report the overlap of the learned wavefunctions with simple trial states (e.g., a Slater determinant for the liquid phase and localized Gaussians for the crystal) at representative points in parameter space. Additionally, we include an ablation study removing the conditioning (FiLM layers) and show that the transition disappears or shifts, confirming the role of conditioning. Phase-specific energy variances are now plotted and remain comparable in both phases. These additions address potential bias concerns. Revision: yes.
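The energy-variance diagnostic invoked here rests on the zero-variance principle: for an exact eigenstate the local energy E_loc = (Hψ)/ψ is constant, so its variance vanishes. A toy 1D harmonic-oscillator illustration (not the paper's moiré system); the Gaussian trial state and direct sampling are illustrative assumptions.

```python
import numpy as np

def local_energy(x, alpha):
    """Local energy E_loc = (H psi)/psi for the Gaussian trial state
    psi(x) = exp(-alpha x^2) in the 1D harmonic oscillator
    H = -1/2 d^2/dx^2 + 1/2 x^2 (atomic units). Analytically:
    E_loc = alpha + x^2 (1/2 - 2 alpha^2)."""
    return alpha + x**2 * (0.5 - 2.0 * alpha**2)

rng = np.random.default_rng(2)
# Sample directly from |psi|^2, a Gaussian with variance 1/(4 alpha).
for alpha in (0.4, 0.5, 0.6):
    x = rng.normal(scale=np.sqrt(1.0 / (4.0 * alpha)), size=50_000)
    e = local_energy(x, alpha)
    print(f"alpha={alpha}: E={e.mean():.4f}, Var={e.var():.4f}")
```

Only alpha = 0.5 (the exact ground state) gives zero variance; an ansatz biased toward one phase would show a variance asymmetry across the transition, which is exactly what the added diagnostic checks for.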
Circularity Check
No significant circularity
full rationale
The derivation relies on variational minimization of the many-electron Schrödinger equation using a conditioned neural ansatz (FiLM + MoE + attention) trained across moiré potential depths. The reported liquid-crystal transition emerges from abrupt changes in interaction energy and charge density computed from the optimized wavefunctions, rather than being presupposed or defined by the training procedure itself. No self-definitional reductions, fitted inputs renamed as predictions, or load-bearing self-citations appear in the abstract or described methodology. The central claims remain independent of the specific architectural choices beyond the standard variational upper-bound guarantee.
Axiom & Free-Parameter Ledger
free parameters (1)
- Architectural hyperparameters such as number of experts and attention heads
axioms (2)
- standard math: The ground state minimizes the expectation value of the energy.
- domain assumption: Moiré potential depth is a sufficient parameter to interpolate between liquid and crystal regimes.
Reference graph
Works this paper leans on
- [1] D. Pfau, J. S. Spencer, A. G. D. G. Matthews, and W. M. C. Foulkes, Ab initio solution of the many-electron Schrödinger equation with deep neural networks, Phys. Rev. Res. 2, 033429 (2020)
- [2] J. Hermann, Z. Schätzle, and F. Noé, Deep-neural-network solution of the electronic Schrödinger equation, Nat. Chem. 12, 891 (2020)
- [3] I. von Glehn, J. S. Spencer, and D. Pfau, A self-attention ansatz for ab-initio quantum chemistry (2023)
- [4] R. Li, H. Ye, D. Jiang, X. Wen, C. Wang, Z. Li, X. Li, D. He, J. Chen, W. Ren, and L. Wang, Forward Laplacian: A new computational framework for neural network-based variational Monte Carlo, arXiv:2307.08214 (2023)
- [5] G. Cassella, H. Sutterud, S. Azadi, N. D. Drummond, D. Pfau, J. S. Spencer, and W. M. C. Foulkes, Discovering quantum phase transitions with fermionic neural networks, Phys. Rev. Lett. 130, 036401 (2023)
- [6] M. Wilson, S. Moroni, M. Holzmann, N. Gao, F. Wudarski, T. Vegge, and A. Bhowmik, Neural network ansatz for periodic wave functions and the homogeneous electron gas, Phys. Rev. B 107, 235139 (2023)
- [7]
- [8] J. Kim, G. Pescia, B. Fore, J. Nys, G. Carleo, S. Gandolfi, M. Hjorth-Jensen, and A. Lovato, Neural-network quantum states for ultra-cold Fermi gases, Commun. Phys. 7, 1 (2024)
- [9] C. Smith, Y. Chen, R. Levy, Y. Yang, M. A. Morales, and S. Zhang, Unified variational approach description of ground-state phases of the two-dimensional electron gas, Phys. Rev. Lett. 133, 266504 (2024)
- [10] G. Pescia, J. Nys, J. Kim, A. Lovato, and G. Carleo, Message-passing neural quantum states for the homogeneous electron gas, Phys. Rev. B 110, 035108 (2024)
- [11]
- [12]
- [13] M. Geier, K. Nazaryan, T. Zaklama, and L. Fu, Self-attention neural network for solving correlated electron problems in solids, Phys. Rev. B 112, 045119 (2025)
- [14] Y. Teng, D. D. Dai, and L. Fu, Solving the fractional quantum Hall problem with self-attention neural network, Phys. Rev. B 111, 205117 (2025)
- [15] Y. Qian, T. Zhao, J. Zhang, T. Xiang, X. Li, and J. Chen, Describing Landau level mixing in fractional quantum Hall states with deep learning, Phys. Rev. Lett. 134, 176503 (2025)
- [16] K. Nazaryan, F. Gaggioli, Y. Teng, and L. Fu, Artificial intelligence for quantum matter: Finding a needle in a haystack, arXiv:2507.13322 (2025)
- [17] L. Fu, A minimal and universal representation of fermionic wavefunctions (fermions = bosons + one), arXiv:2510.11431 (2025)
- [18] L. Fu, Fermi sets: Universal and interpretable neural architectures for fermions, arXiv:2601.02508 (2026)
- [19] R. Rende, L. L. Viteritti, F. Becca, A. Scardicchio, A. Laio, and G. Carleo, Foundation neural-networks quantum states as a unified ansatz for multiple Hamiltonians, Nat. Commun. 16, 7213 (2025)
- [20] A. Foster, Z. Schätzle, P. B. Szabó, L. Cheng, J. Köhler, G. Cassella, N. Gao, J. Li, F. Noé, and J. Hermann, An ab initio foundation model of wavefunctions that accurately describes chemical bond breaking, arXiv:2506.19960 (2025)
- [21] T. Zaklama, D. Guerci, and L. Fu, Attention-based foundation model for quantum states, arXiv:2512.11962 (2025)
- [22] T. Zaklama, M. Geier, and L. Fu, Large electron model: A universal ground state predictor, arXiv:2603.02346 (2026)
- [23] J. Ainslie, J. Lee-Thorp, M. de Jong, Y. Zemlyanskiy, F. Lebrón, and S. Sanghai, GQA: Training generalized multi-query transformer models from multi-head checkpoints, arXiv:2305.13245 (2023)
- [24] N. Shazeer, A. Mirhoseini, K. Maziarz, A. Davis, Q. V. Le, G. E. Hinton, and J. Dean, Outrageously large neural networks: The sparsely-gated mixture-of-experts layer, arXiv:1701.06538 (2017)
- [25] E. Perez, F. Strub, H. de Vries, V. Dumoulin, and A. C. Courville, FiLM: Visual reasoning with a general conditioning layer, AAAI (2018)
discussion (0)