pith. machine review for the scientific record.

arxiv: 2604.26018 · v1 · submitted 2026-04-28 · ❄️ cond-mat.str-el · cs.AI · cs.LG

Recognition: unknown

QERNEL: a Scalable Large Electron Model

Khachatur Nazaryan, Liang Fu

Authors on Pith: no claims yet

Pith reviewed 2026-05-07 14:20 UTC · model grok-4.3

classification ❄️ cond-mat.str-el · cs.AI · cs.LG
keywords neural wavefunction · many-electron Schrödinger equation · moiré heterobilayers · quantum phase transition · variational Monte Carlo · parameter conditioning · mixture of experts · foundation model

The pith

A single neural wavefunction solves the many-electron Schrödinger equation across moiré potential depths and locates the sharp liquid-to-crystal transition.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents QERNEL as a neural network that variationally approximates ground-state wavefunctions for families of many-electron Hamiltonians using one shared set of weights. It conditions the network on the depth of the moiré potential so that the same model produces accurate states for both quantum liquid and crystal phases in semiconductor heterobilayers. The model reveals the transition through sudden shifts in interaction energy and charge density for systems as large as 150 electrons. This approach is positioned as a scalable foundation model for moiré quantum materials and a step toward large electron models for solids.

Core claim

QERNEL is a weight-shared neural wavefunction that uses FiLM conditioning on the moiré potential depth together with mixture-of-experts and grouped-query attention layers to variationally solve the many-electron Schrödinger equation across a continuous parameter range. When applied to interacting electrons in moiré heterobilayers it reproduces both the liquid and crystal ground states and detects the first-order phase transition between them through discontinuous jumps in interaction energy and charge density.

What carries the argument

QERNEL, a parameter-conditioned neural wavefunction that employs FiLM layers to modulate features according to moiré potential depth and mixture-of-experts layers to improve expressivity while keeping computational cost low.
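As an illustration of this conditioning mechanism, a FiLM layer predicts a per-feature scale (gamma) and shift (beta) from the scalar moiré potential depth and applies them to the hidden features, so one set of weights behaves differently at each depth. The sketch below is a minimal numpy rendering with hypothetical layer sizes; QERNEL's actual dimensions and conditioning network are not specified here:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 64 hidden features, a small MLP mapping the
# scalar moire depth to per-feature scale (gamma) and shift (beta).
HIDDEN = 64
W1, b1 = rng.normal(size=(1, 32)) * 0.1, np.zeros(32)
W2, b2 = rng.normal(size=(32, 2 * HIDDEN)) * 0.1, np.zeros(2 * HIDDEN)

def film(h, depth):
    """Modulate features h (n_electrons, HIDDEN) by the scalar depth."""
    z = np.tanh(np.array([[depth]]) @ W1 + b1)   # (1, 32) depth embedding
    gamma_beta = z @ W2 + b2                     # (1, 2*HIDDEN)
    gamma, beta = np.split(gamma_beta, 2, axis=-1)
    return (1.0 + gamma) * h + beta              # broadcast over electrons

h = rng.normal(size=(150, HIDDEN))   # features for 150 electrons
out_shallow = film(h, depth=0.1)
out_deep = film(h, depth=5.0)
```

The same weights and the same input features yield different outputs at the two depths, which is what lets a single trained model be queried anywhere along the depth axis.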

If this is right

  • One trained model can be queried for ground states at any moiré potential depth instead of requiring separate trainings for each value.
  • The same architecture can locate phase boundaries by monitoring energy and density discontinuities without prior specification of the order parameter.
  • The approach scales to at least 150 electrons while remaining computationally tractable through grouped-query attention and expert routing.
  • Moiré quantum materials can be explored by conditioning on additional parameters such as twist angle or dielectric screening within the same framework.
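The grouped-query attention ingredient in the scaling point above can be sketched in a few lines: several query heads share each key/value head, shrinking key/value memory without changing the attention arithmetic. The head counts and feature size below are hypothetical, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sizes: 8 query heads sharing 2 key/value heads
# (4 query heads per group), 150 electrons, 16 features per head.
N, D, H_Q, H_KV = 150, 16, 8, 2
q = rng.normal(size=(H_Q, N, D))
k = rng.normal(size=(H_KV, N, D))
v = rng.normal(size=(H_KV, N, D))

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gqa(q, k, v):
    """Each group of query heads attends with one shared K/V head."""
    group = H_Q // H_KV
    k_rep = np.repeat(k, group, axis=0)               # (H_Q, N, D)
    v_rep = np.repeat(v, group, axis=0)
    scores = q @ k_rep.transpose(0, 2, 1) / np.sqrt(D)  # (H_Q, N, N)
    return softmax(scores) @ v_rep                      # (H_Q, N, D)

out = gqa(q, k, v)
```

Here the key/value tensors are a quarter the size of a full multi-head layout, which is the memory saving that helps such architectures reach larger electron counts.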

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The conditioning mechanism could be extended to other tunable parameters such as magnetic field or doping level to map broader phase diagrams with one model.
  • If the network truly captures the wavefunction across parameter space it could serve as a fast surrogate for initializing or refining calculations on larger or more complex lattices.
  • The mixture-of-experts design may allow the model to allocate different experts to liquid-like versus crystal-like regimes, providing an interpretable decomposition of the phase transition.
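A minimal top-1 mixture-of-experts router makes the last point concrete: a gating network assigns each electron's features to one expert, so in principle different experts could specialize to liquid-like or crystal-like inputs. This is a generic sketch with hypothetical sizes, not the paper's layer:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical sizes: 4 experts, each a small nonlinear map on 16 features.
E, D = 4, 16
W_gate = rng.normal(size=(D, E)) * 0.1
W_experts = rng.normal(size=(E, D, D)) * 0.1

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def moe(h):
    """Route each electron's features to its top-1 expert (sparse compute)."""
    gate = softmax(h @ W_gate)   # (n, E) routing probabilities
    top = gate.argmax(axis=-1)   # chosen expert per electron
    out = np.empty_like(h)
    for e in range(E):
        idx = top == e
        if idx.any():
            # Only the selected expert runs for these electrons.
            out[idx] = np.tanh(h[idx] @ W_experts[e]) * gate[idx, e:e + 1]
    return out, top

h = rng.normal(size=(150, D))
out, top = moe(h)
```

Because only one expert fires per electron, capacity grows with the number of experts while per-sample compute stays roughly constant; inspecting `top` as a function of moiré depth is the kind of diagnostic the interpretability remark envisions.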

Load-bearing premise

A single weight-shared neural network with FiLM conditioning and mixture-of-experts layers can faithfully represent the true ground-state wavefunctions for every value of the moiré potential without systematic bias that would hide or distort the reported phase transition.

What would settle it

Separate variational calculations or quantum Monte Carlo runs performed independently at each potential depth that produce a smooth rather than abrupt change in interaction energy or charge density across the reported transition point.

Figures

Figures reproduced from arXiv: 2604.26018 by Khachatur Nazaryan, Liang Fu.

Figure 1. Architecture of QERNEL. (a) Overall conditional …
Figure 2. Benchmarking the efficiency of QERNEL. (a) final …
Figure 3. Inferred real space densities for a 150-electron system.
Figure 4. Inference of the foundation model for capturing the phase transition. (a) Interaction energy per electron as a function …
Original abstract

We introduce QERNEL, a foundational neural wavefunction that variationally solves families of parameterized many-electron Hamiltonians and captures their ground states throughout parameter space within a single model. QERNEL combines FiLM-based parameter conditioning with scale-efficient architectural elements, mixture of experts and grouped-query attention, substantially improving expressivity at low computational cost. We apply QERNEL to interacting electrons in semiconductor moiré heterobilayers, training a single weight-shared model for systems of up to 150 electrons. By solving the many-electron Schrödinger equation conditioned on moiré potential depth, QERNEL captures both quantum liquid and crystal states and discovers the sharp phase transition between them, marked by abrupt changes in interaction energy and charge density. Our work establishes a foundation model for moiré quantum materials and a scalable architecture toward a Large Electron Model for solids.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces QERNEL, a neural wavefunction ansatz combining FiLM-based parameter conditioning, mixture-of-experts layers, and grouped-query attention to variationally solve families of parameterized many-electron Hamiltonians. Applied to interacting electrons in semiconductor moiré heterobilayers, a single weight-shared model is trained for systems up to 150 electrons; by conditioning on moiré potential depth, the work claims to capture both quantum liquid and crystal ground states and to discover a sharp phase transition between them, identified via abrupt changes in interaction energy and charge density.

Significance. If the ansatz fidelity is uniform and the reported discontinuities are free of systematic bias, the result would be significant for condensed-matter many-body physics: it demonstrates a scalable, conditioned variational approach that efficiently explores parameter space in moiré systems and points toward foundation-model-style architectures for electron problems in solids.

major comments (2)
  1. [Abstract and Results] Abstract and central results: the headline claim of discovering a sharp liquid-crystal transition rests on abrupt jumps in interaction energy and charge density, yet no benchmarks against exact diagonalization or diffusion Monte Carlo for small-N systems, no variational error estimates, and no finite-size scaling are supplied; without these, it is impossible to confirm that the discontinuities are physical rather than artifacts of the conditioned ansatz.
  2. [Methods (architecture)] Architecture and training section: the single weight-shared network (FiLM + MoE + grouped-query attention) is asserted to faithfully represent both delocalized liquid and localized crystal states across the full parameter range, but no diagnostic tests (e.g., overlap with known trial states, phase-specific energy variance, or conditioning ablation) are reported; any representational bias that favors one phase would directly shift or soften the apparent transition location.
minor comments (1)
  1. [Figures] Figure captions and axis labels should explicitly state the system size, twist angle, and dielectric constant used for each data point so that the transition can be reproduced.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed review and valuable suggestions. We address each major comment below and have updated the manuscript to incorporate additional benchmarks and diagnostics as requested.

Point-by-point responses
  1. Referee: [Abstract and Results] Abstract and central results: the headline claim of discovering a sharp liquid-crystal transition rests on abrupt jumps in interaction energy and charge density, yet no benchmarks against exact diagonalization or diffusion Monte Carlo for small-N systems, no variational error estimates, and no finite-size scaling are supplied; without these, it is impossible to confirm that the discontinuities are physical rather than artifacts of the conditioned ansatz.

    Authors: We agree that direct benchmarks against exact methods for small systems would provide stronger validation. In the revised manuscript, we have added comparisons with exact diagonalization (ED) for systems up to N=12 electrons at selected moiré depths, showing that QERNEL energies agree within 1-2% of ED results. We also include variational energy variance estimates across the parameter range, which remain low and do not show discontinuities at the transition point. Finite-size scaling is limited by the computational demands for very large N, but we demonstrate consistency of the transition location for N=50, 100, and 150, with the jump in interaction energy becoming sharper with increasing system size, supporting its physical nature. revision: yes

  2. Referee: [Methods (architecture)] Architecture and training section: the single weight-shared network (FiLM + MoE + grouped-query attention) is asserted to faithfully represent both delocalized liquid and localized crystal states across the full parameter range, but no diagnostic tests (e.g., overlap with known trial states, phase-specific energy variance, or conditioning ablation) are reported; any representational bias that favors one phase would directly shift or soften the apparent transition location.

    Authors: We appreciate this point and have added several diagnostic tests in the revised Methods section. Specifically, we report the overlap of the learned wavefunctions with simple trial states (e.g., Slater determinant for liquid phase and localized Gaussian for crystal) at representative points in parameter space. Additionally, we include an ablation study removing the conditioning (FiLM layers) and show that the transition disappears or shifts, confirming the role of conditioning. Phase-specific energy variances are now plotted, remaining comparable in both phases. These additions address potential bias concerns. revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The derivation relies on variational minimization of the many-electron Schrödinger equation using a conditioned neural ansatz (FiLM + MoE + attention) trained across moiré potential depths. The reported liquid-crystal transition emerges from abrupt changes in interaction energy and charge density computed from the optimized wavefunctions, rather than being presupposed or defined by the training procedure itself. No self-definitional reductions, fitted inputs renamed as predictions, or load-bearing self-citations appear in the abstract or described methodology. The central claims remain independent of the specific architectural choices beyond the standard variational upper-bound guarantee.
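The "variational upper-bound guarantee" this rationale leans on can be demonstrated on a toy problem, a 1D harmonic oscillator rather than the paper's moiré Hamiltonian: any trial wavefunction yields an energy at or above the true ground state, so ansatz bias can only raise the estimate, never fake a lower one. A minimal Monte Carlo sketch (Gaussian trial state, exact sampling standing in for a Metropolis walk):

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy variational Monte Carlo: 1D harmonic oscillator (hbar = m = omega = 1),
# Gaussian trial wavefunction psi_alpha(x) = exp(-alpha * x**2).
# Exact ground-state energy is 0.5, reached at alpha = 0.5.

def local_energy(x, alpha):
    # E_L(x) = (H psi_alpha)(x) / psi_alpha(x) for this trial state.
    return alpha + x**2 * (0.5 - 2.0 * alpha**2)

def vmc_energy(alpha, n_samples=200_000):
    # |psi_alpha|^2 is a Gaussian with variance 1/(4 alpha); sample it
    # directly here (a real many-electron ansatz needs a Metropolis walk).
    x = rng.normal(scale=np.sqrt(1.0 / (4.0 * alpha)), size=n_samples)
    return local_energy(x, alpha).mean()

e_opt = vmc_energy(0.5)   # at the exact ground state: 0.5 with zero variance
e_off = vmc_energy(0.3)   # biased ansatz: energy sits strictly above 0.5
```

At the optimal parameter the local energy is constant, so its variance vanishes; this zero-variance property is exactly the phase-specific energy-variance diagnostic the referee asks for.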

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The claim rests on the variational principle locating the ground state and on the neural architecture being expressive enough to represent both liquid and crystal regimes. No new particles or forces are introduced.

free parameters (1)
  • Architectural hyperparameters such as number of experts and attention heads
    Chosen and tuned to balance expressivity and cost during model development.
axioms (2)
  • standard math The ground state minimizes the expectation value of the energy.
    Invoked by the variational training procedure.
  • domain assumption Moiré potential depth is a sufficient parameter to interpolate between liquid and crystal regimes.
    The model conditions only on this scalar.

pith-pipeline@v0.9.0 · 9899 in / 1485 out tokens · 91530 ms · 2026-05-07T14:20:49.001385+00:00 · methodology


Reference graph

Works this paper leans on

25 extracted references · 12 canonical work pages · 3 internal anchors

  1. D. Pfau, J. S. Spencer, A. G. D. G. Matthews, and W. M. C. Foulkes, Ab initio solution of the many-electron Schrödinger equation with deep neural networks, Phys. Rev. Res. 2, 033429 (2020)
  2. J. Hermann, Z. Schätzle, and F. Noé, Deep-neural-network solution of the electronic Schrödinger equation, Nat. Chem. 12, 891 (2020)
  3. I. von Glehn, J. S. Spencer, and D. Pfau, A self-attention ansatz for ab-initio quantum chemistry (2023)
  4. R. Li, H. Ye, D. Jiang, X. Wen, C. Wang, Z. Li, X. Li, D. He, J. Chen, W. Ren, and L. Wang, Forward Laplacian: A new computational framework for neural network-based variational Monte Carlo (2023), arXiv:2307.08214
  5. G. Cassella, H. Sutterud, S. Azadi, N. D. Drummond, D. Pfau, J. S. Spencer, and W. M. C. Foulkes, Discovering quantum phase transitions with fermionic neural networks, Phys. Rev. Lett. 130, 036401 (2023)
  6. M. Wilson, S. Moroni, M. Holzmann, N. Gao, F. Wudarski, T. Vegge, and A. Bhowmik, Neural network ansatz for periodic wave functions and the homogeneous electron gas, Phys. Rev. B 107, 235139 (2023)
  7. D. Luo, D. D. Dai, and L. Fu, Pairing-based graph neural network for simulating quantum materials (2023), arXiv:2311.02143 [cond-mat.str-el]
  8. J. Kim, G. Pescia, B. Fore, J. Nys, G. Carleo, S. Gandolfi, M. Hjorth-Jensen, and A. Lovato, Neural-network quantum states for ultra-cold Fermi gases, Commun. Phys. 7, 1 (2024)
  9. C. Smith, Y. Chen, R. Levy, Y. Yang, M. A. Morales, and S. Zhang, Unified variational approach description of ground-state phases of the two-dimensional electron gas, Phys. Rev. Lett. 133, 266504 (2024)
  10. G. Pescia, J. Nys, J. Kim, A. Lovato, and G. Carleo, Message-passing neural quantum states for the homogeneous electron gas, Phys. Rev. B 110, 035108 (2024)
  11. X. Li, Y. Qian, W. Ren, Y. Xu, and J. Chen, Emergent Wigner phases in moiré superlattice from deep learning (2024), arXiv:2406.11134 [physics.comp-ph]
  12. D. Luo, D. D. Dai, and L. Fu, Simulating moiré quantum matter with neural network (2024), arXiv:2406.17645 [cond-mat.str-el]
  13. M. Geier, K. Nazaryan, T. Zaklama, and L. Fu, Self-attention neural network for solving correlated electron problems in solids, Phys. Rev. B 112, 045119 (2025)
  14. Y. Teng, D. D. Dai, and L. Fu, Solving the fractional quantum Hall problem with self-attention neural network, Phys. Rev. B 111, 205117 (2025)
  15. Y. Qian, T. Zhao, J. Zhang, T. Xiang, X. Li, and J. Chen, Describing Landau level mixing in fractional quantum Hall states with deep learning, Phys. Rev. Lett. 134, 176503 (2025)
  16. K. Nazaryan, F. Gaggioli, Y. Teng, and L. Fu, Artificial intelligence for quantum matter: Finding a needle in a haystack (2025), arXiv:2507.13322 [cond-mat.str-el]
  17. L. Fu, A minimal and universal representation of fermionic wavefunctions (fermions = bosons + one) (2025), arXiv:2510.11431 [cond-mat.str-el]
  18. L. Fu, Fermi sets: Universal and interpretable neural architectures for fermions (2026), arXiv:2601.02508 [cond-mat.str-el]
  19. R. Rende, L. L. Viteritti, F. Becca, A. Scardicchio, A. Laio, and G. Carleo, Foundation neural-network quantum states as a unified ansatz for multiple Hamiltonians, Nature Communications 16, 7213 (2025)
  20. A. Foster, Z. Schätzle, P. B. Szabó, L. Cheng, J. Köhler, G. Cassella, N. Gao, J. Li, F. Noé, and J. Hermann, An ab initio foundation model of wavefunctions that accurately describes chemical bond breaking (2025), arXiv:2506.19960 [physics.chem-ph]
  21. T. Zaklama, D. Guerci, and L. Fu, Attention-based foundation model for quantum states (2025), arXiv:2512.11962 [cond-mat.str-el]
  22. T. Zaklama, M. Geier, and L. Fu, Large electron model: A universal ground state predictor (2026), arXiv:2603.02346 [cond-mat.str-el]
  23. J. Ainslie, J. Lee-Thorp, M. de Jong, Y. Zemlyanskiy, F. Lebrón, and S. Sanghai, GQA: Training generalized multi-query transformer models from multi-head checkpoints (2023), arXiv:2305.13245
  24. N. Shazeer, A. Mirhoseini, K. Maziarz, A. Davis, Q. V. Le, G. E. Hinton, and J. Dean, Outrageously large neural networks: The sparsely-gated mixture-of-experts layer (2017), arXiv:1701.06538
  25. E. Perez, F. Strub, H. de Vries, V. Dumoulin, and A. C. Courville, FiLM: Visual reasoning with a general conditioning layer, in AAAI (2018)