Autonomous Emergence of Hamiltonian in Deep Generative Models
Pith reviewed 2026-05-09 22:17 UTC · model grok-4.3
The pith
Deep generative models recover the Hamiltonian of a spin glass from equilibrium data without physical priors.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By establishing an exact equivalence between the zero-noise limit of the Riemannian diffusion score field and the thermodynamic restoring force, the trained neural network serves as a direct force estimator. Applying overdetermined linear inversion to the force estimates from an O(3)-equivariant attention architecture, trained solely on equilibrium configurations of a sequence-dependent frustrated 1D O(3) spin glass, recovers the microscopic Hamiltonian parameters with 99.7% cosine similarity to ground truth. These sparse parameters alone account for 87% of the variance in the continuous force field.
What carries the argument
The exact equivalence between the zero-noise limit of a Riemannian diffusion score field and the thermodynamic restoring force, which turns the generative network into a force estimator for Hamiltonian recovery via linear inversion.
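The inversion step can be illustrated on a toy chain. The sketch below assumes a Heisenberg-style form H = sum_i J_i s_i · s_(i+1) with unit spins, and uses exact tangent-projected forces as a stand-in for the network's score-derived force estimates; the Hamiltonian form, sizes, and variable names are illustrative assumptions, not taken from the manuscript.

```python
import numpy as np

rng = np.random.default_rng(0)
N, C = 16, 200                      # spins per chain, sampled configurations
J_true = rng.normal(size=N - 1)     # sequence-dependent couplings (assumed form)

def tangent_project(F, S):
    # remove the radial component at each spin: F - (F.s) s
    return F - np.sum(F * S, axis=-1, keepdims=True) * S

def forces(S, J):
    # H = sum_i J_i s_i . s_{i+1}; Euclidean force on spin i is
    # -(J_{i-1} s_{i-1} + J_i s_{i+1}), projected to the sphere's tangent space
    F = np.zeros_like(S)
    F[:, :-1] -= J[None, :, None] * S[:, 1:]
    F[:, 1:]  -= J[None, :, None] * S[:, :-1]
    return tangent_project(F, S)

# random unit spins as stand-in "equilibrium" configurations
S = rng.normal(size=(C, N, 3))
S /= np.linalg.norm(S, axis=-1, keepdims=True)

# design matrix: each projected force component is linear in the couplings
A = np.zeros((C, N, 3, N - 1))
for j in range(N - 1):
    e = np.zeros(N - 1); e[j] = 1.0
    A[..., j] = forces(S, e)
A = A.reshape(-1, N - 1)            # (C*N*3) equations vs N-1 unknowns
b = forces(S, J_true).reshape(-1)   # stand-in for network force estimates

J_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
cos = J_hat @ J_true / (np.linalg.norm(J_hat) * np.linalg.norm(J_true))
print(round(cos, 4))                # prints 1.0 for noiseless forces
```

With 9,600 force equations against 15 unknowns, the system is strongly overdetermined; the interesting question in the paper is how much of this recovery survives when the forces come from a learned score rather than the exact gradient.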
If this is right
- The recovered parameters alone explain 87% of the variance in the network's predicted continuous force field.
- Generative models can internalize the microscopic physical rules from data without any energetic priors supplied.
- The recovery succeeds on a sequence-dependent frustrated 1D O(3) spin glass using only thermal equilibrium snapshots.
- An overdetermined linear inversion applied to the network's force estimates is sufficient to extract the interaction parameters.
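A minimal sketch of what the 87% variance-explained figure measures, using toy stand-in quantities: fit the sparse coupling basis to a force field that also contains out-of-basis components, then compute the coefficient of determination R². All names and sizes here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
M, P = 2000, 12
A = rng.normal(size=(M, P))          # sparse local coupling basis (stand-in)
J = rng.normal(size=P)
resid = rng.normal(size=M)           # force components outside the basis
F_net = A @ J + 1.2 * resid          # toy "network-predicted" force field

J_hat, *_ = np.linalg.lstsq(A, F_net, rcond=None)
ss_res = np.sum((F_net - A @ J_hat) ** 2)   # unexplained by sparse parameters
ss_tot = np.sum((F_net - F_net.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
print(round(r2, 2))
```

An R² below 100% is expected whenever the network's force field carries structure outside the sparse basis; the claim in the paper is that this residual fraction is small (13%), not zero.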
Where Pith is reading between the lines
- The same inversion technique could be applied to experimental snapshots to infer unknown Hamiltonians in other many-body systems.
- The result offers a concrete test for whether other generative architectures have learned dynamics rather than surface statistics.
- If the recovered parameters reproduce measurable thermodynamic quantities such as specific heat or correlation functions, the claim gains independent support.
Load-bearing premise
The zero-noise limit of the Riemannian diffusion score field exactly equals the thermodynamic restoring force for the spin system.
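The Euclidean analogue of this premise is the standard identity grad log p = −β grad H for a Boltzmann density p ∝ exp(−βH); the paper's version is the Riemannian, zero-noise-limit counterpart with tangent-space projections. A finite-difference check on a toy quadratic Hamiltonian (all quantities illustrative):

```python
import numpy as np

# For p(x) ∝ exp(-beta * H(x)), the score d/dx log p equals -beta * dH/dx.
beta, k = 2.0, 3.0
H = lambda x: 0.5 * k * x**2
log_p = lambda x: -beta * H(x)       # unnormalized; constants drop under the gradient

x, h = 0.7, 1e-5
score_fd = (log_p(x + h) - log_p(x - h)) / (2 * h)   # finite-difference score
force = -k * x                                        # -dH/dx
print(np.isclose(score_fd, beta * force, atol=1e-6))  # prints True
```

The load-bearing question for the referee is whether this identity survives exactly on the constrained O(3) manifold as the diffusion noise is taken to zero, not whether it holds in the flat case.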
What would settle it
If the Hamiltonian parameters obtained from the linear inversion do not closely match the known ground-truth interactions when tested on new spin configurations, the equivalence and recovery claim would be falsified.
Original abstract
The unprecedented predictive success of deep generative models in complex many-body systems, such as AlphaFold3, raises an epistemological question: do these networks merely memorize data distributions via high-dimensional interpolation, or do they autonomously deduce the underlying physical laws? To address this, we introduce a rigorous algebraic framework to extract the implicit physical interactions learned by generative models. By establishing an exact equivalence between the zero-noise limit of a Riemannian diffusion score field and the thermodynamic restoring force, we utilize the trained neural network as a direct force estimator. Applying this framework to a sequence-dependent, frustrated 1D $O(3)$ spin glass, we probe the latent representations of an $O(3)$-equivariant attention architecture trained solely on thermal equilibrium snapshots. Without incorporating any energetic priors, an overdetermined linear inversion successfully recovers the microscopic Hamiltonian parameters of the spin system. The inferred Hamiltonian parameters exhibit a $99.7\%$ cosine similarity with the ground-truth interaction parameters. Furthermore, these sparse local parameters alone are sufficient to explain $87\%$ of the variance in the continuous force field predicted by the network. Our results provide quantitative, falsifiable evidence that deep generative architectures do not merely perform statistical pattern matching, but autonomously discover and internalize the underlying physical rules.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that an O(3)-equivariant attention network trained solely on thermal equilibrium snapshots of a sequence-dependent frustrated 1D O(3) spin glass autonomously discovers the underlying Hamiltonian. It establishes an algebraic equivalence between the zero-noise limit of the Riemannian diffusion score field and the thermodynamic restoring force −∇H, allowing the network to serve as a direct force estimator. An overdetermined linear inversion then recovers the microscopic couplings J with 99.7% cosine similarity to ground truth and shows that these sparse parameters explain 87% of the variance in the network-predicted force field, all without energetic priors.
Significance. If the zero-noise equivalence holds rigorously, the work supplies quantitative, falsifiable evidence that deep generative models internalize physical laws rather than performing high-dimensional interpolation. The specific numerical agreement (99.7% similarity, 87% variance) and the parameter-free linear inversion are strengths that would make the result noteworthy for interpretability studies in machine learning for condensed-matter systems.
major comments (2)
- [Algebraic framework / Methods (equivalence derivation)] The central claim rests on an exact equivalence between the zero-noise limit of the Riemannian diffusion score and the mean restoring force −∇H (abstract). This mapping must be derived explicitly for the constrained O(3) manifold, including tangent-space projections and any discretization or finite-noise residuals; without a robustness check, the linear inversion may recover an effective rather than microscopic Hamiltonian.
- [Results (linear inversion and variance analysis)] Table or figure reporting the 99.7% cosine similarity and 87% variance explained: these quantities lack error bars, cross-validation details, or sensitivity tests to training-distribution mismatch. Because the network is trained on data generated by the target Hamiltonian, the high agreement could arise from distribution matching rather than autonomous rule discovery.
minor comments (2)
- [Methods] Clarify the precise definition of the overdetermined linear system (number of equations vs. unknowns) and the regularization (if any) used in the inversion.
- [Abstract] The abstract states the architecture is O(3)-equivariant but does not specify the attention mechanism or network depth; a short methods paragraph would aid reproducibility.
Simulated Author's Rebuttal
We thank the referee for their careful reading and constructive comments, which have helped clarify several aspects of our work. We respond point by point to the major comments below.
Point-by-point responses
Referee: [Algebraic framework / Methods (equivalence derivation)] The central claim rests on an exact equivalence between the zero-noise limit of the Riemannian diffusion score and the mean restoring force −∇H (abstract). This mapping must be derived explicitly for the constrained O(3) manifold, including tangent-space projections and any discretization or finite-noise residuals; without a robustness check, the linear inversion may recover an effective rather than microscopic Hamiltonian.
Authors: We appreciate the call for greater explicitness in the derivation. The equivalence follows from the zero-noise limit of the Riemannian score-matching objective on the product manifold of O(3) spheres, where the score is projected onto the tangent space at each spin via the Riemannian metric; this directly yields the mean restoring force −∇H. In the revised manuscript we will expand the Methods section to include the full algebraic steps, explicit tangent-space projections, discretization analysis, and finite-noise residuals. We have performed additional robustness checks by varying the noise schedule and manifold constraints; these confirm recovery of the microscopic rather than an effective Hamiltonian and will be reported. revision: yes
Referee: [Results (linear inversion and variance analysis)] Table or figure reporting the 99.7% cosine similarity and 87% variance explained: these quantities lack error bars, cross-validation details, or sensitivity tests to training-distribution mismatch. Because the network is trained on data generated by the target Hamiltonian, the high agreement could arise from distribution matching rather than autonomous rule discovery.
Authors: We agree that statistical robustness should be strengthened. The 99.7% cosine similarity and 87% variance explained are obtained from an overdetermined least-squares inversion of the network-predicted force field against the sparse coupling basis, with no energetic priors supplied during training. In the revision we will add error bars from multiple independent training runs, cross-validation across held-out temperatures, and sensitivity tests using training distributions generated from perturbed Hamiltonians. These additions will demonstrate that the parameter recovery is not reducible to distribution matching alone. revision: yes
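One way the promised error bars could look in practice is a bootstrap over force equations. The sketch below uses a random stand-in design matrix and synthetic noisy force estimates; it is an illustration of the statistical procedure, not the authors' data or code.

```python
import numpy as np

rng = np.random.default_rng(1)
P, M = 15, 3000                      # unknown couplings, force-equation count
J_true = rng.normal(size=P)
A = rng.normal(size=(M, P))          # stand-in design matrix (force basis)
b = A @ J_true + 0.1 * rng.normal(size=M)   # noisy "network" force estimates

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# bootstrap over equations to attach an error bar to the similarity score
sims = []
for _ in range(200):
    idx = rng.integers(0, M, size=M)
    J_hat, *_ = np.linalg.lstsq(A[idx], b[idx], rcond=None)
    sims.append(cosine(J_hat, J_true))
print(f"{np.mean(sims):.4f} +/- {np.std(sims):.4f}")
```

Reporting the similarity this way (mean plus spread over resamples or independent training runs) would directly answer the referee's request for error bars on the 99.7% figure.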
Circularity Check
No significant circularity; derivation is self-contained via standard statistical mechanics and post-hoc fitting.
full rationale
The central chain proceeds as: (1) train the O(3)-equivariant network on equilibrium snapshots to approximate the Riemannian diffusion score; (2) invoke the zero-noise limit where the score equals the thermodynamic force −∇H (a standard Boltzmann relation on the manifold, not redefined in the paper); (3) perform an overdetermined linear solve for the microscopic couplings J from the network-estimated force field at many configurations. None of these steps reduces to its own inputs by construction. The equivalence is externally motivated by equilibrium statistical mechanics rather than asserted as an ansatz or self-definition; the linear inversion is an extraction step whose success is measured against the ground-truth J that generated the training data, but the mapping itself is not tautological. No self-citation is load-bearing, no fitted quantity is renamed as a prediction, and no uniqueness theorem is imported from the authors' prior work. The 99.7% cosine similarity and 87% variance explained are empirical outcomes of a well-trained score approximator, not forced by the framework.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: the zero-noise limit of the Riemannian diffusion score field equals the thermodynamic restoring force.
Reference graph
Works this paper leans on
- [1] J. Abramson, J. Adler, J. Dunger, R. Evans, T. Green, A. Pritzel, O. Ronneberger, L. Willmore, A. J. Ballard, J. Bambrick, S. W. Bodenstein, D. A. Evans, C.-C. Hung, M. O'Neill, D. Reiman, K. Tunyasuvunakool, Z. Wu, A. Žemgulytė, E. Arvaniti, C. Beattie, O. Bertolli, A. Bridgland, A. Cherepanov, M. Congreve, A. I. Cowen-Rivers, A. Cowie, M. Fig... (2024)
- [2] J. Ho, A. Jain, and P. Abbeel, Denoising diffusion probabilistic models (2020)
- [3] Y. Song, J. Sohl-Dickstein, D. P. Kingma, A. Kumar, S. Ermon, and B. Poole, Score-based generative modeling through stochastic differential equations (2020)
- [4] Y. Bahri, J. Kadmon, J. Pennington, S. S. Schoenholz, J. Sohl-Dickstein, and S. Ganguli, Annual Review of Condensed Matter Physics 11, 501–528 (2020)
- [6] S.-M. Udrescu and M. Tegmark, Science Advances 6, eaay2631 (2020). https://doi.org/10.1126/sciadv.aay2631
- [7] E. T. Jaynes, Physical Review 106, 620–630 (1957)
- [8] H. C. Nguyen, R. Zecchina, and J. Berg, Advances in Physics 66, 197–261 (2017)
- [9] D. Ackley, G. Hinton, and T. Sejnowski, Cognitive Science 9, 147–169 (1985)
- [10] A. Hyvärinen, Journal of Machine Learning Research 6, 695 (2005)
- [11] P. Vincent, Neural Computation 23, 1661–1674 (2011)
- [12] J. Sohl-Dickstein, P. B. Battaglino, and M. R. DeWeese, Physical Review Letters 107, 220601 (2011)
- [13] V. De Bortoli, E. Mathieu, M. Hutchinson, J. Thornton, Y. W. Teh, and A. Doucet, Riemannian score-based generative modelling (2022)
- [14] M. Arts, V. Garcia Satorras, C.-W. Huang, D. Zügner, M. Federici, C. Clementi, F. Noé, R. Pinsler, and R. van den Berg, Journal of Chemical Theory and Computation 19, 6151–6159 (2023)
- [15] S. Zaidi, M. Schaarschmidt, J. Martens, H. Kim, Y. W. Teh, A. Sanchez-Gonzalez, P. Battaglia, R. Pascanu, and J. Godwin, Pre-training via denoising for molecular property prediction (2022)
- [16] P. Holderrieth, Y. Xu, and T. Jaakkola, Hamiltonian score matching and generative flows (2024)
- [17] S. Park, in The Thirty-ninth Annual Conference on Neural Information Processing Systems (2026)
- [18] K. Binder and A. P. Young, Reviews of Modern Physics 58, 801–976 (1986)
- [19] S. Bhattacharjee and S.-C. Lee, Testing the spin-bath view of self-attention: A Hamiltonian analysis of GPT-2 transformer (2025)
- [20] A. P. Ramirez, Annual Review of Materials Science 24, 453–480 (1994)
- [21] S. Villar, D. W. Hogg, K. Storey-Fisher, W. Yao, and B. Blum-Smith, in Advances in Neural Information Processing Systems, Vol. 34, edited by M. Ranzato, A. Beygelzimer, Y. Dauphin, P. Liang, and J. W. Vaughan (Curran Associates, Inc., 2021) pp. 28848–28863
- [22] V. G. Satorras, E. Hoogeboom, and M. Welling, E(n) equivariant graph neural networks (2021)
- [23] S. Batzner, A. Musaelian, L. Sun, M. Geiger, J. P. Mailoa, M. Kornbluth, N. Molinari, T. E. Smidt, and B. Kozinsky, Nature Communications 13, 10.1038/s41467-022-29939-5 (2022)
- [24] K. Hukushima and K. Nemoto, Journal of the Physical Society of Japan 65, 1604–1608 (1996)
- [25] F. R. Brown and T. J. Woch, Physical Review Letters 58, 2394–2396 (1987)
- [26] M. Creutz, Physical Review D 36, 515–519 (1987)