When Do Local Score Models Extrapolate Across Size? A Diagnostic Theory and Benchmark
Pith reviewed 2026-06-27 16:55 UTC · model grok-4.3
The pith
Local score models extrapolate across sizes only when their receptive field covers the quasi-locality range of the Gaussian-smoothed score.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Stable extrapolation is governed by the quasi-locality of the Gaussian-smoothed score: a local model succeeds only if its receptive field covers the smoothed score's response range, formalized by a size-uniform comparison theorem for local marginals under reverse diffusion. Through Tweedie's formula, far-away perturbations can influence local score components via posterior covariance. Under spatial mixing, the smoothed score remains quasi-local relative to the receptive field, enabling stable extrapolation; when spatial mixing weakens, the score's locality rapidly degrades and size transfer fails.
What carries the argument
Quasi-locality of the Gaussian-smoothed score, which sets the required receptive-field size through posterior covariance in Tweedie's formula.
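For orientation, the standard first- and second-order Tweedie identities can be written out as below. The notation is an assumption made here, not taken from the paper: y = x + \sigma\varepsilon with \varepsilon \sim N(0, I), p_\sigma is the density of y, and s_\sigma is its score. The second identity is the one that ties the response of score component i to a perturbation at site j to the posterior covariance between sites i and j.

```latex
% Assumed notation (not from the paper): y = x + \sigma \varepsilon,
% \varepsilon \sim \mathcal{N}(0, I), p_\sigma the density of y.
\begin{align}
  s_\sigma(y) &:= \nabla_y \log p_\sigma(y)
               = \frac{\mathbb{E}[x \mid y] - y}{\sigma^{2}}, \\
  \frac{\partial s_{\sigma,i}(y)}{\partial y_j}
              &= \frac{\operatorname{Cov}(x_i, x_j \mid y) - \sigma^{2}\,\delta_{ij}}{\sigma^{4}}.
\end{align}
```

Read this way, the smoothed score is quasi-local exactly when the posterior covariance between distant sites is small.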
If this is right
- Under sufficient spatial mixing the smoothed score stays quasi-local relative to any fixed receptive field, permitting stable size extrapolation.
- Weakening spatial mixing causes the smoothed score's effective range to grow, so models with fixed receptive fields lose extrapolation capability.
- The Finite-Depth Local Flow construction supplies exact scores, densities, and tunable response ranges for isolating the quasi-locality mechanism.
- Architectural translation invariance is necessary but not sufficient; receptive-field width must also match the quasi-local range.
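The link between mixing and response range in the bullets above can be illustrated on a toy model. The sketch below is not the paper's FDLF benchmark: it assumes a Gaussian AR(1) chain with covariance rho**|i - j|, for which the smoothed score is linear and its Jacobian is available in closed form. The function names and the tolerance-based definition of "response range" are hypothetical choices made for this illustration.

```python
import numpy as np

def ar1_covariance(n, rho):
    """Toy data covariance: Sigma[i, j] = rho**|i - j| (stationary AR(1) chain)."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def smoothed_score_jacobian(cov, sigma):
    """Jacobian of the score of N(0, cov + sigma^2 I); constant in y for Gaussians."""
    return -np.linalg.inv(cov + sigma**2 * np.eye(cov.shape[0]))

def posterior_covariance(cov, sigma):
    """Cov[x | y] for x ~ N(0, cov) and y = x + sigma * standard normal noise."""
    noisy = cov + sigma**2 * np.eye(cov.shape[0])
    return cov - cov @ np.linalg.solve(noisy, cov)

def response_range(jac, tol=1e-3):
    """Assumed definition: smallest radius r such that entries of the centre row
    farther than r carry at most a fraction `tol` of the row's absolute weight."""
    n = jac.shape[0]
    centre = n // 2
    row = np.abs(jac[centre])
    dist = np.abs(np.arange(n) - centre)
    total = row.sum()
    for r in range(centre + 1):
        if row[dist > r].sum() <= tol * total:
            return r
    return centre

if __name__ == "__main__":
    sigma = 1.0
    for rho in (0.5, 0.9, 0.99):  # rho closer to 1 means weaker mixing
        cov = ar1_covariance(257, rho)
        jac = smoothed_score_jacobian(cov, sigma)
        # Second-order Tweedie identity: Jacobian = (Cov[x|y] - sigma^2 I) / sigma^4
        tweedie = (posterior_covariance(cov, sigma) - sigma**2 * np.eye(257)) / sigma**4
        print(rho, response_range(jac), np.allclose(jac, tweedie))
```

In this toy setting the measured range grows as rho approaches 1 and, for a fixed rho, does not change when the chain is lengthened, which is the size-independence the argument relies on.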
Where Pith is reading between the lines
- Design of scalable models for physical systems should first estimate the mixing length and choose receptive-field widths accordingly, rather than relying on invariance alone.
- The same quasi-locality diagnostic could be applied to other generative settings that assume local score or density approximations.
- Controlled benchmarks with tunable mixing could be used to test whether the size-uniform comparison theorem holds for discrete or graph-structured data.
Load-bearing premise
The influence of far-away perturbations on local score components occurs exclusively through posterior covariance as given by Tweedie's formula, and spatial mixing properties remain stationary enough for the quasi-locality range to be well-defined independently of system size.
What would settle it
An experiment in which a local model whose receptive field is smaller than the measured response range of the smoothed score still produces accurate extrapolation on systems where spatial mixing is deliberately weakened.
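The comparison this experiment calls for can be prototyped on a toy problem. The sketch below is an assumption-laden stand-in, not the paper's protocol: it uses a Gaussian AR(1) chain (covariance rho**|i - j|), takes the "local model" to be the exact small-system score stencil truncated to a fixed radius, and measures how well that stencil reproduces the exact score Jacobian of a larger chain. All function names and default sizes are hypothetical.

```python
import numpy as np

def exact_jacobian(n, rho, sigma):
    """Exact smoothed-score Jacobian for x ~ N(0, Sigma), Sigma[i, j] = rho**|i - j|."""
    idx = np.arange(n)
    cov = rho ** np.abs(idx[:, None] - idx[None, :])
    return -np.linalg.inv(cov + sigma**2 * np.eye(n))

def banded_from_stencil(stencil, n):
    """Translation-invariant local model: reuse one stencil on every row."""
    radius = (len(stencil) - 1) // 2
    idx = np.arange(n)
    offsets = idx[None, :] - idx[:, None]  # offsets[i, j] = j - i
    mask = np.abs(offsets) <= radius
    mat = np.zeros((n, n))
    mat[mask] = stencil[offsets[mask] + radius]
    return mat

def transfer_error(rho, radius, n_train=65, n_test=257, sigma=1.0):
    """Relative Frobenius error of the truncated small-system stencil on interior
    rows of the larger system. Requires radius <= n_train // 2."""
    small = exact_jacobian(n_train, rho, sigma)
    centre = n_train // 2
    stencil = small[centre, centre - radius : centre + radius + 1]
    local = banded_from_stencil(stencil, n_test)
    exact = exact_jacobian(n_test, rho, sigma)
    margin = n_test // 4  # skip rows near the boundary of the large chain
    rows = slice(margin, n_test - margin)
    return np.linalg.norm(local[rows] - exact[rows]) / np.linalg.norm(exact[rows])

if __name__ == "__main__":
    for rho in (0.5, 0.99):       # strong vs. weak mixing
        for radius in (8, 24):    # receptive-field radius of the local model
            print(rho, radius, transfer_error(rho, radius))
```

In this toy, a fixed radius that transfers almost exactly under strong mixing leaves a much larger error once rho approaches 1. A counterexample of the kind described above would have to show the opposite pattern on a system with deliberately weakened mixing.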
Original abstract
Scientific generative modeling often requires size transfer, where models trained on small systems are evaluated on larger ones. While translation-invariant architectures enable this evaluation, we show that architectural locality alone does not guarantee stable size extrapolation. Instead, stable extrapolation is governed by the quasi-locality of the Gaussian-smoothed score. Through Tweedie's formula, far-away perturbations can influence local score components via posterior covariance, meaning a local model succeeds only if its receptive field covers the smoothed score's response range. We formalize this mechanism, proving a size-uniform comparison theorem for local marginals under reverse diffusion. We also introduce Finite-Depth Local Flow (FDLF), a white-box diagnostic benchmark with exact scores, densities, and controllable response ranges. Empirically, we validate the interplay between spatial mixing, smoothed-score quasi-locality, and model receptive fields. Under spatial mixing, the smoothed score remains quasi-local relative to the receptive field, enabling stable extrapolation. Conversely, when spatial mixing weakens, the score's locality rapidly degrades, causing size transfer to fail.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that stable size extrapolation in local score-based generative models is governed by the quasi-locality of the Gaussian-smoothed score (via Tweedie's formula and posterior covariance) rather than architectural locality alone. It proves a size-uniform comparison theorem for local marginals under reverse diffusion, introduces the white-box Finite-Depth Local Flow (FDLF) benchmark with exact scores/densities and controllable response ranges, and empirically validates that stable extrapolation occurs under spatial mixing but fails when mixing weakens.
Significance. If the theorem holds with non-vacuous assumptions and the FDLF benchmark isolates the claimed mechanism, the work supplies a useful diagnostic theory and reproducible testbed for size transfer in scientific diffusion models. The exact, controllable quantities in FDLF are a clear strength for falsifiability and reproducibility.
major comments (1)
- [Abstract / size-uniform comparison theorem] The size-uniform comparison theorem (abstract) routes all far-field influence exclusively through posterior covariance and requires spatial mixing properties to remain stationary enough for the quasi-locality range to be independent of system size. This assumption is load-bearing; the paper should explicitly delineate the regimes (e.g., fixed vs. diverging correlation length) where it holds and verify it does not fail in the FDLF construction.
minor comments (1)
- The abstract refers to the theorem and empirical validation on FDLF but does not provide section or equation numbers for the full derivation or error analysis, making it difficult to confirm that the theorem's assumptions are satisfied in the benchmark.
Simulated Author's Rebuttal
We thank the referee for the constructive comment on the size-uniform comparison theorem. The observation correctly identifies that the theorem's applicability depends on spatial mixing properties remaining stationary with system size. We will revise the manuscript to explicitly delineate the relevant regimes and confirm the FDLF construction satisfies the required conditions.
- Referee: [Abstract / size-uniform comparison theorem] The size-uniform comparison theorem (abstract) routes all far-field influence exclusively through posterior covariance and requires spatial mixing properties to remain stationary enough for the quasi-locality range to be independent of system size. This assumption is load-bearing; the paper should explicitly delineate the regimes (e.g., fixed vs. diverging correlation length) where it holds and verify it does not fail in the FDLF construction.
Authors: The theorem indeed channels far-field effects solely through the posterior covariance (via Tweedie's formula) and requires that the correlation structure of the data distribution yields a quasi-locality range independent of system size. This holds in the regime where the correlation length remains fixed (or grows sub-linearly) as system size grows, which is the setting we consider throughout the paper and in the FDLF benchmark. In the revised manuscript we will add a dedicated paragraph in Section 3 that (i) states the stationarity assumption on the mixing properties, (ii) contrasts the fixed-correlation-length regime (where the theorem applies) with the diverging-correlation-length regime (where quasi-locality may degrade), and (iii) verifies that every FDLF instance is constructed with a fixed finite interaction depth, ensuring the correlation length does not diverge with system size. This clarification does not alter the theorem statement but makes its scope explicit.
Revision: yes
Circularity Check
No circularity; central theorem derived from standard Tweedie's formula with independent benchmark
The paper's derivation chain relies on Tweedie's formula (a standard statistical identity) to relate far-field perturbations to local scores via posterior covariance, then proves a new size-uniform comparison theorem for local marginals under reverse diffusion. The Finite-Depth Local Flow benchmark is presented as white-box with exact scores and densities, providing an independent diagnostic. No self-citations are load-bearing for the uniqueness or validity of the theorem, no parameters are fitted and then renamed as predictions, and no ansatz or known result is smuggled in via citation. The argument is therefore self-contained against external mathematical facts rather than reducing to its own inputs by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- [standard math] Tweedie's formula relating the score to posterior covariance under Gaussian smoothing