pith. machine review for the scientific record.

arXiv:2605.07980 · v1 · submitted 2026-05-08 · 💻 cs.LG · cond-mat.stat-mech · math.ST · stat.TH

Recognition: 2 theorem links · Lean Theorem

Susceptibilities and Patterning: A Primer on Linear Response in Bayesian Learning

Authors on Pith: no claims yet

Pith reviewed 2026-05-11 02:06 UTC · model grok-4.3

classification 💻 cs.LG · cond-mat.stat-mech · math.ST · stat.TH
keywords susceptibility matrix · Bayesian neural networks · linear response · patterning problem · influence functions · fluctuation-dissipation · posterior covariance · structural coordinates

The pith

The susceptibility matrix functions as the Jacobian mapping changes in data distributions to shifts in structural coordinates of Bayesian neural networks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops susceptibilities as linear response quantities in Bayesian learning, showing that the derivative of a posterior expectation with respect to a data perturbation equals a posterior covariance via the fluctuation-dissipation theorem. Different observables produce the influence matrix for per-sample effects and the structural susceptibility matrix that associates model components with data patterns. This matrix equals the Jacobian (scaled by nβ) of the map from data distributions to structural coordinates. Its pseudo-inverse then yields a linearized method for the patterning task of selecting data perturbations that induce a target structural change. The approach grounds interpretation of neural networks in the geometry of the loss landscape and supplies empirical estimators for practical use.

Core claim

Susceptibility of an observable φ to a data perturbation is defined as the derivative of its posterior expectation; by the fluctuation-dissipation theorem this equals the corresponding posterior covariance. When φ is chosen as a per-sample loss, the result is the influence matrix, while component-localized observables produce the structural susceptibility matrix. The latter matrix is proportional to the Jacobian of the map from data distributions to structural coordinates, and its pseudo-inverse supplies a first-order solution to the patterning problem of finding data perturbations that realize a prescribed structural shift.
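The covariance identity is easy to check numerically. The sketch below (an editorial illustration, not code from the paper) verifies on a one-dimensional toy posterior that the derivative of a posterior expectation with respect to a tilt in the direction of a per-sample loss matches −nβ times the posterior covariance; the specific losses, observable, prior, and sign convention are all illustrative assumptions.

```python
# Numerical check of the fluctuation-dissipation identity on a 1-D toy posterior:
# for p_eps(w) ∝ exp(-n*beta*(L_n(w) + eps*l_j(w))) * prior(w),
#     d/d(eps) E_eps[phi] |_{eps=0}  =  -n*beta * Cov_0(phi, l_j).
# The loss, per-sample loss, observable, and prior below are illustrative choices,
# and the sign convention is one common one; none of this is taken from the paper.
import numpy as np

n, beta = 100, 1.0
w = np.linspace(-6.0, 6.0, 20001)                    # parameter grid for quadrature

L_n = lambda w: 0.5 * (w - 1.0) ** 2                 # toy empirical loss
l_j = lambda w: 0.5 * (w - 2.0) ** 2                 # toy per-sample loss (perturbation direction)
phi = lambda w: np.tanh(w)                           # toy observable

def expectation(f, eps):
    """E_eps[f] under the tilted posterior, by summation on the grid."""
    logp = -n * beta * (L_n(w) + eps * l_j(w)) - 0.5 * w ** 2   # standard normal prior
    p = np.exp(logp - logp.max())
    p /= p.sum()
    return float(np.sum(f(w) * p))

eps = 1e-4
derivative = (expectation(phi, eps) - expectation(phi, -eps)) / (2 * eps)
covariance = (expectation(lambda v: phi(v) * l_j(v), 0.0)
              - expectation(phi, 0.0) * expectation(l_j, 0.0))

print(derivative, -n * beta * covariance)            # the two values should agree closely
```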

What carries the argument

The susceptibility matrix, obtained either as the derivative of a posterior expectation or as a posterior covariance, serves as the Jacobian between data distributions and structural coordinates.

If this is right

  • Empirical estimators for susceptibilities can be computed from posterior samples without additional model training.
  • The influence matrix recovers the Bayesian influence function as a special case.
  • Structural susceptibilities pair individual model components with specific data patterns through covariance terms.
  • Pseudo-inverse application gives an explicit linear formula for data perturbations that target desired structural adjustments (a minimal sketch follows this list).
  • The construction connects posterior geometry to the loss landscape via linear response.
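A minimal sketch of the pseudo-inverse step, assuming a structural susceptibility matrix has already been estimated. The array shapes, the random stand-in for the matrix, and the convention that the susceptibility equals nβ times the Jacobian are illustrative assumptions and should be checked against the paper.

```python
# Linearized patterning via the pseudo-inverse. `chi` stands in for an estimated
# structural susceptibility matrix (k structural coordinates x m data directions);
# here it is random, purely to make the algebra runnable.
import numpy as np

rng = np.random.default_rng(0)
n, beta = 1000, 1.0
k, m = 8, 50
chi = rng.normal(size=(k, m))                  # placeholder for an estimated matrix

jacobian = chi / (n * beta)                    # data distribution -> structural coordinates,
                                               # using the abstract's "up to a factor of n*beta"
delta_sigma = np.zeros(k)
delta_sigma[0] = 0.1                           # target shift in one structural coordinate

# Minimum-norm data perturbation realizing the target shift under the linear map.
delta_q = np.linalg.pinv(jacobian) @ delta_sigma

print(np.allclose(jacobian @ delta_q, delta_sigma))   # first-order consistency check
```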

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same framework could be tested on non-neural models such as Bayesian linear regression to check whether the Jacobian interpretation holds outside deep networks.
  • One could compare the linearized patterning solutions against full nonlinear optimization of data perturbations on held-out tasks to measure the range of validity.
  • Structural susceptibilities might be used to generate targeted data augmentations that steer component activations in deployed models.
  • The approach suggests a route to sensitivity analysis in continual learning settings where data distributions shift over time.

Load-bearing premise

The fluctuation-dissipation theorem applies directly to the posterior distribution arising in Bayesian neural network training, and the structural coordinates are well-defined and depend differentiably on the data distribution.

What would settle it

For a small Bayesian neural network, compute the empirical susceptibility matrix and its pseudo-inverse, apply the predicted data perturbation, and verify whether the observed change in structural coordinates matches the first-order prediction within the linear regime.
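A self-contained version of this experiment can be run even without a neural network. The sketch below uses Bayesian linear regression (where the weighted posterior is Gaussian and exact), takes the posterior means of the weights as stand-in structural coordinates, estimates the susceptibility matrix from posterior samples, and compares the pseudo-inverse prediction against the exact posterior shift. The model, the choice of observables, the sign convention χ = −β Cov(w, ℓ), and all sizes are illustrative assumptions, not the paper's setup.

```python
# End-to-end check of linearized patterning on Bayesian linear regression with
# per-sample data weights rho. Structural coordinates here are simply E[w].
import numpy as np

rng = np.random.default_rng(0)
n, d, beta = 40, 2, 2.0
X = rng.normal(size=(n, d))
y = X @ np.array([1.0, -0.5]) + 0.25 * rng.normal(size=n)

def posterior(rho):
    """Mean and covariance of the Gaussian posterior ∝ exp(-beta * sum_i rho_i * l_i(w)) * N(0, I)."""
    precision = beta * (X.T * rho) @ X + np.eye(d)
    cov = np.linalg.inv(precision)
    return cov @ (beta * X.T @ (rho * y)), cov

rho0 = np.ones(n)
mu0, cov0 = posterior(rho0)
samples = rng.multivariate_normal(mu0, cov0, size=200_000)       # posterior samples of w

# Per-sample losses l_j(w) = 0.5 * (y_j - x_j . w)^2 for every posterior sample.
losses = 0.5 * (y[None, :] - samples @ X.T) ** 2                 # (S, n)

# Empirical susceptibility chi[a, j] = -beta * Cov(w_a, l_j)  (one sign convention).
wc = samples - samples.mean(axis=0)
lc = losses - losses.mean(axis=0)
chi = -beta * wc.T @ lc / (len(samples) - 1)                     # (d, n)

target = np.array([0.02, 0.0])                                   # desired structural shift
drho = np.linalg.pinv(chi) @ target                              # predicted data perturbation

observed = posterior(rho0 + drho)[0] - mu0                       # exact shift in E[w]
print("target  :", target)
print("observed:", observed)     # should agree to first order for a small target shift
```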

Figures

Figures reproduced from arXiv:2605.07980 by Chris Elliott and Daniel Murfet.

Figure 1. The 2D Ising model on a 20 × 20 lattice with periodic boundary conditions. Left three panels: sample configurations drawn from the Boltzmann distribution at three values of β (blue = spin +1, red = spin −1). At β = 0.10 (high temperature), the spins are disordered; near the critical point βc ≈ 0.44, large correlated domains appear; at β = 0.70 (low temperature), essentially all spins are aligned. Right pan…
Figure 2. Susceptibility of left and right halves to a probe spin in the left half, as a function of …
Figure 3. Left: lattice layout showing three regions (A, B, C) separated by a horizontal wall on the right half. Stars mark probe spin positions. Right: the response matrix χpα = Covβ[sp, Mα] at β = 0.44. • Every probe couples most strongly to the magnetization of its own region (χpα ≈ 25–41). This is the basic signal: the susceptibility identifies which region each probe belongs to. • Probe A1, positioned in the to…
Figure 4. Diagrams expressing the terms that contribute to Cov…
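Figures 2 and 3 estimate susceptibilities of the form χpα = Covβ[sp, Mα] on the 2D Ising model. A minimal Metropolis sketch of that quantity follows; the probe site, sweep counts, and the simple left/right-half regions are assumptions, and the three-region wall geometry of Figure 3 is not reproduced.

```python
# Metropolis sampling of the 20x20 periodic Ising model at beta = 0.44 and the
# covariance of a probe spin with the magnetization of the left and right halves,
# i.e. chi_{p,alpha} = Cov_beta[s_p, M_alpha] for alpha in {left, right}.
import numpy as np

rng = np.random.default_rng(0)
L, beta, sweeps, burn = 20, 0.44, 10_000, 1_000      # sweep counts are illustrative
s = rng.choice([-1, 1], size=(L, L))
probe = (5, 5)                                       # a probe site in the left half

def sweep(s):
    """One Metropolis sweep over random sites with periodic boundaries."""
    for _ in range(L * L):
        i, j = rng.integers(L), rng.integers(L)
        nb = s[(i + 1) % L, j] + s[(i - 1) % L, j] + s[i, (j + 1) % L] + s[i, (j - 1) % L]
        dE = 2.0 * s[i, j] * nb                      # energy change of flipping s[i, j]
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            s[i, j] *= -1

records = []
for t in range(sweeps):
    sweep(s)
    if t >= burn:
        records.append((s[probe], s[:, : L // 2].sum(), s[:, L // 2 :].sum()))

sp, M_left, M_right = np.array(records, dtype=float).T
print("chi(probe, left half)  =", np.cov(sp, M_left)[0, 1])
print("chi(probe, right half) =", np.cov(sp, M_right)[0, 1])
```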
read the original abstract

These notes introduce the theory of susceptibilities as developed in [arXiv:2504.18274, arXiv:2601.12703] for interpreting neural networks. The susceptibility of an observable $\phi$ to a data perturbation is defined as a derivative of a posterior expectation, which by the fluctuation--dissipation theorem equals a posterior covariance. Different choices of $\phi$ yield different objects: per-sample losses give the influence matrix (the Bayesian influence function of [arXiv:2509.26544]), while component-localized observables give the structural susceptibility matrix that pairs model components with data patterns. The susceptibility matrix is (up to a factor of $n\beta$) the Jacobian of the map from data distributions to structural coordinates; its pseudo-inverse provides a linearized solution to the patterning problem of [arXiv:2601.13548]: finding data perturbations that produce a desired structural change. We motivate the theory from its statistical-mechanical foundations, then give a detailed exposition of susceptibilities, their empirical estimators, and their connection to the geometry of the loss landscape.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper is a primer introducing the theory of susceptibilities for interpreting Bayesian neural networks, developed from prior works. It defines the susceptibility of an observable φ to data perturbations as the derivative of its posterior expectation under the Gibbs posterior; by the fluctuation-dissipation theorem this equals a posterior covariance. Per-sample losses yield the influence matrix (Bayesian influence function), while component-localized observables yield the structural susceptibility matrix. The latter is (up to a factor of nβ) the Jacobian of the map from data distributions to structural coordinates; its pseudo-inverse supplies a linearized solution to the patterning problem of finding data perturbations that induce a desired structural change. The exposition motivates the framework from statistical-mechanical foundations, details empirical estimators, and connects the objects to loss-landscape geometry.

Significance. If the identifications hold, the work supplies a coherent linear-response framework that links Bayesian posteriors over neural-network parameters to interpretable structural coordinates via data perturbations. The explicit connection between the susceptibility matrix and the Jacobian of the data-to-structure map, together with the pseudo-inverse patterning operator, offers a concrete computational handle on how changes in the training distribution affect model components. The provision of empirical estimators and the grounding in loss-landscape geometry are practical strengths that could support downstream applications in model auditing and data design.

minor comments (3)
  1. The abstract states that the susceptibility matrix is the Jacobian 'up to a factor of nβ' but does not define n or β at that point; a parenthetical reminder of their meanings (sample size and inverse temperature) would improve readability for readers who begin with the abstract.
  2. The patterning problem is referenced via arXiv:2601.13548 without a one-sentence recap of its precise formulation; adding a brief inline definition would make the claim about the pseudo-inverse self-contained.
  3. Empirical estimators are mentioned in the abstract and presumably detailed later; a short pseudocode block or explicit formula for the Monte-Carlo estimator of the structural susceptibility matrix would help readers implement the method (one possible form is sketched below).
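To make minor comment 3 concrete, one plausible form of such a Monte-Carlo estimator, assuming $S$ posterior samples $w_1,\dots,w_S$, component-localized observables $\phi_\alpha$, and per-sample losses $\ell_j$ (the sign and the $n\beta$ scaling follow one common linear-response convention and should be checked against the paper's definitions):

$$\widehat{\chi}_{\alpha j} \;=\; -\,n\beta\,\widehat{\operatorname{Cov}}\!\left(\phi_\alpha,\ \ell_j\right) \;=\; -\,\frac{n\beta}{S-1}\sum_{s=1}^{S}\Bigl(\phi_\alpha(w_s)-\bar{\phi}_\alpha\Bigr)\Bigl(\ell_j(w_s)-\bar{\ell}_j\Bigr).$$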

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the careful and accurate summary of the manuscript, the positive assessment of its significance, and the recommendation for minor revision. No specific major comments were raised in the report.

Circularity Check

1 step flagged

Susceptibility-to-Jacobian identification is definitional by construction

specific steps
  1. self-definitional [Abstract]
    "The susceptibility of an observable ϕ to a data perturbation is defined as a derivative of a posterior expectation, which by the fluctuation--dissipation theorem equals a posterior covariance. ... component-localized observables give the structural susceptibility matrix ... The susceptibility matrix is (up to a factor of nβ) the Jacobian of the map from data distributions to structural coordinates"

    Susceptibility is defined precisely as the indicated derivative of a posterior expectation under data perturbation. Structural coordinates are the posterior expectations of the component-localized observables. The matrix of these derivatives is therefore the Jacobian matrix by definition, rendering the stated equivalence tautological rather than a non-trivial result obtained from the statistical-mechanical motivation or loss-landscape geometry.

full rationale

The paper is an expository primer on concepts from prior self-authored works. Its central claim equates the susceptibility matrix to the Jacobian of the data-to-structural map. This reduces directly to the paper's own definition of susceptibility as the derivative of a posterior expectation (with structural coordinates arising from the same observables), plus invocation of the fluctuation-dissipation theorem. No independent derivation or external benchmark is supplied for this equivalence within the present manuscript; the identification holds by the definitional setup rather than as a derived prediction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

The framework introduces new matrices and concepts building on statistical mechanics and previous papers.

axioms (1)
  • domain assumption: the fluctuation-dissipation theorem holds for the posterior distribution in Bayesian learning
    Invoked to equate derivative to covariance.
invented entities (2)
  • structural susceptibility matrix (no independent evidence)
    purpose: Pairs model components with data patterns
    Introduced as a new object from choosing component-localized observables.
  • structural coordinates (no independent evidence)
    purpose: Coordinates in the space of model structures
    Used in the map from data distributions.

pith-pipeline@v0.9.0 · 5492 in / 1285 out tokens · 38410 ms · 2026-05-11T02:06:36.892443+00:00 · methodology


Reference graph

Works this paper leans on

33 extracted references · 33 canonical work pages

  1. [1] G. Wang, J. Hoogland, S. van Wingerden, Z. Furman, and D. Murfet. Differentiation and specialization of attention heads via the refined local learning coefficient. arXiv:2410.02984, 2024.
  2. [2] G. Baker, G. Wang, J. Hoogland, and D. Murfet. Structural inference: Interpreting small language models with susceptibilities. arXiv:2504.18274, 2025.
  3. [3] G. Wang, G. Baker, A. Gordon, and D. Murfet. Embryology of a language model. arXiv:2508.00331, 2025.
  4. [4] A. Gordon, G. Baker, G. Wang, W. Snell, S. van Wingerden, and D. Murfet. Towards spectroscopy: Susceptibility clusters in language models. arXiv:2601.12703, 2026.
  5. [5] A. Gordon, R. Hitchcock, and D. Murfet. Interpreting the Ising model. https://timaeus.co/research/2026-04-21-spectroscopy-ising, 2026.
  6. [6] E. Lau, Z. Furman, G. Wang, D. Murfet, and S. Wei. The local learning coefficient: A singularity-aware complexity measure. arXiv:2308.12108, 2023.
  7. [7] B. Gerraty and D. Murfet. Expectations and the exceptional divisor. In preparation, 2026.
  8. [8] G. Wang and D. Murfet. Patterning: The dual of interpretability. arXiv:2601.13548, 2026.
  9. [9] D. Gromoll and W. Meyer. On differentiable functions with isolated critical points. Topology, 8:361–369, 1969.
  10. [10] S. Watanabe. Almost all learning machines are singular. In IEEE Symposium on Foundations of Computational Intelligence (FOCI 2007), pages 383–388, 2007.
  11. [11] S. Watanabe. Algebraic Geometry and Statistical Learning Theory. Cambridge University Press, 2009.
  12. [12] S. Watanabe. A widely applicable Bayesian information criterion. Journal of Machine Learning Research, 14:867–897, 2013.
  13. [13] H. B. Callen. Thermodynamics and an Introduction to Thermostatistics. John Wiley & Sons, 2nd edition, 1985.
  14. [14] R. Kubo. The fluctuation-dissipation theorem. Reports on Progress in Physics, 29(1):255–284, 1966.
  15. [15] P. S. Laplace. Memoir on the probability of the causes of events. Statistical Science, 1(3):364–378, 1986. Translation by S. M. Stigler of Mémoire sur la probabilité des causes par les évènements (1774).
  16. [16] C. M. Bender and S. A. Orszag. Advanced Mathematical Methods for Scientists and Engineers. Springer, 1999.
  17. [17] L. Tierney and J. B. Kadane. Accurate approximations for posterior moments and marginal densities. Journal of the American Statistical Association, 81(393):82–86, 1986.
  18. [18] M. Welling and Y. W. Teh. Bayesian learning via stochastic gradient Langevin dynamics. In Proceedings of the 28th International Conference on Machine Learning, pages 681–688, 2011.
  19. [19] J. Hoogland, G. Wang, M. Farrugia-Roberts, L. Carroll, S. Wei, and D. Murfet. Loss landscape degeneracy and stagewise development in transformers. arXiv:2402.02364, 2024.
  20. [20] P. A. Kreer, W. Wu, M. Adam, Z. Furman, and J. Hoogland. Bayesian influence functions for Hessian-free data attribution. arXiv:2509.26544, 2025.
  21. [21] Z. Furman, G. Wang, and D. Murfet. The loss kernel: A geometric probe for deep learning interpretability. In preparation, 2026.
  22. [22] C. Elliott and D. Murfet. Linear response estimators for singular statistical models. arXiv preprint, 2026.
  23. [23] R. Giordano, T. Broderick, and M. I. Jordan. Covariances, robustness, and variational Bayes. Journal of Machine Learning Research, 19(51):1–49, 2018.
  24. [24] R. Giordano and T. Broderick. The Bayesian infinitesimal jackknife for variance. arXiv:2305.06466, 2024.
  25. [25] Y. Iba. W-kernel and its principal space for frequentist evaluation of Bayesian estimators. arXiv:2311.13017, 2025.
  26. [26] R. Penrose. On best approximate solutions of linear matrix equations. Mathematical Proceedings of the Cambridge Philosophical Society, 52(1):17–19, 1956.
  27. [27] F. R. Hampel. The influence curve and its role in robust estimation. Journal of the American Statistical Association, 69(346):383–393, 1974.
  28. [28] P. Gustafson. Local sensitivity of posterior expectations. The Annals of Statistics, 24:174–195, 1996.
  29. [29] R. Salazar, W. Troiani, B. Snikkers, and D. Murfet. Susceptibilities for Turing machines. In preparation, 2026.
  30. [30] R. E. Kass, L. Tierney, and J. B. Kadane. The validity of posterior expansions based on Laplace's method. In S. Geisser, J. S. Hodges, S. J. Press, and A. Zellner, editors, Bayesian and Likelihood Methods in Statistics and Econometrics: Essays in Honor of George A. Barnard, pages 473–488. North-Holland, Amsterdam, 1990.
  31. [31] R. Wong. Asymptotic Approximations of Integrals. SIAM, Philadelphia, classics edition, 2001.
  32. [32] L. Isserlis. On a formula for the product-moment coefficient of any order of a normal frequency distribution in any number of variables. Biometrika, 12(1/2):134–139, 1918.
  33. [33] Z. Shun and P. McCullagh. Laplace approximation of high dimensional integrals. Journal of the Royal Statistical Society, Series B (Methodological), 57(4):749–760, 1995.