pith. machine review for the scientific record.

arxiv: 2605.08777 · v1 · submitted 2026-05-09 · 📊 stat.ML · cs.LG · math.PR

Recognition: no theorem link

Measuring and Decomposing Mode Separation via the Canonical Diffusion

Ori Meidler, Or Zuk, Shaul Tolkovsky

Pith reviewed 2026-05-12 01:19 UTC · model grok-4.3

classification 📊 stat.ML · cs.LG · math.PR
keywords mode separation · diffusion process · autocovariance · metastability · score-based models · molecular dynamics · Gaussian mixtures · dimensionality reduction

The pith

A canonical reversible diffusion with constant scalar coefficient extracts mode separation from a density's autocovariance matrix.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a way to quantify how sharply a high-dimensional distribution fragments into barrier-separated clusters by running a single intrinsic stochastic process on it. This process is the unique reversible diffusion whose stationary distribution is the given density and whose diffusion coefficient is a constant scalar. Two quantities are read from the process's autocovariance matrix: a scalar SSA that grows with barrier strength, and a set of DA directions ordered by how slowly they decorrelate. The construction needs only samples and a score function, so it scales to pretrained generative models and recovers known slow variables in molecular dynamics without any trajectory data.
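A minimal sketch of that pipeline as we read it, assuming the canonical diffusion takes the overdamped-Langevin form dX = ∇log f(X) dt + √2 dW (the paper's scalar-coefficient normalization may differ): simulate from exact samples of a bimodal density and read off a lag autocorrelation. All names and parameters here are ours, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1D bimodal target: f = 0.5 N(-a, s^2) + 0.5 N(+a, s^2).
# Its score has the closed form d/dx log f = (a*tanh(a*x/s^2) - x) / s^2.
a, s = 2.0, 0.7

def score(x):
    return (a * np.tanh(a * x / s**2) - x) / s**2

# Canonical diffusion as we read the construction (scalar coefficient set
# to 1): dX = score(X) dt + sqrt(2) dW, Euler-Maruyama from exact samples.
n, dt, steps = 5000, 0.01, 400        # total lag tau = 4
x0 = np.where(rng.integers(0, 2, n) == 0, -a, a) + s * rng.standard_normal(n)
x = x0.copy()
for _ in range(steps):
    x += score(x) * dt + np.sqrt(2 * dt) * rng.standard_normal(n)

# Lag-tau autocorrelation: the barrier keeps it high, because the mode
# coordinate decorrelates only on the slow Kramers timescale. SSA, as we
# read it, aggregates squared autocorrelations of this kind.
autocorr = np.mean(x0 * x) / np.var(x0)
```

For a unimodal Gaussian of the same total variance, the analogous autocorrelation at this lag would be markedly lower; the surviving correlation here is carried almost entirely by the unrelaxed mode coordinate.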

Core claim

We measure mode separation through a single stochastic process intrinsic to the density: a unique reversible diffusion with f as its stationary distribution and constant scalar diffusion coefficient. We extract two readouts from its autocovariance matrix: SSA (Sum of Squared Autocorrelations), a scalar barrier-sensitive measure; and DA (Dominant Autocorrelation directions), linear projections ordered by metastability rather than variance. Under an isotropic-Gaussian null, we derive a closed-form spectrum for the empirical autocovariance that generalizes Marchenko-Pastur, with an analytic upper edge that selects the lag at which DA is read off. Both readouts use only samples and a score function, scaling to high dimensions through pretrained score-based generative models via Tweedie's identity.

What carries the argument

The canonical diffusion: the unique reversible diffusion with given stationary density f and constant scalar diffusion coefficient, whose autocovariance matrix supplies SSA and DA.
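Concretely, as we read the construction (a standard object in the Langevin/Fokker–Planck literature; the paper's own scalar normalization of the diffusion matrix may differ), the canonical diffusion and the check that f is its stationary density can be written as:

```latex
% Canonical diffusion with unit scalar coefficient:
dX_t = \nabla \log f(X_t)\,dt + \sqrt{2}\,dW_t .
% Fokker--Planck equation for the law \rho_t of X_t:
\partial_t \rho = -\nabla \cdot \big(\rho\,\nabla \log f\big) + \Delta \rho .
% Substituting \rho = f:
-\nabla \cdot \big(f\,\nabla \log f\big) + \Delta f
  = -\nabla \cdot (\nabla f) + \Delta f = 0 ,
% so f is invariant; reversibility holds because the drift is a gradient.
```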

If this is right

  • In Gaussian mixtures SSA tracks mutual information between the latent mode variable and the observed samples.
  • In SDXL text-to-image outputs the readouts detect compositional structure that differential entropy and PCA both miss.
  • For static samples of alanine dipeptide the DA directions recover the known slow backbone dihedrals.
  • The Marchenko-Pastur-type edge for the null spectrum gives an automatic rule for choosing the lag at which DA is evaluated.
  • Everything runs from samples plus a score function, allowing direct use inside pretrained score-based generative models.
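The score-function route in the last bullet rests on Tweedie's identity, which converts a denoiser's posterior mean into a score estimate. A minimal numerical check on a toy Gaussian model where both sides are known in closed form (the model and names are illustrative, not the paper's code):

```python
import numpy as np

# Tweedie's identity: for x = x0 + sigma*eps with eps ~ N(0, 1),
#   grad log p_sigma(x) = (E[x0 | x] - x) / sigma**2,
# so a trained denoiser yields the noisy score. Toy case: x0 ~ N(0, 1),
# hence p_sigma = N(0, 1 + sigma**2) and everything is analytic.
sigma = 0.5
x = np.linspace(-3.0, 3.0, 7)

posterior_mean = x / (1 + sigma**2)          # E[x0 | x] for this model
tweedie_score = (posterior_mean - x) / sigma**2
exact_score = -x / (1 + sigma**2)            # d/dx log N(x; 0, 1 + sigma^2)
```

The two arrays agree identically, which is the algebraic content of the identity in this Gaussian special case.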

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same construction could be applied to any density for which a score estimator exists, including those learned from unlabeled data in other scientific domains.
  • Because DA orders directions by decorrelation time rather than variance, it may serve as a drop-in replacement for PCA in tasks where slow mixing matters more than spread.
  • The closed-form null spectrum suggests a direct test for whether an empirical density deviates from Gaussianity in its barrier structure.
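A minimal sketch of the DA-versus-PCA contrast in the second bullet, under our reading that DA1 is the top eigenvector of the symmetrized lag autocovariance of the canonical diffusion: a 2D density with a low-variance barrier axis and a high-variance smooth axis, where the two readouts pick different directions. Parameters and the diffusion normalization are our assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# 2D toy: two modes split along axis 0 (low per-mode variance), one wide
# unimodal axis 1. PCA should pick axis 1; a DA-style readout should pick
# the slowly decorrelating barrier axis 0.
a, sx, sy = 3.0, 0.5, 4.0

def score(p):
    x, y = p[:, 0], p[:, 1]
    fx = (a * np.tanh(a * x / sx**2) - x) / sx**2   # bimodal marginal
    fy = -y / sy**2                                 # Gaussian marginal
    return np.column_stack([fx, fy])

n, dt, steps = 4000, 0.01, 1600                     # total lag tau = 16
z = rng.integers(0, 2, n)
p0 = np.column_stack([np.where(z == 0, -a, a) + sx * rng.standard_normal(n),
                      sy * rng.standard_normal(n)])
p = p0.copy()
for _ in range(steps):
    p += score(p) * dt + np.sqrt(2 * dt) * rng.standard_normal((n, 2))

u0, ut = p0 - p0.mean(0), p - p.mean(0)
C0 = u0.T @ u0 / n                    # equal-time covariance
Csym = 0.5 * (u0.T @ ut + ut.T @ u0) / n   # symmetrized lag autocovariance

pc1 = np.linalg.eigh(C0)[1][:, -1]    # PCA: variance-ordered -> axis 1
da1 = np.linalg.eigh(Csym)[1][:, -1]  # DA-style: metastability -> axis 0
```

At this lag the smooth axis has decayed by a factor of about 1/e while the barrier axis retains its between-mode variance, so the eigenvector orderings of C0 and Csym disagree by construction.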

Load-bearing premise

The diffusion defined by reversibility plus constant scalar coefficient is the right intrinsic process whose autocovariance isolates barrier crossing rather than other geometric features.

What would settle it

Apply the method to a known two-mode Gaussian mixture whose mutual information is independently computable; if SSA fails to increase monotonically with the separation parameter while other measures do, the isolation claim is falsified.
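The proposed settling experiment is easy to instrument, because for a symmetric two-component mixture the mutual information I(Z; X) = H(X) − H(X | Z) is computable by one-dimensional quadrature. A sketch (parameter names are ours):

```python
import numpy as np

# Two-mode mixture 0.5 N(-a, s^2) + 0.5 N(+a, s^2). H(X | Z) is the
# entropy of one Gaussian component; H(X) is integrated numerically.
def mutual_information(a, s=1.0):
    x = np.linspace(-a - 8 * s, a + 8 * s, 20001)
    f = 0.5 * (np.exp(-0.5 * ((x + a) / s) ** 2)
               + np.exp(-0.5 * ((x - a) / s) ** 2)) / (s * np.sqrt(2 * np.pi))
    dx = x[1] - x[0]
    h_x = -np.sum(f * np.log(f + 1e-300)) * dx      # differential entropy
    h_x_given_z = 0.5 * np.log(2 * np.pi * np.e * s**2)
    return h_x - h_x_given_z

# MI increases monotonically with separation and saturates at log 2,
# the ceiling H(Z) noted in the paper's GMM experiments.
mis = [mutual_information(a) for a in (0.5, 1.5, 3.0, 6.0)]
```

Any SSA implementation can then be swept against this exact curve; a failure of SSA to increase monotonically while MI does would falsify the isolation claim as stated above.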

Figures

Figures reproduced from arXiv: 2605.08777 by Ori Meidler, Or Zuk, Shaul Tolkovsky.

Figure 1
Figure 1: Two readouts of the canonical diffusion. (a–d) SSA on 10D GMMs (left 2/3): SSA (blue, left axis) and MI (red, right axis) across sweeps in (a) mode separation, (b) isotropic variance at fixed separation, (c) weight asymmetry, (d) mode count. SSA tracks MI in ordering; MI saturates near H(Z) = log k while SSA grows with barrier height. Mean over 10 seeds, ±1 SD. Daggered points (K=4, 5) did not meet the 1/N… view at source ↗
Figure 2
Figure 2: Empirical eigenvalue distribution of Ĉsym(τ) under the isotropic Gaussian null f = N(0, I_d), at γ = 0.25 across τ ∈ {0, 0.5, 2, 10} (left to right); N = 500, single seed. Histogram: empirical; red: analytic density ρ^γ_τ from Theorem 2. Dark-green dotted: λ+(τ); orange dotted: λ+(∞) from Equation (10); the two converge as τ grows. Full 6 × 4 sweep across γ ∈ {0.25, 0.5, 0.75, 1, 2, 5} (including the … view at source ↗
Figure 3
Figure 3: SSA and DA on SDXL generations. Left: Four representative samples per guidance scale for the prompt “a professional photograph of a person” (rows: w ∈ {1, 3, 7.5, 15}). At w = 1 outputs are diverse but weakly prompt-faithful; at w = 3 they become mostly on-prompt with broad stylistic variation; at w = 7.5 they show structured variation across distinct coherent types (highest SSA, … view at source ↗
Figure 4
Figure 4: Ramachandran diagram colored by (a) DA1, (b) TICA1, (c) PC1. DA1 and TICA1 vary … view at source ↗
Figure 5
Figure 5: Scale-dependence of A0 = Id. A bimodal density rescaled by c ∈ {0.5, 1, 2}: the three densities have the same shape but differ in units. Under A0 = Id, SSA varies across the three scalings (Ŝ = 3.80, 16.44, 42.62 for c = 0.5, 1, 2); under the canonical choice A0 = σ²_f Id, SSA is constant at Ŝ = 6.19, as required by similarity invariance (Proposition 1). Why not A0 = Cov_f(X). Full-covariance whitening, … view at source ↗
Figure 6
Figure 6: Free-convolution validation at γ = 0.25. Light-blue histogram: empirical eigenvalue density of Ĉsym(τ) under the isotropic Gaussian null. Red curve: analytic free-convolution density. Red stem at λ = 0 (when present): δ0 atom with mass w0. Dark-green dotted: λ+(τ); orange dotted: λ+(∞) … view at source ↗
Figure 7
Figure 7: Free-convolution validation at γ = 0.5; layout as in … view at source ↗
Figure 8
Figure 8: Free-convolution validation at γ = 0.75; layout as in … view at source ↗
Figure 9
Figure 9: Free-convolution validation at γ = 1; the τ = 0 panel shows the expected MP δ0 atom of mass 1 − 1/γ = 0 at the γ = 1 boundary (no atom); τ > 0 remains atom-free as γ ≤ 2. Layout as in … view at source ↗
Figure 10
Figure 10: Free-convolution validation at γ = 2. The τ = 0 panel has an MP atom of mass 0.5; the τ > 0 panels are non-atomic (the sub-critical γ = 2 boundary), with a sharp continuous peak near zero shown on a log-y axis. Layout as in … view at source ↗
Figure 11
Figure 11: Free-convolution validation at γ = 5. The atom mass drops from 1 − 1/γ = 0.80 at τ = 0 (MP) to 1 − 2/γ = 0.60 at every τ > 0 (free-conv). Layout as in … view at source ↗
Figure 12
Figure 12: Validation under planted ground truth, with … view at source ↗
Figure 13
Figure 13: Same construction as Figure 12, but with per-axis well widths … view at source ↗
Figure 14
Figure 14: 2D illustrations of the sweep parameters used in … view at source ↗
Figure 15
Figure 15: Rank comparison of SSA and MI across 50 random bimodal GMMs in … view at source ↗
Figure 16
Figure 16: Eigenvalue spectrum of Ĉsym(τ*) at τ* = 10,000 ps for alanine dipeptide (d = 45, N = 2,000, KDE bandwidth h = h_Scott/2 = 0.0091). Histogram of all 45 eigenvalues on a symlog scale. Red curve: analytic free-convolution density ρ^γ_{τ*} from Theorem 2 with γ = 45/N and σ²_bulk fit on the bulk after peeling the two leading spikes. Dashed vertical lines: peeled-bulk edges λ±(τ*); orange dotted lines: fu… view at source ↗
Figure 17
Figure 17: Implied timescale convergence and direction sensitivity for TICA on the 45-dimensional pairwise distance featurization. The top two implied timescales have not fully plateaued at lag = 10 frames, indicating that the Markov assumption is not yet satisfied for timescale estimation at this lag. However, the TICA directions at lag = 10 yield the strongest dihedral correlations (TICA1 |r(ψ)| = 0.75; ri… view at source ↗
Figure 18
Figure 18: Joint projection of all N = 2,000 trajectory snapshots onto the (DA1, PC1) plane for the “elderly” prompt at w = 7.5. The two axes are non-collinear (|cos(DA1, PC1)| = 0.514, an angle of 59°; Spearman |ρ| with gender = 0.90): DA1 separates the distribution along the gender axis while PC1 mixes gender with realism and style. The lag τ* = 2,000 is selected as the largest lag at which the leading eigenvalu… view at source ↗
Figure 19
Figure 19: Eigenvalue spectrum of Ĉsym(τ*) for “a portrait of an elderly person” at w = 7.5, τ* = 2,000 (d = 768, N = 200, γ = 3.84, Δt = 0.0025). (a) Empirical histogram (light blue) overlaid with the analytic free-convolution density (orange) at the naive variance σ²_f = 1.12 × 10⁻⁴; the leading spike at λ1 = 9.96 × 10⁻³ (red vertical) sits well above the loose edge λ+(σ²_f) (orange dashed). (b) Same histogr… view at source ↗
Figure 20
Figure 20: Top-5 images at each end of DA1 (top panel, top row = high DA1, bottom row = low DA1) and PC1 (bottom panel, top row = high PC1 = construction cranes, bottom row = low PC1 = birds) for “a photo of a crane” at w = 7.5, τ* = 2,000. Both axes separate the two senses of the prompt: birds at one extreme, construction equipment at the other. Both pass Hartigan’s dip test (p < 10⁻³). … view at source ↗
read the original abstract

Mode separation, namely how sharply a distribution fragments into barrier-separated clusters, is a fundamental geometric property of densities, difficult to quantify in high dimensions. It is structurally distinct from dispersion, yet existing tools fall short: differential entropy rises with spread regardless of fragmentation, PCA orders directions by variance regardless of barriers, and mutual information requires a mixture decomposition one usually does not have. We measure mode separation through a single stochastic process intrinsic to the density: a unique reversible diffusion with $f$ as its stationary distribution and constant scalar diffusion coefficient. We extract two readouts from its autocovariance matrix: SSA (Sum of Squared Autocorrelations), a scalar barrier-sensitive measure; and DA (Dominant Autocorrelation directions), linear projections ordered by metastability rather than variance. Under an isotropic-Gaussian null, we derive a closed-form spectrum for the empirical autocovariance that generalizes Marchenko--Pastur, with an analytic upper edge that selects the lag at which DA is read off. Both readouts use only samples and a score function, scaling to high dimensions through pretrained score-based generative models via Tweedie's identity. We apply our framework to three settings: (i) synthetic Gaussian mixtures, where SSA tracks mutual information; (ii) SDXL text-to-image generations, where SSA and DA capture structure that entropy and PCA miss; and (iii) molecular dynamics of alanine dipeptide, where DA recovers the known slow backbone dihedrals from static samples alone.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper claims that mode separation (barrier-induced fragmentation of a density) can be quantified and decomposed via the autocovariance operator of the unique reversible diffusion having the target density f as stationary measure and constant scalar diffusion coefficient. From this operator it extracts SSA (sum of squared autocorrelations), a scalar barrier-sensitive statistic, and DA (dominant autocorrelation directions), linear projections ordered by metastability. Under an isotropic-Gaussian null it derives a closed-form spectrum for the empirical autocovariance that generalizes the Marchenko-Pastur law, yielding an analytic edge for lag selection. Both readouts require only samples and a score function and are demonstrated on Gaussian mixtures (where SSA tracks mutual information), SDXL text-to-image outputs, and alanine-dipeptide MD trajectories (where DA recovers known slow dihedrals from static samples).

Significance. If the central invariance claim holds, the framework supplies a new, intrinsically defined, parameter-free diagnostic for high-dimensional fragmentation that is orthogonal to variance-based (PCA) and entropy-based measures. The analytic null spectrum, the scaling route through pretrained score models via Tweedie’s identity, and the concrete recovery of known slow modes in alanine dipeptide are notable strengths that would make the method immediately usable for generative-model evaluation and molecular-dynamics analysis.

major comments (3)
  1. [§2.1–2.2] The manuscript asserts that the autocovariance of the chosen canonical diffusion isolates barrier crossing rather than local curvature, anisotropy or higher moments, yet provides no theorem establishing invariance of SSA or the DA spectrum under non-barrier geometric perturbations while retaining sensitivity to barrier height. The empirical checks on Gaussian mixtures and alanine dipeptide are consistent but do not rule out confounding (see the paragraph introducing the canonical process and the subsequent definition of SSA/DA).
  2. [§3.1, Eq. (12)–(15)] The closed-form spectrum under the isotropic-Gaussian null (generalizing Marchenko-Pastur) is stated without the intermediate expansion of the autocovariance operator or the verification that cross terms vanish; it is therefore unclear whether the analytic upper edge for lag selection remains valid once the diffusion is discretized or the score is estimated (see the derivation of the null spectrum).
  3. [§5.3, Figure 7] In the alanine-dipeptide experiment the claim that DA recovers the known slow backbone dihedrals relies on visual inspection of the leading directions; a quantitative comparison (e.g., overlap with the true slow subspace or comparison against a variance-ordered baseline) is missing, weakening the assertion that metastability rather than variance is being recovered.
minor comments (2)
  1. [§2.3 and §4] Notation for the lag parameter and the empirical autocovariance matrix is introduced inconsistently between the theoretical and experimental sections; a single consolidated definition would improve readability.
  2. [Figure 3] The caption of Figure 3 does not state the number of samples or the score-estimation procedure used for the SDXL experiment, making the reported SSA values difficult to reproduce.
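The null-spectrum question in major comment 2 can be probed directly: under f = N(0, I_d) the canonical diffusion (as we read it, with unit scalar coefficient) reduces to an Ornstein–Uhlenbeck process with an exact transition kernel, so the finite-N spreading of the symmetrized lag-autocovariance spectrum around its population value e^{−τ} can be simulated with no discretization error at all. A sketch, with parameters of our choosing:

```python
import numpy as np

rng = np.random.default_rng(2)

# OU transition for dX = -X dt + sqrt(2) dW with stationary N(0, I_d):
#   X_tau = e^{-tau} X_0 + sqrt(1 - e^{-2 tau}) * xi,  xi ~ N(0, I_d).
# Population lag autocovariance: e^{-tau} I_d. At gamma = d/N > 0 the
# empirical symmetrized spectrum spreads around that value (the MP-type
# broadening the paper analyzes) while its mean stays near e^{-tau}.
d, n, tau = 50, 1000, 0.5
x0 = rng.standard_normal((n, d))
xt = np.exp(-tau) * x0 + np.sqrt(1 - np.exp(-2 * tau)) * rng.standard_normal((n, d))

C = x0.T @ xt / n
Csym = 0.5 * (C + C.T)
eigs = np.linalg.eigvalsh(Csym)
```

Repeating this with an Euler–Maruyama trajectory and an estimated score in place of the exact kernel would isolate exactly the discretization and score-error effects the comment asks about.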

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed report. We address each major comment below, providing clarifications on the theoretical construction and indicating revisions that will strengthen the manuscript.

read point-by-point responses
  1. Referee: [§2.1–2.2] The manuscript asserts that the autocovariance of the chosen canonical diffusion isolates barrier crossing rather than local curvature, anisotropy or higher moments, yet provides no theorem establishing invariance of SSA or the DA spectrum under non-barrier geometric perturbations while retaining sensitivity to barrier height. The empirical checks on Gaussian mixtures and alanine dipeptide are consistent but do not rule out confounding (see the paragraph introducing the canonical process and the subsequent definition of SSA/DA).

    Authors: The canonical diffusion is the unique reversible process with stationary density f and constant scalar diffusion coefficient, so its infinitesimal generator is L = ∇log f · ∇ + Δ. The autocovariance operator is therefore completely determined by the score ∇log f, which encodes barrier heights via the underlying potential. We do not assert invariance under arbitrary geometric changes; rather, the constant-diffusion construction removes local anisotropy by design while preserving sensitivity to global mode separation induced by barriers. To make this explicit, we will add a short proposition in §2.2 relating SSA/DA to barrier height under controlled perturbations of local curvature, together with a brief discussion of the scope of the invariance. revision: yes

  2. Referee: [§3.1, Eq. (12)–(15)] The closed-form spectrum under the isotropic-Gaussian null (generalizing Marchenko-Pastur) is stated without the intermediate expansion of the autocovariance operator or the verification that cross terms vanish; it is therefore unclear whether the analytic upper edge for lag selection remains valid once the diffusion is discretized or the score is estimated (see the derivation of the null spectrum).

    Authors: We agree that the derivation was condensed. In the revised version we will move the full expansion of the autocovariance operator and the verification that cross terms vanish to a dedicated appendix. The analytic upper edge is derived for the continuous-time process with exact score; we will add a remark clarifying its status under Euler–Maruyama discretization and score estimation error, supported by additional Monte-Carlo checks that confirm the edge remains a reliable lag selector in the regimes used in the experiments. revision: yes

  3. Referee: [§5.3, Figure 7] In the alanine-dipeptide experiment the claim that DA recovers the known slow backbone dihedrals relies on visual inspection of the leading directions; a quantitative comparison (e.g., overlap with the true slow subspace or comparison against a variance-ordered baseline) is missing, weakening the assertion that metastability rather than variance is being recovered.

    Authors: We accept that a quantitative metric would strengthen the claim. In the revision we will compute the principal angles (or Grassmann distance) between the leading DA directions and the known slow dihedral subspace, and report the same overlap for a variance-ordered PCA baseline on identical data. These numbers, together with an updated Figure 7, will be added to §5.3. revision: yes
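Principal angles between two subspaces, as proposed in the response, reduce to an SVD of the product of orthonormal bases; a minimal NumPy sketch (scipy.linalg.subspace_angles computes the same quantity):

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles (radians, ascending) between col-spans of A and B."""
    qa, _ = np.linalg.qr(A)
    qb, _ = np.linalg.qr(B)
    s = np.linalg.svd(qa.T @ qb, compute_uv=False)
    return np.arccos(np.clip(s, -1.0, 1.0))

# Sanity check on known subspaces of R^4:
# span{e1, e2} vs span{e1, (e2 + e3)/sqrt(2)} -> angles 0 and 45 degrees.
A = np.array([[1, 0], [0, 1], [0, 0], [0, 0]], dtype=float)
B = np.array([[1, 0], [0, 1], [0, 1], [0, 0]], dtype=float)
angles = np.degrees(principal_angles(A, B))
```

Applied to the leading DA directions versus the span of the known slow dihedrals, this yields the quantitative overlap the referee requests, with a PCA baseline evaluated identically.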

Circularity Check

0 steps flagged

No significant circularity; the derivation relies on standard stochastic-process definitions and independent random-matrix results.

full rationale

The central construction selects the unique reversible diffusion with given stationary density f and constant scalar diffusion coefficient; this is a standard, externally defined object in the theory of reversible Markov processes and does not reduce to any fitted quantity or target mode-separation statistic. The null-spectrum derivation is obtained by applying classical Marchenko-Pastur-type analysis to the empirical autocovariance matrix under an isotropic-Gaussian null model, yielding a closed-form edge that is parameter-free with respect to the data and independent of the mode-separation claim. SSA and DA are then read off from the same autocovariance operator; they are not fitted to any external measure of barrier crossing. No load-bearing step invokes a self-citation whose content is itself unverified or whose uniqueness theorem is imported from the same authors without external grounding. The framework is therefore self-contained and checkable against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The framework rests on the existence and uniqueness of a reversible diffusion with constant scalar diffusion coefficient whose stationary distribution is exactly the target density f; this is treated as given rather than derived. No free parameters are explicitly fitted in the abstract description, but the lag choice via the analytic upper edge of the null spectrum is data-dependent. No new particles or forces are postulated.

axioms (1)
  • domain assumption Existence and uniqueness of a reversible diffusion with constant scalar diffusion coefficient having f as stationary distribution
    Invoked in the first sentence of the abstract as the intrinsic process from which SSA and DA are extracted.

pith-pipeline@v0.9.0 · 5569 in / 1401 out tokens · 38719 ms · 2026-05-12T01:19:44.426912+00:00 · methodology


Reference graph

Works this paper leans on

35 extracted references · 35 canonical work pages

  1. [1]

    Diffusion maps. Applied and Computational Harmonic Analysis, 21(1):5–30, 2006

    Ronald R Coifman and Stéphane Lafon. Diffusion maps. Applied and Computational Harmonic Analysis, 21(1):5–30, 2006

  2. [2]

    Diffusion maps, spectral clustering and eigenfunctions of Fokker–Planck operators

    Boaz Nadler, Stéphane Lafon, Ronald R Coifman, and Ioannis G Kevrekidis. Diffusion maps, spectral clustering and eigenfunctions of Fokker–Planck operators. In Advances in Neural Information Processing Systems (NeurIPS), volume 18, 2005

  3. [3]

    The dip test of unimodality. The Annals of Statistics, 13(1):70–84, 1985

    John A Hartigan and Pamela M Hartigan. The dip test of unimodality. The Annals of Statistics, 13(1):70–84, 1985

  4. [4]

    Spectral map: Embedding slow kinetics in collective variables. The Journal of Physical Chemistry Letters, 14(22):5216–5220, 2023

    Jakub Rydzewski. Spectral map: Embedding slow kinetics in collective variables. The Journal of Physical Chemistry Letters, 14(22):5216–5220, 2023

  5. [5]

    Spectral map for slow collective variables, Markovian dynamics, and transition state ensembles. Journal of Chemical Theory and Computation, 20(18):7775–7784, 2024

    Jakub Rydzewski. Spectral map for slow collective variables, Markovian dynamics, and transition state ensembles. Journal of Chemical Theory and Computation, 20(18):7775–7784, 2024

  6. [6]

    Separation of a mixture of independent signals using time delayed correlations. Physical Review Letters, 72(23):3634–3637, 1994

    Lutz Molgedey and Heinz Georg Schuster. Separation of a mixture of independent signals using time delayed correlations. Physical Review Letters, 72(23):3634–3637, 1994

  7. [7]

    Identification of slow molecular order parameters for Markov model construction. The Journal of Chemical Physics, 139(1):015102, 2013

    Guillermo Pérez-Hernández, Fabian Paul, Toni Giorgino, Gianni De Fabritiis, and Frank Noé. Identification of slow molecular order parameters for Markov model construction. The Journal of Chemical Physics, 139(1):015102, 2013

  8. [8]

    Stochastic Processes and Applications: Diffusion Processes, the Fokker–Planck and Langevin Equations

    Grigorios A Pavliotis. Stochastic Processes and Applications: Diffusion Processes, the Fokker–Planck and Langevin Equations, volume 60 of Texts in Applied Mathematics. Springer, 2014

  9. [9]

    B. W. Silverman. Using kernel density estimates to investigate multimodality. Journal of the Royal Statistical Society: Series B (Methodological), 43(1):97–99, 1981

  10. [10]

    Denoising diffusion probabilistic models

    Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. In Advances in Neural Information Processing Systems (NeurIPS), volume 33, pages 6840–6851, 2020

  11. [11]

    Generative modeling by estimating gradients of the data distribution

    Yang Song and Stefano Ermon. Generative modeling by estimating gradients of the data distribution. In Advances in Neural Information Processing Systems (NeurIPS), volume 32, 2019

  12. [12]

    Score-based generative modeling through stochastic differential equations

    Yang Song, Jascha Sohl-Dickstein, Diederik P Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. In International Conference on Learning Representations (ICLR), 2021

  13. [13]

    Flow matching for generative modeling

    Yaron Lipman, Ricky T Q Chen, Heli Ben-Hamu, Maximilian Nickel, and Matt Le. Flow matching for generative modeling. In International Conference on Learning Representations (ICLR), 2023

  14. [14]

    High-resolution image synthesis with latent diffusion models

    Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 10684–10695, 2022

  15. [15]

    SDXL: Improving latent diffusion models for high-resolution image synthesis

    Dustin Podell, Zion English, Kyle Lacey, Andreas Blattmann, Tim Dockhorn, Jonas Müller, Joe Penna, and Robin Rombach. SDXL: Improving latent diffusion models for high-resolution image synthesis. In The Twelfth International Conference on Learning Representations (ICLR), 2024

  16. [16]

    Learning transferable visual models from natural language supervision

    Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning (ICML), pages 8748–8763. PMLR, 2021

  17. [17]

    MDShare: Downloading sample datasets for molecular dynamics analysis. https://markovmodel.github.io/mdshare

    Computational Molecular Biology Group, Freie Universität Berlin. MDShare: Downloading sample datasets for molecular dynamics analysis. https://markovmodel.github.io/mdshare. Accessed: 2026-05-06

  18. [18]

    Monte Carlo methods in statistical mechanics: Foundations and new algorithms

    Alan Sokal. Monte Carlo methods in statistical mechanics: Foundations and new algorithms. In Cecile DeWitt-Morette, Pierre Cartier, and Antoine Folacci, editors, Functional Integration: Basics and Applications, pages 131–192. Springer, 1997

  19. [19]

    Practical Markov chain Monte Carlo. Statistical Science, 7(4):473–483, 1992

    Charles J Geyer. Practical Markov chain Monte Carlo. Statistical Science, 7(4):473–483, 1992

  20. [20]

    A. W. van der Vaart. Asymptotic Statistics. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 1998

  21. [21]

    Distribution of eigenvalues for some sets of random matrices. Mathematics of the USSR-Sbornik, 1(4):457–483, 1967

    Vladimir A Marčenko and Leonid A Pastur. Distribution of eigenvalues for some sets of random matrices. Mathematics of the USSR-Sbornik, 1(4):457–483, 1967

  22. [22]

    Joint convergence of sample cross-covariance matrices. ALEA: Latin American Journal of Probability and Mathematical Statistics, 20:395–423, 2023

    Monika Bhattacharjee, Arup Bose, and Apratim Dey. Joint convergence of sample cross-covariance matrices. ALEA: Latin American Journal of Probability and Mathematical Statistics, 20:395–423, 2023. doi: 10.30757/ALEA.v20-14

  23. [23]

    Spectral statistics for the difference of two Wishart matrices

    Santosh Kumar and S. Sai Charan. Spectral statistics for the difference of two Wishart matrices. Journal of Physics A: Mathematical and Theoretical, 53(50):505202, 2020. doi: 10.1088/1751-8121/abc3fe

  24. [24]

    Free Random Variables

    Dan V. Voiculescu, Kenneth J Dykema, and Alexandru Nica. Free Random Variables, volume 1 of CRM Monograph Series. American Mathematical Society, 1992

  25. [25]

    Lectures on the Combinatorics of Free Probability

    Alexandru Nica and Roland Speicher. Lectures on the Combinatorics of Free Probability, volume 335 of London Mathematical Society Lecture Note Series. Cambridge University Press, 2006

  26. [26]

    Spectral Analysis of Large Dimensional Random Matrices

    Zhidong Bai and Jack W Silverstein. Spectral Analysis of Large Dimensional Random Matrices. Springer Series in Statistics. Springer, 2nd edition, 2010

  27. [27]

    Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices. The Annals of Probability, 33(5):1643–1697, 2005

    Jinho Baik, Gérard Ben Arous, and Sandrine Péché. Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices. The Annals of Probability, 33(5):1643–1697, 2005

  28. [28]

    The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices. Advances in Mathematics, 227(1):494–521, 2011

    Florent Benaych-Georges and Raj Rao Nadakuditi. The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices. Advances in Mathematics, 227(1):494–521, 2011

  29. [29]

    Local stability of the free additive convolution. Journal of Functional Analysis, 271(3):672–719, 2016

    Zhigang Bao, László Erdős, and Kevin Schnelli. Local stability of the free additive convolution. Journal of Functional Analysis, 271(3):672–719, 2016

  30. [30]

    Tracy–Widom distribution for the largest eigenvalue of real sample covariance matrices with general population. The Annals of Applied Probability, 26(6):3786–3839, 2016

    Ji Oon Lee and Kevin Schnelli. Tracy–Widom distribution for the largest eigenvalue of real sample covariance matrices with general population. The Annals of Applied Probability, 26(6):3786–3839, 2016

  31. [31]

    Multiscale Methods: Averaging and Homogenization

    Grigorios A. Pavliotis and Andrew M. Stuart. Multiscale Methods: Averaging and Homogenization, volume 53 of Texts in Applied Mathematics. Springer, 2008

  32. [32]

    Metastability: A Potential-Theoretic Approach

    Anton Bovier and Frank den Hollander. Metastability: A Potential-Theoretic Approach, volume 351 of Grundlehren der mathematischen Wissenschaften. Springer, 2015

  33. [33]

    H. A. Kramers. Brownian motion in a field of force and the diffusion model of chemical reactions. Physica, 7(4):284–304, 1940

  34. [34]

    Dynamical Theories of Brownian Motion

    Edward Nelson. Dynamical Theories of Brownian Motion. Princeton University Press, 1967
