arxiv: 2604.11729 · v1 · submitted 2026-04-13 · 🧮 math.PR · cs.DS· cs.LG· math.ST· stat.TH

Recognition: unknown

Universality of first-order methods on random and deterministic matrices

Chris Jones, Dmitriy Kunisky, Lucas Pesenti, Nicola Gorini

Pith reviewed 2026-05-10 16:13 UTC · model grok-4.3

classification 🧮 math.PR cs.DScs.LGmath.STstat.TH

keywords traffic distributiongeneral first-order methodsapproximate message passingdeterministic matricesWalsh-HadamardOnsager correctionlimiting dynamicsdiagrammatic expansions

0 comments

The pith

The traffic distribution of structured deterministic matrices determines the limiting dynamics of general first-order methods on them.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

General first-order methods combine matrix-vector multiplications with entrywise nonlinearities, and their large-n behavior is known for random matrices but had been unclear for structured deterministic ones. The paper introduces the traffic distribution of an input matrix as the collection of all limiting values of permutation-invariant polynomials in its entries. It computes this distribution explicitly for the first nontrivial deterministic cases, including minor variants of the Walsh-Hadamard matrix and the discrete sine and cosine transform matrices. These calculations fix the asymptotic dynamics of the methods on those inputs and settle parts of conjectures from 1994. A new approximate message passing iteration is also constructed whose state remains Gaussian conditional on latent random variables, covering both random and the new deterministic matrices through a simple combinatorial view of the Onsager correction.

Core claim

We calculate the traffic distribution for the first non-trivial deterministic matrices, including minor variants of the Walsh-Hadamard and discrete sine and cosine transform matrices. This determines the limiting dynamics of GFOM on these inputs, resolving parts of longstanding conjectures of Marinari, Parisi, and Ritort (1994). We design a new AMP iteration which unifies several previous AMP variants and generalizes to new input types, whose limiting dynamics are Gaussian conditional on some latent random variables. The asymptotic dynamics hold for a large and natural class of traffic distributions encompassing both random and deterministic input matrices.

What carries the argument

The limiting traffic distribution of the input matrix, defined as the collection of all limiting values of permutation-invariant polynomials in the matrix entries, which carries the diagrammatic expansion analysis of GFOM dynamics.

Load-bearing premise

The limiting traffic distribution of the input matrix exists and fully determines the large-n dynamics of GFOM through diagrammatic expansions for the claimed deterministic matrices.

What would settle it

Numerically run GFOM iterations on a large Walsh-Hadamard matrix and check whether the observed state distributions match the explicit predictions obtained from the calculated traffic distribution.

Figures

Figures reproduced from arXiv: 2604.11729 by Chris Jones, Dmitriy Kunisky, Lucas Pesenti, Nicola Gorini.

**Figure 1.** Figure 1: A cactus graph in C. Intuitively, a cactus is a “tree of cycles”. Definition 1.3 (Scalar graph polynomials). Given α ∈ A and A ∈ R n×n sym , define polynomials wα(A), zα(A) ∈ R[A] by: wα(A) = X i:V (α)→[n] Y {u,v}∈E(α) A[i(u), i(v)] , zα(A) = X i:V (α),→[n] Y {u,v}∈E(α) A[i(u), i(v)] . 1This notion is sometimes more specifically called a bridgeless cactus; in this paper we take this to be part of the defin… view at source ↗

**Figure 2.** Figure 2: Examples of treelike and Gaussian diagrams. The root vertex is circled. [PITH_FULL_IMAGE:figures/full_fig_p012_2.png] view at source ↗

**Figure 3.** Figure 3: Example for Proposition 5.8 of a 2-edge-connected graph which is not a cactus. If the open cactus in red is removed, the graph remains 2-edge-connected. To prove Proposition 5.8, we will consider the last ear in an ear decomposition of α. We prove a small variant of the classical ear decomposition (see [Rob39] or [BM08, §5.3]) which lets us exclude a specified vertex from the internal vertices of the last … view at source ↗

**Figure 4.** Figure 4: Illustration of the main diagrammatic manipulations in the proof of [PITH_FULL_IMAGE:figures/full_fig_p045_4.png] view at source ↗

**Figure 5.** Figure 5: Our Feynman diagram notation vs. the ’t Hooft double line notation. [PITH_FULL_IMAGE:figures/full_fig_p074_5.png] view at source ↗

**Figure 6.** Figure 6: The 12 Feynman diagrams associated to ⟨Tr A4 ⟩GOE. The two red edges are matched in the orientation specified by the arrows, and similarly for the blue edges. Gluing either of the left two diagrams results in a “taco”. Gluing the remaining diagrams results in degenerate polyhedra. After gluing, the “tacos” have 3 vertices, the middle diagrams have 2 vertices, and the right diagrams have 1 vertex, respect… view at source ↗

**Figure 7.** Figure 7: An illustration of the Kreweras complement operation on non-crossing partitions. The [PITH_FULL_IMAGE:figures/full_fig_p080_7.png] view at source ↗

**Figure 8.** Figure 8: An illustration of the matchings involved in [PITH_FULL_IMAGE:figures/full_fig_p083_8.png] view at source ↗

read the original abstract

General first-order methods (GFOM) are a flexible class of iterative algorithms which update a state vector by matrix-vector multiplications and entrywise nonlinearities. A long line of work has sought to understand the large-n dynamics of GFOM, mostly focusing on "very random" input matrices and the approximate message passing (AMP) special case of GFOM whose state is asymptotically Gaussian. Yet, it has long remained unknown how to construct iterative algorithms that retain this Gaussianity for more structured inputs, or why existing AMP algorithms can be as effective for some deterministic matrices as they are for random matrices. We analyze diagrammatic expansions of GFOM via the limiting traffic distribution of the input matrix, the collection of all limiting values of permutation-invariant polynomials in the matrix entries, to obtain the following results: 1. We calculate the traffic distribution for the first non-trivial deterministic matrices, including (minor variants of) the Walsh-Hadamard and discrete sine and cosine transform matrices. This determines the limiting dynamics of GFOM on these inputs, resolving parts of longstanding conjectures of Marinari, Parisi, and Ritort (1994). 2. We design a new AMP iteration which unifies several previous AMP variants and generalizes to new input types, whose limiting dynamics are Gaussian conditional on some latent random variables. The asymptotic dynamics hold for a large and natural class of traffic distributions (encompassing both random and deterministic input matrices) and the algorithm's analysis gives a simple combinatorial interpretation of the Onsager correction, answering questions posed recently by Wang, Zhong, and Fan (2022).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives explicit traffic distributions for Walsh-Hadamard and DCT/DST matrices and a new unifying AMP that keeps conditional Gaussianity across a wide class of inputs.

read the letter

The main new pieces are the explicit traffic distributions for variants of the Walsh-Hadamard matrix and the discrete sine and cosine transforms, plus a single AMP iteration that covers both random and these deterministic cases while keeping the state conditionally Gaussian. That directly addresses parts of the 1994 Marinari-Parisi-Ritort conjectures and gives a combinatorial reading of the Onsager term that matches questions raised by Wang, Zhong, and Fan in 2022. The diagrammatic expansion route from the traffic distribution to the algorithm dynamics looks clean and avoids the usual random-matrix restrictions. What works is the concrete calculations for those structured matrices and the unification of earlier AMP flavors into one framework that still applies when the input is deterministic. The approach stays grounded in the matrix entries rather than fitted parameters. The soft spots are limited. Everything rests on the traffic distribution existing and the expansions being valid for these matrices; the abstract states the program clearly but the strength of the claims will depend on how tightly the full derivations are written. No internal contradictions show up in the stated scope. This is for people working on asymptotic analysis of first-order methods and AMP extensions. Anyone who needs to run these algorithms on transform-based or Hadamard-structured inputs will find usable new examples and a broader set of matrices where the Gaussian analysis still holds. It deserves a serious referee. The results fill a documented gap and the tools are reproducible in principle, so the paper should go out for review even if some sections need tightening on the explicit computations.

Referee Report

2 major / 3 minor

Summary. The paper analyzes the large-n dynamics of general first-order methods (GFOM) on both random and deterministic matrices by means of diagrammatic expansions controlled by the limiting traffic distribution of the input matrix (the collection of all limiting permutation-invariant polynomials in the entries). It explicitly computes the traffic distributions for the first non-trivial deterministic families, including minor variants of the Walsh-Hadamard matrix and the discrete sine/cosine transform matrices, thereby determining the limiting GFOM dynamics on these inputs and partially resolving conjectures of Marinari, Parisi, and Ritort (1994). It further introduces a new AMP iteration that unifies prior variants, yields limiting Gaussian dynamics conditional on latent random variables, and supplies a combinatorial interpretation of the Onsager correction; the analysis is claimed to hold for a broad natural class of traffic distributions that includes both random and deterministic matrices.

Significance. If the derivations are correct, the work provides the first explicit traffic-distribution calculations for classical deterministic orthogonal matrices and shows that these suffice to control GFOM asymptotics, thereby extending the AMP framework beyond the very-random regime. The combinatorial account of the Onsager term and the conditional-Gaussian construction are technically attractive and address open questions raised by Wang, Zhong, and Fan (2022). The results are likely to be of interest to researchers in high-dimensional statistics, optimization, and statistical physics.

major comments (2)

[§4] §4 (traffic-distribution calculations): the explicit evaluation of the limiting traffic distribution for the Walsh-Hadamard and DCT/DST variants is central to claim 1, yet the manuscript provides only the final expressions; the intermediate combinatorial counting arguments that establish the existence of the limit and the vanishing of all non-permutation-invariant moments should be expanded so that the reader can verify the claimed values without external computation.
[§5.3] §5.3 (new AMP iteration): the proof that the state remains conditionally Gaussian given the latent variables relies on a diagrammatic expansion that closes under the chosen traffic distribution class. The manuscript should supply a self-contained inductive step showing that the Onsager correction exactly cancels the non-Gaussian contributions at each iteration; without this, the unification claim rests on an unverified closure property.

minor comments (3)

[Definition 2.3] Notation for the traffic distribution (Definition 2.3) uses the same symbol for both the finite-n empirical measure and its limit; a distinct symbol for the limiting object would improve readability.
[Introduction] The statement of the Marinari-Parisi-Ritort conjectures in the introduction should include a precise citation to the 1994 paper and a one-sentence summary of which parts are resolved by the new traffic-distribution calculations.
[Figure 1] Figure 1 (phase diagram) lacks error bars or confidence bands on the empirical curves; adding these would make the agreement with the theoretical predictions easier to assess.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive comments on our manuscript. We address each major comment point by point below and indicate the changes we will make in the revised version.

read point-by-point responses

Referee: [§4] §4 (traffic-distribution calculations): the explicit evaluation of the limiting traffic distribution for the Walsh-Hadamard and DCT/DST variants is central to claim 1, yet the manuscript provides only the final expressions; the intermediate combinatorial counting arguments that establish the existence of the limit and the vanishing of all non-permutation-invariant moments should be expanded so that the reader can verify the claimed values without external computation.

Authors: We agree that the intermediate combinatorial counting arguments are necessary for independent verification of the traffic distributions. In the revised manuscript we will expand §4 to include the full step-by-step combinatorial derivations establishing the existence of the limits and the vanishing of all non-permutation-invariant moments for the Walsh-Hadamard and DCT/DST variants. revision: yes
Referee: [§5.3] §5.3 (new AMP iteration): the proof that the state remains conditionally Gaussian given the latent variables relies on a diagrammatic expansion that closes under the chosen traffic distribution class. The manuscript should supply a self-contained inductive step showing that the Onsager correction exactly cancels the non-Gaussian contributions at each iteration; without this, the unification claim rests on an unverified closure property.

Authors: We acknowledge that an explicit self-contained inductive step would make the argument clearer. We will revise §5.3 to include a detailed inductive proof demonstrating that the diagrammatic expansion remains closed under the traffic-distribution class and that the Onsager correction exactly cancels non-Gaussian contributions at each iteration. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained from matrix entries

full rationale

The paper computes the limiting traffic distributions explicitly from the entrywise definitions of the deterministic matrices (Walsh-Hadamard, DCT, DST variants) and feeds these into diagrammatic expansions to obtain GFOM dynamics. The new AMP construction and its Gaussianity conditional on latents are likewise derived directly from the traffic-distribution class without any reduction to fitted parameters, self-referential definitions, or load-bearing self-citations. All steps remain grounded in the input matrix and the combinatorial structure of the expansions, satisfying the requirement that the central claims are independent of the outputs they produce.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claims rest on the existence and sufficiency of the limiting traffic distribution for determining GFOM dynamics; this is treated as a domain assumption drawn from prior work on random matrices and extended here.

axioms (1)

domain assumption The limiting traffic distribution of the input matrix exists and determines the large-n dynamics of GFOM via diagrammatic expansions.
Invoked as the core analytical tool to obtain both the deterministic-matrix calculations and the new AMP analysis.

pith-pipeline@v0.9.0 · 5602 in / 1281 out tokens · 40092 ms · 2026-05-10T16:13:29.780527+00:00 · methodology

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

High-Dimensional Statistics: Reflections on Progress and Open Problems
math.ST 2026-05 unverdicted novelty 2.0

A survey synthesizing representative advances, common themes, and open problems in high-dimensional statistics while pointing to key entry-point works.

Reference graph

Works this paper leans on

13 extracted references · 2 canonical work pages · cited by 1 Pith paper

[1]

Approximate message passing algorithms for rotationally invariant matrices.Annals of Statistics, 50(1):197–224, 2022

72 [Fan22] Zhou Fan. Approximate message passing algorithms for rotationally invariant matrices.Annals of Statistics, 50(1):197–224, 2022. 3, 10, 12, 59 [FVRS22] Oliver Y. Feng, Ramji Venkataramanan, Cynthia Rush, and Richard J. Samworth. A Unifying Tutorial on Approximate Message Passing.Foundations and Trends in Machine Learning, 15(4):335–536, 2022. 1,...

2022
[2]

State evolution for general approximate message passing algorithms, with applications to spatial coupling.Information and Inference: A Journal of the IMA, 2(2):115–144, 2013

6 [JM13] Adel Javanmard and Andrea Montanari. State evolution for general approximate message passing algorithms, with applications to spatial coupling.Information and Inference: A Journal of the IMA, 2(2):115–144, 2013. 3, 10, 11, 25, 62, 65 [JP25] Chris Jones and Lucas Pesenti. Fourier analysis of iterative algorithms. In52nd International Colloquium on...

work page arXiv 2013
[3]

Graphical models concepts in compressed sensing

3 [Mon12] Andrea Montanari. Graphical models concepts in compressed sensing. In Yonina C. Eldar and Gitta Kutyniok, editors,Compressed Sensing: Theory and Applications, pages 394–438. Cambridge University Press, 2012. 1, 11 [Mon19] Andrea Montanari. Optimization of the Sherrington-Kirkpatrick Hamiltonian. In60th IEEE Annual Symposium on Foundations of Com...

2012
[4]

World Scientific, 1987

6 [MPV87] Marc Mézard, Giorgio Parisi, and Miguel Angel Virasoro.Spin glass theory and beyond: An Introduction to the Replica Method and Its Applications, volume 9. World Scientific, 1987. 3 [MR15] Andrea Montanari and Emile Richard. Non-negative principal component analysis: message passing algorithms and sharp asymptotics.IEEE Transactions on Informatio...

1987
[5]

JT gravity as a matrix integral,

12 [Pes26] Lucas Pesenti.Algorithms beyond the union bound: polynomial optimization and discrepancy theory. PhD thesis, Bocconi University, 2026. 2 [Pet82] L.C. Petersen. On the relation between the multidimensional moment problem and the one-dimensional moment problem.Mathematica Scandinavica, 51:361–366, 1982. 86 [PP95] Giorgio Parisi and Marc Potters. ...

work page arXiv 2026
[6]

Adaptive damping and mean removal for the generalized approximate message passing algorithm

70, 73 [VSR+15] Jeremy Vila, Philip Schniter, Sundeep Rangan, Florent Krzakala, and Lenka Zdeborová. Adaptive damping and mean removal for the generalized approximate message passing algorithm. InIEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2015), pages 2021–2025. IEEE, 2015. 3 [WZF22] Tianhao Wang, Xinyi Zhong, and Zho...

2015
[7]

Related planarity phenomena also appear in mathematics, for example in the connections between large random matrices and non-crossing pairings

In the limitn→∞, only planar diagrams contribute at leading order, an observation going back to foundational work of ’t Hooft [tH74, BIPZ78]. Related planarity phenomena also appear in mathematics, for example in the connections between large random matrices and non-crossing pairings
[8]

combinatorially rigorous

In special scaling limits of the potential withn, the Feynman diagram expansion can be interpreted in terms of physical theories such as 2D gravity and certain string-theoretic models [DFGZJ95, CGH+17, SSS19]. The combinatorial approach in this paper fits naturally into this perspective. First, our results are formulated in the large-n limit, and the domi...
[9]

All joint moments converge: for anyk≥1and α1,...,αk∈A, the limit of the joint moments limn→∞E [ k∏ i=1 x(n) αi ] (61) exists
[10]

Lemma C.3(Truncation).Let(x n)n≥1and(y n)n≥1be sequences of random variables such that

All marginals are subexponential: for everyα∈A, there existsCα> 0such that for all p≥1, limn→∞E ( x(n) α )2p ≤(Cαp)2p.(62) Thenx (n) converges in distribution to the unique law onRA with moments given by Eq.(61). Lemma C.3(Truncation).Let(x n)n≥1and(y n)n≥1be sequences of random variables such that
[11]

2.(x n)n≥1is tight, i.e.,supn≥1Pr(|xn|>K)−→ K→∞ 0

For anyK >0, conditionally on|xn|≤K,(yn)n≥1converges in distribution. 2.(x n)n≥1is tight, i.e.,supn≥1Pr(|xn|>K)−→ K→∞ 0. Then,(y n)n≥1converges in distribution. Proof.First, we prove: 86 Claim C.4.(y n)n≥1is tight. Proof. For anyK,L>0, we havePr(|yn|>L)≤Pr(|yn|>L||xn|≤K)+ Pr(|xn|>K). PickK large enough so that the second term is bounded byεuniformly inn.(...
[12]

In this case, there are no non-core vertices to match on the path fromuj i tov j i, so the induced matching is empty (and hence trivially order-preserving)

Eithervj i =wj i for bothj∈{1,2}. In this case, there are no non-core vertices to match on the path fromuj i tov j i, so the induced matching is empty (and hence trivially order-preserving)
[13]

In this case, by induction, the matching betweenvj i and wj i is order-preserving

Or vj i ̸= wj i for bothj∈{1,2}. In this case, by induction, the matching betweenvj i and wj i is order-preserving. Matchingv1 i to v2 i and adding the matching fromvj i to wj i yields an order-preserving matching fromuj i tow j i. By induction, the restriction ofP induces an isomorphism between the cores ofτ1 and τ2 within each subtree rooted atvj i. Sin...