Pointwise Complexity for Gaussian Fields: Upper Envelopes, Algorithmic Lower Bounds, and Separation

Yunbei Xu

arxiv: 2606.07931 · v2 · pith:LWJ6BNCHnew · submitted 2026-06-06 · 🧮 math.PR · cond-mat.stat-mech· cs.IT· cs.LG· math.IT· math.ST· stat.TH

Pointwise Complexity for Gaussian Fields: Upper Envelopes, Algorithmic Lower Bounds, and Separation

Yunbei Xu This is my paper

Pith reviewed 2026-07-04 00:05 UTC · model grok-4.3

classification 🧮 math.PR cond-mat.stat-mechcs.ITcs.LGmath.ITmath.STstat.TH MSC 60G15

keywords Gaussian processesmajorizing measurespointwise boundsgeneric chainingalgorithmic lower boundsFano principle

0 comments

The pith

Pointwise majorizing measures provide high-probability envelopes for Gaussian fields

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proves a variance-aware pointwise majorizing-measure theorem for centered Gaussian processes. This gives a simultaneous high-probability envelope for the entire field at each point x, governed by the integral Φ_μ(x) that incorporates local variance and the ambient prior measure of balls. The result refines classical generic chaining and connects to Bayesian algorithmic lower bounds. It allows construction of examples where pointwise complexity separates from classical minimax theory in overparameterized settings.

Core claim

The theorem states that for centered Gaussian processes, the envelope at x is governed by Φ_μ(x) := ∫_0^{4σ(x)} √log(1/μ(B_d(x,ε))) dε together with the corresponding Gaussian tail term, providing a field-level refinement of generic chaining.

What carries the argument

The pointwise Fernique-Talagrand functional Φ_μ(x), an integral over the local scale of the square root log reciprocal of the prior measure on metric balls.

If this is right

It provides a reusable field-level refinement of classical generic chaining.
It offers a Gaussian-process counterpart of pointwise empirical-process bounds for deep neural networks.
It records a Bayesian algorithmic lower envelope from the interactive Fano/data-processing principle using ghost small-ball mass.
The separation example shows algorithmic lower bounds can validate pointwise complexity where minimax theory is too coarse.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach may allow tighter analysis of specific estimators in high-dimensional models by using known priors.
Recasting in minimax language as penalty-range information relaxation raises questions about algorithmic robustness for regularized estimators.
Comparison decoders in Gaussian location experiments suggest similar lower bounds could apply to other observation models.

Load-bearing premise

The ambient prior μ is fixed and known in advance, and the metric d is such that the pointwise integral majorizes the local oscillations of the Gaussian field.

What would settle it

Observe a centered Gaussian process with a fixed prior μ where realized values exceed the Φ_μ(x) envelope plus the Gaussian tail term on a set of positive probability beyond the bound.

read the original abstract

We prove a variance-aware pointwise majorizing-measure theorem for centered Gaussian processes. Classical generic chaining characterizes the scalar quantity $\mathbb E\sup_{x\in T}X_x$; the theorem here gives a simultaneous high-probability envelope for the entire field. For an ambient prior $\mu$, the envelope at $x$ is governed by a pointwise Fernique-Talagrand functional \[\Phi_\mu(x):=\int_0^{4\sigma(x)}\sqrt{\log\frac{1}{\mu(B_d(x,\varepsilon))}}\,d\varepsilon,\] together with the corresponding Gaussian tail term. The theorem provides a reusable field-level refinement of classical generic chaining and a Gaussian-process counterpart of pointwise empirical-process bounds for deep neural networks. We also record a Bayesian algorithmic lower envelope from the interactive Fano/data-processing principle. For a known prior $\pi$, an observation channel, and a concrete estimator $\widehat t(Y)$, the lower bound is expressed through the exact ghost small-ball mass $\mathbb E_{Y\sim Q}\pi(B_d(\widehat t(Y),\Delta))$, rather than a worst-case covering number. In Gaussian location experiments, comparison decoders convert Bayes location error into lower bounds on decision-aligned Gaussian ranges. We then construct an elementary example separating the usual Fano relaxation, the Bayesian algorithmic lower envelope, the pointwise Gaussian envelope, and the full-class minimax risk. Together, these results show that algorithmic lower bounds provide local-geometric validations of pointwise complexity for fixed estimators in overparameterized ambient classes, precisely in regimes where classical minimax theory becomes either too coarse or oracle-dependent. This separation can also be recast in minimax language as penalty-range information relaxation, highlighting an important question of algorithmic robustness for classical high-dimensional models and regularized algorithms.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper localizes generic chaining to pointwise envelopes via a small-ball integral and separates four different complexity quantities with an elementary example.

read the letter

The core contribution is a pointwise majorizing-measure result for centered Gaussian processes. For a fixed prior μ the envelope at each x is controlled by the integral Φ_μ(x) = ∫_0^{4σ(x)} √log(1/μ(B_d(x,ε))) dε plus a Gaussian tail term. This is presented as a direct localization of the classical Fernique-Talagrand functional rather than a uniform bound on the supremum.

What is actually new is the explicit construction of that pointwise functional together with the Bayesian algorithmic lower envelope that uses the exact ghost small-ball mass E_{Y~Q} π(B_d(ˆt(Y), Δ)) instead of a worst-case covering number. The paper also supplies a concrete four-way separation among the Fano relaxation, the Bayesian algorithmic bound, the pointwise Gaussian envelope, and the full minimax risk. That separation is elementary and shows regimes where classical minimax becomes too coarse or oracle-dependent.

The work sits on standard tools (Fernique-Talagrand, Fano/data-processing, comparison decoders) without obvious circularity or unstated uniformity requirements that would break the pointwise claim. The stress-test note finds no internal contradiction between the envelope, the lower bound, and the separation example.

The main limitation is the standing assumption that the ambient prior μ is fixed and known in advance and that the metric d makes the integral actually majorize local oscillations. If an application lacks such a prior, the bound does not apply directly. The abstract does not report numerical checks or explicit constants, so tightness in concrete cases remains to be verified from the full proofs.

This is for readers already working with generic chaining or pointwise bounds in Gaussian fields, statistical learning, or neural-network theory. Someone who needs local rather than uniform control or who wants to distinguish algorithmic lower bounds from minimax ones will find the separation useful.

I would send it to peer review. The localization and the separation are concrete enough to merit referee attention even if the proofs need tightening.

Referee Report

0 major / 3 minor

Summary. The paper proves a variance-aware pointwise majorizing-measure theorem for centered Gaussian processes. For an ambient prior μ, the envelope at x is governed by the pointwise Fernique-Talagrand functional Φ_μ(x) := ∫_0^{4σ(x)} √log(1/μ(B_d(x,ε))) dε together with a Gaussian tail term. It also records a Bayesian algorithmic lower envelope expressed via the exact ghost small-ball mass E_{Y∼Q} π(B_d(ˆt(Y), Δ)) and constructs an elementary separation example among the usual Fano relaxation, the Bayesian lower envelope, the pointwise Gaussian envelope, and the full-class minimax risk.

Significance. If the central claims hold, the pointwise envelope supplies a reusable, field-level refinement of classical generic chaining that directly controls local oscillations rather than only the scalar supremum. The Bayesian lower bound via exact ghost small-ball mass, rather than worst-case covering numbers, and the explicit separation example are strengths; they furnish concrete, falsifiable local-geometric validations of complexity for fixed estimators in overparameterized classes where classical minimax theory is coarse.

minor comments (3)

The abstract states that the metric d is such that Φ_μ(x) majorizes local oscillations, but the precise regularity conditions on d and the relation between σ(x) and the canonical metric should be stated explicitly in the theorem statement (likely §2 or §3).
Notation for the observation channel Q and the ghost small-ball mass should be introduced with a short display equation when first used, to improve readability for readers outside the immediate subfield.
The separation example is described as 'elementary'; adding a short table or diagram comparing the four quantities numerically for the chosen Gaussian location model would make the distinction concrete without lengthening the text.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. No major comments appear in the provided report, so there are no specific points requiring point-by-point response.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The derivation introduces the pointwise functional Φ_μ(x) as an explicit integral over the small-ball function of a fixed ambient prior μ and metric d; this is presented as a direct localization of the classical Fernique-Talagrand majorizing-measure construction rather than a self-referential definition. The Bayesian lower envelope is likewise expressed directly in terms of the ghost small-ball mass E_{Y∼Q} π(B_d(ˆt(Y),Δ)) under a known prior and observation channel, without fitting parameters or renaming fitted quantities as predictions. The separation example is constructed explicitly to distinguish the listed quantities and does not rely on any self-citation chain or uniqueness theorem imported from the authors' prior work. All load-bearing steps invoke standard external tools (generic chaining, Fano/data-processing) whose validity is independent of the present claims.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the standard definition of a centered Gaussian process indexed by a metric space and on the existence of a fixed ambient prior μ; no free parameters or new entities are introduced in the abstract.

axioms (1)

domain assumption The random field is a centered Gaussian process with respect to a metric d.
Invoked in the statement of the pointwise majorizing-measure theorem.

pith-pipeline@v0.9.1-grok · 5878 in / 1229 out tokens · 22235 ms · 2026-07-04T00:05:00.513332+00:00 · methodology

Review history (2 revisions) →

discussion (0)

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Bellman-sufficient Information Complexity
cs.LG 2026-06 unverdicted novelty 7.0

Introduces Bellman-sufficient information complexity as a representation-level framework for sequential decision making that unifies upper and lower bounds through a conditional Bellman information-risk sandwich.
Bellman-sufficient Information Complexity
cs.LG 2026-06 unverdicted novelty 6.0

Proposes Bellman-sufficient state representations and information indices Y=χ(Ω) to organize sequential decision making with a conditional Bellman information-risk sandwich providing matching upper and lower complexit...

Reference graph

Works this paper leans on

48 extracted references · 48 canonical work pages · cited by 1 Pith paper · 4 internal anchors

[1]

Marginal triviality of the scaling limits of critical 4D Ising and $ ^4_4$ models

Michael Aizenman and Hugo Duminil-Copin. Marginal triviality of the scaling limits of critical 4D Ising and $ ^4_4$ models. Annals of Mathematics, 194(1):163--235, 2021

work page 2021
[2]

Diffusions hypercontractives

Dominique Bakry and Michel Emery. Diffusions hypercontractives. In S\'eminaire de probabilit\'es XIX, Lecture Notes in Mathematics, vol. 1123, pages 177--206. Springer, 1985

work page 1985
[3]

Brydges, and Gordon Slade

Roland Bauerschmidt, David C. Brydges, and Gordon Slade. Introduction to a Renormalisation Group Method. Lecture Notes in Mathematics, vol. 2242. Springer, 2019

work page 2019
[4]

Barron, Lucien Birg\'e, and Pascal Massart

Andrew R. Barron, Lucien Birg\'e, and Pascal Massart. Risk bounds for model selection via penalization. Probability Theory and Related Fields, 113:301--413, 1999

work page 1999
[5]

Mikhail Belkin, Daniel Hsu, and Partha P. Mitra. Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate. In Advances in Neural Information Processing Systems 31, 2018

work page 2018
[6]

Square-root lasso: pivotal recovery of sparse signals via conic programming

Alexandre Belloni, Victor Chernozhukov, and Lie Wang. Square-root lasso: pivotal recovery of sparse signals via conic programming. Biometrika, 98(4):791--806, 2011

work page 2011
[7]

Bickel, Ya'acov Ritov, and Alexandre B

Peter J. Bickel, Ya'acov Ritov, and Alexandre B. Tsybakov. Simultaneous analysis of lasso and Dantzig selector. The Annals of Statistics, 37(4):1705--1732, 2009

work page 2009
[8]

Brown, James E

David B. Brown, James E. Smith, and Peng Sun. Information relaxations and duality in stochastic dynamic programs. Operations Research, 58(4, Part 1):785--801, 2010

work page 2010
[9]

Statistics for High-Dimensional Data: Methods, Theory and Applications

Peter B\"uhlmann and Sara van de Geer. Statistics for High-Dimensional Data: Methods, Theory and Applications. Springer, 2011

work page 2011
[10]

Minimal penalties for Gaussian model selection

Lucien Birg\'e and Pascal Massart. Minimal penalties for Gaussian model selection. Probability Theory and Related Fields, 138(1--2):33--73, 2007

work page 2007
[11]

Majorizing measures, sequential complexities, and online learning

Adam Block, Yuval Dagan, and Alexander Rakhlin. Majorizing measures, sequential complexities, and online learning. In Proceedings of the 34th Annual Conference on Learning Theory, vol. 134, pages 1--4, 2021

work page 2021
[12]

Convex set functions in $d$-space

Christer Borell. Convex set functions in $d$-space. Periodica Mathematica Hungarica, 6:111--136, 1975

work page 1975
[13]

Herm Jan Brascamp and Elliott H. Lieb. On extensions of the Brunn--Minkowski and Prekopa--Leindler theorems, including inequalities for log-concave functions, and with an application to the diffusion equation. Journal of Functional Analysis, 22(4):366--389, 1976

work page 1976
[14]

Brydges and Gordon Slade

David C. Brydges and Gordon Slade. A renormalisation group method. V. A single renormalisation group step. Journal of Statistical Physics, 159:589--667, 2015

work page 2015
[15]

Yang-Mills for probabilists

Sourav Chatterjee. Yang--Mills for probabilists. arXiv:1803.01950, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018
[16]

Foster, Yanjun Han, Jian Qian, Alexander Rakhlin, and Yunbei Xu

Fan Chen, Dylan J. Foster, Yanjun Han, Jian Qian, Alexander Rakhlin, and Yunbei Xu. Assouad, Fano, and Le Cam with interaction: A unifying lower bound framework and characterization for bandit learnability. arXiv:2410.05117, 2024

work page arXiv 2024
[17]

Cover and Peter E

Thomas M. Cover and Peter E. Hart. Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1):21--27, 1967

work page 1967
[18]

Approximation by superpositions of a sigmoidal function

George Cybenko. Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals and Systems, 2:303--314, 1989

work page 1989
[19]

Lee, and Yuval Peres

Jian Ding, James R. Lee, and Yuval Peres. Cover times, blanket times, and majorizing measures. Annals of Mathematics, 175(3):1409--1471, 2012

work page 2012
[20]

E. B. Dynkin. Local times and quantum fields. In Seminar on Stochastic Processes, 1983. Birkh\"auser, 1984

work page 1983
[21]

S., Kakade, S

Du, S. S., Kakade, S. M., Lee, J. D., Lovett, S., Mahajan, G., Sun, W., and Wang, R. (2021). Bilinear classes: A structural framework for provable generalization in RL. In Proceedings of the 38th International Conference on Machine Learning, PMLR 139:2826--2836

work page 2021
[22]

R\'egularit\'e des trajectoires des fonctions al\'eatoires gaussiennes

Xavier Fernique. R\'egularit\'e des trajectoires des fonctions al\'eatoires gaussiennes. In \'Ecole d'\'Et\'e de Probabilit\'es de Saint-Flour IV--1974, Lecture Notes in Mathematics, vol. 480. Springer, 1975

work page 1974
[23]

The statistical complexity of interactive decision making,

Dylan J. Foster, Sham M. Kakade, Jian Qian, and Alexander Rakhlin. The statistical complexity of interactive decision making. arXiv:2112.13487, 2021

work page arXiv 2021
[24]

Foster, Noah Golowich, and Yanjun Han

Dylan J. Foster, Noah Golowich, and Yanjun Han. Tight guarantees for interactive decision making with the decision-estimation coefficient. In Proceedings of the 36th Annual Conference on Learning Theory, vol. 195, pages 1--75, 2023

work page 2023
[25]

Statistical Learning with Sparsity: The Lasso and Generalizations

Trevor Hastie, Robert Tibshirani, and Martin Wainwright. Statistical Learning with Sparsity: The Lasso and Generalizations. Chapman and Hall/CRC, 2015

work page 2015
[26]

Multilayer feedforward networks are universal approximators

Kurt Hornik, Maxwell Stinchcombe, and Halbert White. Multilayer feedforward networks are universal approximators. Neural Networks, 2(5):359--366, 1989

work page 1989
[27]

Quantum Yang--Mills theory

Arthur Jaffe and Edward Witten. Quantum Yang--Mills theory. Clay Mathematics Institute Millennium Prize Problem description, 2000

work page 2000
[28]

Isoperimetric problems for convex bodies and a localization lemma

Ravi Kannan, Laszlo Lovasz, and Miklos Simonovits. Isoperimetric problems for convex bodies and a localization lemma. Discrete & Computational Geometry, 13:541--559, 1995

work page 1995
[29]

Pointwise Generalization in Deep Neural Networks

Shaojie Li and Yunbei Xu. Pointwise generalization in deep neural networks. arXiv:2605.18598, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[30]

Probability in Banach Spaces: Isoperimetry and Processes

Michel Ledoux and Michel Talagrand. Probability in Banach Spaces: Isoperimetry and Processes. Springer, 1991

work page 1991
[31]

Leonid A. Levin. Average case complete problems. SIAM Journal on Computing, 15(1):285--286, 1986

work page 1986
[32]

The geometry of logconcave functions and sampling algorithms

Laszlo Lovasz and Santosh Vempala. The geometry of logconcave functions and sampling algorithms. Random Structures & Algorithms, 30(3):307--358, 2007

work page 2007
[33]

A Note on Pointwise Dimensions

Neil Lutz. A note on pointwise dimensions. arXiv:1612.05849, 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016
[34]

On the power of adaptivity for -best arm identification in linear bandits

Arnab Maiti, Yunbei Xu, and Kevin Jamieson. On the power of adaptivity for -best arm identification in linear bandits. In Proceedings of the 39th Annual Conference on Learning Theory, vol. 336, pages 4911--4968, 2026

work page 2026
[35]

Negahban, Pradeep Ravikumar, Martin J

Sahand N. Negahban, Pradeep Ravikumar, Martin J. Wainwright, and Bin Yu. A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers. Statistical Science, 27(4):538--557, 2012

work page 2012
[36]

The duality proof of the Sudakov minoration

David Pollard. The duality proof of the Sudakov minoration. Lecture notes, Yale University, available at http://www.stat.yale.edu/ pollard/Notes/Sudakov.pdf, 2001

work page 2001
[37]

Wainwright, and Bin Yu

Garvesh Raskutti, Martin J. Wainwright, and Bin Yu. Minimax rates of estimation for high-dimensional linear regression over $ _q$-balls. IEEE Transactions on Information Theory, 57(10):6976--6994, 2011

work page 2011
[38]

Spielman and Shang-Hua Teng

Daniel A. Spielman and Shang-Hua Teng. Smoothed analysis of algorithms: Why the simplex algorithm usually takes polynomial time. Journal of the ACM, 51(3):385--463, 2004

work page 2004
[39]

The Generic Chaining: Upper and Lower Bounds of Stochastic Processes

Michel Talagrand. The Generic Chaining: Upper and Lower Bounds of Stochastic Processes. Springer, 2005

work page 2005
[40]

Regularity of Gaussian processes

Michel Talagrand. Regularity of Gaussian processes. Acta Mathematica, 159:99--149, 1987

work page 1987
[41]

Regression shrinkage and selection via the lasso

Robert Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1):267--288, 1996

work page 1996
[42]

Chaining, interpolation and convexity II: The contraction principle

Ramon van Handel. Chaining, interpolation and convexity II: The contraction principle. The Annals of Probability, 46(3):1764--1805, 2018

work page 2018
[43]

Wainwright

Martin J. Wainwright. High-Dimensional Statistics: A Non-Asymptotic Viewpoint. Cambridge University Press, 2019

work page 2019
[44]

Towards optimal problem dependent generalization error bounds in statistical learning theory

Yunbei Xu and Assaf Zeevi. Towards optimal problem dependent generalization error bounds in statistical learning theory. Mathematics of Operations Research, 2025

work page 2025
[45]

Bayesian design principles for frequentist sequential learning

Yunbei Xu and Assaf Zeevi. Bayesian design principles for frequentist sequential learning. Journal of the ACM, 2025

work page 2025
[46]

Andrew C. Yao. Probabilistic computations: Toward a unified measure of complexity. In Proceedings of the 18th Annual Symposium on Foundations of Computer Science, pages 222--227, 1977

work page 1977
[47]

A Bayesian Proof and Interpretation of Talagrand's Majorizing Measure Theorem

Ilias Zadik. A Bayesian proof and interpretation of Talagrand's majorizing measure theorem. arXiv preprint arXiv:2605.30321, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[48]

Information theoretical upper and lower bounds for statistical estimation

Tong Zhang. Information theoretical upper and lower bounds for statistical estimation. IEEE Transactions on Information Theory, 52(4):1307--1321, 2006

work page 2006

[1] [1]

Marginal triviality of the scaling limits of critical 4D Ising and \( ^4_4\) models

Michael Aizenman and Hugo Duminil-Copin. Marginal triviality of the scaling limits of critical 4D Ising and \( ^4_4\) models. Annals of Mathematics, 194(1):163--235, 2021

work page 2021

[2] [2]

Diffusions hypercontractives

Dominique Bakry and Michel Emery. Diffusions hypercontractives. In S\'eminaire de probabilit\'es XIX, Lecture Notes in Mathematics, vol. 1123, pages 177--206. Springer, 1985

work page 1985

[3] [3]

Brydges, and Gordon Slade

Roland Bauerschmidt, David C. Brydges, and Gordon Slade. Introduction to a Renormalisation Group Method. Lecture Notes in Mathematics, vol. 2242. Springer, 2019

work page 2019

[4] [4]

Barron, Lucien Birg\'e, and Pascal Massart

Andrew R. Barron, Lucien Birg\'e, and Pascal Massart. Risk bounds for model selection via penalization. Probability Theory and Related Fields, 113:301--413, 1999

work page 1999

[5] [5]

Mikhail Belkin, Daniel Hsu, and Partha P. Mitra. Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate. In Advances in Neural Information Processing Systems 31, 2018

work page 2018

[6] [6]

Square-root lasso: pivotal recovery of sparse signals via conic programming

Alexandre Belloni, Victor Chernozhukov, and Lie Wang. Square-root lasso: pivotal recovery of sparse signals via conic programming. Biometrika, 98(4):791--806, 2011

work page 2011

[7] [7]

Bickel, Ya'acov Ritov, and Alexandre B

Peter J. Bickel, Ya'acov Ritov, and Alexandre B. Tsybakov. Simultaneous analysis of lasso and Dantzig selector. The Annals of Statistics, 37(4):1705--1732, 2009

work page 2009

[8] [8]

Brown, James E

David B. Brown, James E. Smith, and Peng Sun. Information relaxations and duality in stochastic dynamic programs. Operations Research, 58(4, Part 1):785--801, 2010

work page 2010

[9] [9]

Statistics for High-Dimensional Data: Methods, Theory and Applications

Peter B\"uhlmann and Sara van de Geer. Statistics for High-Dimensional Data: Methods, Theory and Applications. Springer, 2011

work page 2011

[10] [10]

Minimal penalties for Gaussian model selection

Lucien Birg\'e and Pascal Massart. Minimal penalties for Gaussian model selection. Probability Theory and Related Fields, 138(1--2):33--73, 2007

work page 2007

[11] [11]

Majorizing measures, sequential complexities, and online learning

Adam Block, Yuval Dagan, and Alexander Rakhlin. Majorizing measures, sequential complexities, and online learning. In Proceedings of the 34th Annual Conference on Learning Theory, vol. 134, pages 1--4, 2021

work page 2021

[12] [12]

Convex set functions in \(d\)-space

Christer Borell. Convex set functions in \(d\)-space. Periodica Mathematica Hungarica, 6:111--136, 1975

work page 1975

[13] [13]

Herm Jan Brascamp and Elliott H. Lieb. On extensions of the Brunn--Minkowski and Prekopa--Leindler theorems, including inequalities for log-concave functions, and with an application to the diffusion equation. Journal of Functional Analysis, 22(4):366--389, 1976

work page 1976

[14] [14]

Brydges and Gordon Slade

David C. Brydges and Gordon Slade. A renormalisation group method. V. A single renormalisation group step. Journal of Statistical Physics, 159:589--667, 2015

work page 2015

[15] [15]

Yang-Mills for probabilists

Sourav Chatterjee. Yang--Mills for probabilists. arXiv:1803.01950, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[16] [16]

Foster, Yanjun Han, Jian Qian, Alexander Rakhlin, and Yunbei Xu

Fan Chen, Dylan J. Foster, Yanjun Han, Jian Qian, Alexander Rakhlin, and Yunbei Xu. Assouad, Fano, and Le Cam with interaction: A unifying lower bound framework and characterization for bandit learnability. arXiv:2410.05117, 2024

work page arXiv 2024

[17] [17]

Cover and Peter E

Thomas M. Cover and Peter E. Hart. Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1):21--27, 1967

work page 1967

[18] [18]

Approximation by superpositions of a sigmoidal function

George Cybenko. Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals and Systems, 2:303--314, 1989

work page 1989

[19] [19]

Lee, and Yuval Peres

Jian Ding, James R. Lee, and Yuval Peres. Cover times, blanket times, and majorizing measures. Annals of Mathematics, 175(3):1409--1471, 2012

work page 2012

[20] [20]

E. B. Dynkin. Local times and quantum fields. In Seminar on Stochastic Processes, 1983. Birkh\"auser, 1984

work page 1983

[21] [21]

S., Kakade, S

Du, S. S., Kakade, S. M., Lee, J. D., Lovett, S., Mahajan, G., Sun, W., and Wang, R. (2021). Bilinear classes: A structural framework for provable generalization in RL. In Proceedings of the 38th International Conference on Machine Learning, PMLR 139:2826--2836

work page 2021

[22] [22]

R\'egularit\'e des trajectoires des fonctions al\'eatoires gaussiennes

Xavier Fernique. R\'egularit\'e des trajectoires des fonctions al\'eatoires gaussiennes. In \'Ecole d'\'Et\'e de Probabilit\'es de Saint-Flour IV--1974, Lecture Notes in Mathematics, vol. 480. Springer, 1975

work page 1974

[23] [23]

The statistical complexity of interactive decision making,

Dylan J. Foster, Sham M. Kakade, Jian Qian, and Alexander Rakhlin. The statistical complexity of interactive decision making. arXiv:2112.13487, 2021

work page arXiv 2021

[24] [24]

Foster, Noah Golowich, and Yanjun Han

Dylan J. Foster, Noah Golowich, and Yanjun Han. Tight guarantees for interactive decision making with the decision-estimation coefficient. In Proceedings of the 36th Annual Conference on Learning Theory, vol. 195, pages 1--75, 2023

work page 2023

[25] [25]

Statistical Learning with Sparsity: The Lasso and Generalizations

Trevor Hastie, Robert Tibshirani, and Martin Wainwright. Statistical Learning with Sparsity: The Lasso and Generalizations. Chapman and Hall/CRC, 2015

work page 2015

[26] [26]

Multilayer feedforward networks are universal approximators

Kurt Hornik, Maxwell Stinchcombe, and Halbert White. Multilayer feedforward networks are universal approximators. Neural Networks, 2(5):359--366, 1989

work page 1989

[27] [27]

Quantum Yang--Mills theory

Arthur Jaffe and Edward Witten. Quantum Yang--Mills theory. Clay Mathematics Institute Millennium Prize Problem description, 2000

work page 2000

[28] [28]

Isoperimetric problems for convex bodies and a localization lemma

Ravi Kannan, Laszlo Lovasz, and Miklos Simonovits. Isoperimetric problems for convex bodies and a localization lemma. Discrete & Computational Geometry, 13:541--559, 1995

work page 1995

[29] [29]

Pointwise Generalization in Deep Neural Networks

Shaojie Li and Yunbei Xu. Pointwise generalization in deep neural networks. arXiv:2605.18598, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[30] [30]

Probability in Banach Spaces: Isoperimetry and Processes

Michel Ledoux and Michel Talagrand. Probability in Banach Spaces: Isoperimetry and Processes. Springer, 1991

work page 1991

[31] [31]

Leonid A. Levin. Average case complete problems. SIAM Journal on Computing, 15(1):285--286, 1986

work page 1986

[32] [32]

The geometry of logconcave functions and sampling algorithms

Laszlo Lovasz and Santosh Vempala. The geometry of logconcave functions and sampling algorithms. Random Structures & Algorithms, 30(3):307--358, 2007

work page 2007

[33] [33]

A Note on Pointwise Dimensions

Neil Lutz. A note on pointwise dimensions. arXiv:1612.05849, 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016

[34] [34]

On the power of adaptivity for \( \)-best arm identification in linear bandits

Arnab Maiti, Yunbei Xu, and Kevin Jamieson. On the power of adaptivity for \( \)-best arm identification in linear bandits. In Proceedings of the 39th Annual Conference on Learning Theory, vol. 336, pages 4911--4968, 2026

work page 2026

[35] [35]

Negahban, Pradeep Ravikumar, Martin J

Sahand N. Negahban, Pradeep Ravikumar, Martin J. Wainwright, and Bin Yu. A unified framework for high-dimensional analysis of \(M\)-estimators with decomposable regularizers. Statistical Science, 27(4):538--557, 2012

work page 2012

[36] [36]

The duality proof of the Sudakov minoration

David Pollard. The duality proof of the Sudakov minoration. Lecture notes, Yale University, available at http://www.stat.yale.edu/ pollard/Notes/Sudakov.pdf, 2001

work page 2001

[37] [37]

Wainwright, and Bin Yu

Garvesh Raskutti, Martin J. Wainwright, and Bin Yu. Minimax rates of estimation for high-dimensional linear regression over \( _q\)-balls. IEEE Transactions on Information Theory, 57(10):6976--6994, 2011

work page 2011

[38] [38]

Spielman and Shang-Hua Teng

Daniel A. Spielman and Shang-Hua Teng. Smoothed analysis of algorithms: Why the simplex algorithm usually takes polynomial time. Journal of the ACM, 51(3):385--463, 2004

work page 2004

[39] [39]

The Generic Chaining: Upper and Lower Bounds of Stochastic Processes

Michel Talagrand. The Generic Chaining: Upper and Lower Bounds of Stochastic Processes. Springer, 2005

work page 2005

[40] [40]

Regularity of Gaussian processes

Michel Talagrand. Regularity of Gaussian processes. Acta Mathematica, 159:99--149, 1987

work page 1987

[41] [41]

Regression shrinkage and selection via the lasso

Robert Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1):267--288, 1996

work page 1996

[42] [42]

Chaining, interpolation and convexity II: The contraction principle

Ramon van Handel. Chaining, interpolation and convexity II: The contraction principle. The Annals of Probability, 46(3):1764--1805, 2018

work page 2018

[43] [43]

Wainwright

Martin J. Wainwright. High-Dimensional Statistics: A Non-Asymptotic Viewpoint. Cambridge University Press, 2019

work page 2019

[44] [44]

Towards optimal problem dependent generalization error bounds in statistical learning theory

Yunbei Xu and Assaf Zeevi. Towards optimal problem dependent generalization error bounds in statistical learning theory. Mathematics of Operations Research, 2025

work page 2025

[45] [45]

Bayesian design principles for frequentist sequential learning

Yunbei Xu and Assaf Zeevi. Bayesian design principles for frequentist sequential learning. Journal of the ACM, 2025

work page 2025

[46] [46]

Andrew C. Yao. Probabilistic computations: Toward a unified measure of complexity. In Proceedings of the 18th Annual Symposium on Foundations of Computer Science, pages 222--227, 1977

work page 1977

[47] [47]

A Bayesian Proof and Interpretation of Talagrand's Majorizing Measure Theorem

Ilias Zadik. A Bayesian proof and interpretation of Talagrand's majorizing measure theorem. arXiv preprint arXiv:2605.30321, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[48] [48]

Information theoretical upper and lower bounds for statistical estimation

Tong Zhang. Information theoretical upper and lower bounds for statistical estimation. IEEE Transactions on Information Theory, 52(4):1307--1321, 2006

work page 2006