Mean Dimension of Ridge Functions
Pith reviewed 2026-05-25 11:23 UTC · model grok-4.3
The pith
Ridge functions of high-dimensional Gaussians keep a bounded mean dimension when Lipschitz, but when they are discontinuous and lack sparsity their mean dimension can grow like the square root of the dimension.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
If the ridge function is Lipschitz continuous, then the mean dimension remains bounded as d→∞. If instead the ridge function is discontinuous, then the mean dimension depends on a measure of the ridge function's sparsity, and absent sparsity the mean dimension can grow proportionally to √d. Preintegrating a ridge function yields a new, potentially much smoother ridge function; if one of the ridge coefficients is bounded away from zero as d→∞, then preintegration can reduce the mean dimension from O(√d) to O(1).
What carries the argument
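Mean dimension of ridge functions f(x) = g(a · x), computed with respect to the spherical Gaussian measure on the input x. It is the variance-weighted average order of the ANOVA interactions of f, equivalently the sum over coordinates of the total Sobol' indices divided by the variance, so it measures how many inputs effectively interact rather than the sensitivity to any single coordinate.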
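For readers outside the subfield, the standard definition can be written out as follows. This is a sketch in notation chosen here (the symbols \nu, \sigma_u^2 and \bar\tau_j^2 are labels for this note and may differ from the paper's), assuming independent inputs and 0 < \sigma^2 < \infty.

```latex
% ANOVA decomposition: f = \sum_{u \subseteq \{1,\dots,d\}} f_u,
% with variance components \sigma_u^2 = \mathrm{Var}(f_u) and \sigma^2 = \mathrm{Var}(f).
\[
  \sigma^2 = \sum_{u \neq \emptyset} \sigma_u^2,
  \qquad
  \nu(f) = \frac{1}{\sigma^2} \sum_{u \neq \emptyset} |u|\,\sigma_u^2
         = \frac{1}{\sigma^2} \sum_{j=1}^{d} \bar\tau_j^2 ,
\]
\[
  \bar\tau_j^2 = \sum_{u \ni j} \sigma_u^2
  = \tfrac{1}{2}\, E\Bigl[\bigl(f(x) - f(x_{-j}{:}z_j)\bigr)^2\Bigr].
\]
% x_{-j}:z_j denotes x with coordinate j replaced by an independent copy z_j.
```

With this definition 1 ≤ \nu(f) ≤ d, and \nu(f) = 1 exactly when f is additive, which is why a bounded value as d grows signals low effective dimension.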
If this is right
- Lipschitz ridge functions have mean dimension that does not grow with dimension d.
- Discontinuous ridge functions without sparsity in the direction vector can have mean dimension that grows proportionally to sqrt(d).
- A sparsity measure of the ridge coefficients governs the mean dimension when the ridge function is discontinuous.
- Preintegration reduces the mean dimension of certain discontinuous ridge functions from O(sqrt(d)) to O(1) when one coefficient remains bounded away from zero.
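The preintegration point can be made concrete for a single jump. The following is a worked sketch, assuming the ridge function is the indicator f(x) = 1{a · x > t} with a_1 > 0 and that the preintegrated coordinate is x_1; the threshold t and the choice of coordinate are illustrative assumptions, not taken from the summary above.

```latex
\[
  \tilde f(x_2,\dots,x_d)
  = E\bigl[\mathbf{1}\{a_1 x_1 + s > t\} \,\big|\, x_2,\dots,x_d\bigr]
  = \Phi\!\Bigl(\frac{s - t}{a_1}\Bigr),
  \qquad s = \sum_{j=2}^{d} a_j x_j .
\]
% \tilde f is again a ridge function, now of (x_2,...,x_d) with direction (a_2,...,a_d).
\[
  \Bigl|\frac{\partial}{\partial s}\,\Phi\!\Bigl(\frac{s-t}{a_1}\Bigr)\Bigr|
  = \frac{1}{a_1}\,\varphi\!\Bigl(\frac{s-t}{a_1}\Bigr)
  \le \frac{1}{a_1\sqrt{2\pi}} .
\]
```

Under these assumptions the jump becomes a smooth function of s whose Lipschitz constant is at most 1/(a_1 sqrt(2 pi)). That constant stays bounded as d grows only if a_1 is bounded away from zero, which matches the condition attached to the O(1) claim.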
Where Pith is reading between the lines
- The same preintegration step might keep mean dimension low for other discontinuous functions that are not exactly ridge functions.
- If the ridge direction becomes sparser with growing d, even discontinuous ridge functions could retain bounded mean dimension.
- Mean dimension calculations under different input measures could produce different scaling behaviors for the same ridge functions.
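The conjecture about sparser directions can be probed for one specific jump, g(t) = sign(t). The sketch below relies on a closed form derived for this note, not quoted from the paper: for a unit vector a, resampling coordinate k leaves a · x standard normal with correlation 1 - a_k^2 to its old value, so the sign flips with probability arccos(1 - a_k^2) / pi. The function name `sign_ridge_mean_dimension` and the geometric coefficient sequence are hypothetical choices for illustration.

```python
import numpy as np

def sign_ridge_mean_dimension(a):
    """Mean dimension of f(x) = sign(a . x) for x ~ N(0, I), via a closed form.

    Assumption (derived here, not taken from the paper): with a normalised to
    unit length, P(sign flips when x_k is resampled) = arccos(1 - a_k^2) / pi,
    and Var(f) = 1.
    """
    a = np.asarray(a, dtype=float)
    a = a / np.linalg.norm(a)              # normalise to a unit direction
    rho = np.clip(1.0 - a**2, -1.0, 1.0)   # correlation of a.x before/after resampling x_k
    tau2 = 2.0 * np.arccos(rho) / np.pi    # 0.5 * E[(f - f_k)^2] = 2 * P(flip)
    return tau2.sum()                      # Var(f) = 1, so no further division

if __name__ == "__main__":
    for d in (10, 100, 1000):
        uniform = np.ones(d)                     # no sparsity: all coefficients equal
        geometric = 0.5 ** np.arange(1, d + 1)   # hypothetical rapidly decaying coefficients
        print(d,
              round(sign_ridge_mean_dimension(uniform), 3),
              round(sign_ridge_mean_dimension(geometric), 3))
```

In this special case the uniform direction gives values that keep growing with d, while the geometrically decaying direction levels off, which is consistent with the conjecture but does not establish it for other discontinuous profiles.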
Load-bearing premise
The scaling results assume the standard definition of mean dimension under the spherical Gaussian measure and that the function depends exactly on one fixed linear combination of the inputs.
What would settle it
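Compute the mean dimension explicitly for the step-function ridge g(t) = sign(t) with uniform ridge coefficients a_i = 1/sqrt(d) and check whether it grows like sqrt(d) as d increases. For the preintegration claim, take a direction in which one coefficient stays bounded away from zero (uniform coefficients do not satisfy this), preintegrate over that coordinate, and check whether the mean dimension stays bounded.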
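The growth check for the step function can be sketched with plain Monte Carlo. This is a minimal sketch, not the paper's procedure: it assumes the resampling identity tau_k^2 = 0.5 E[(f(x) - f(x with x_k resampled))^2] for total Sobol' indices, and it adds tanh as a Lipschitz profile purely for contrast. The function name `mean_dimension_mc`, the sample size and the seed are hypothetical.

```python
import numpy as np

def mean_dimension_mc(g, a, n=40_000, seed=0):
    """Monte Carlo estimate of the mean dimension of f(x) = g(a . x), x ~ N(0, I_d).

    Uses tau_k^2 = 0.5 * E[(f(x) - f(x with x_k resampled))^2] and
    mean dimension = sum_k tau_k^2 / Var(f).
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    x = rng.standard_normal((n, a.size))
    t = x @ a                                   # ridge variable a . x
    fx = g(t)
    total = 0.0
    for k in range(a.size):
        z_k = rng.standard_normal(n)            # independent copy of coordinate k
        f_k = g(t + a[k] * (z_k - x[:, k]))     # a . x after swapping x_k for z_k
        total += 0.5 * np.mean((fx - f_k) ** 2)
    return total / fx.var()

if __name__ == "__main__":
    for d in (4, 16, 64):
        a = np.full(d, 1.0 / np.sqrt(d))        # uniform coefficients a_i = 1/sqrt(d)
        jump = mean_dimension_mc(np.sign, a)    # discontinuous profile
        smooth = mean_dimension_mc(np.tanh, a)  # Lipschitz profile, for contrast only
        print(f"d={d:3d}  sign: {jump:6.3f}  "
              f"sign/sqrt(d): {jump / np.sqrt(d):.3f}  tanh: {smooth:.3f}")
```

If the claim holds, the sign column divided by sqrt(d) should settle near a constant while the tanh column stays close to a fixed value. Only the ridge variable is updated when a coordinate is resampled, so each coordinate costs one vector operation rather than a fresh d-dimensional evaluation.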
Original abstract
We consider the mean dimension of some ridge functions of spherical Gaussian random vectors of dimension $d$. If the ridge function is Lipschitz continuous, then the mean dimension remains bounded as $d\to\infty$. If instead, the ridge function is discontinuous, then the mean dimension depends on a measure of the ridge function's sparsity, and absent sparsity the mean dimension can grow proportionally to $\sqrt{d}$. Preintegrating a ridge function yields a new, potentially much smoother ridge function. We include an example where, if one of the ridge coefficients is bounded away from zero as $d\to\infty$, then preintegration can reduce the mean dimension from $O(\sqrt{d})$ to $O(1)$.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript analyzes the mean dimension of exact ridge functions f(x) = g(a · x) for x drawn from the standard Gaussian measure on R^d. It establishes that the mean dimension remains O(1) as d → ∞ whenever g is Lipschitz continuous. When g is discontinuous, the mean dimension is governed by a sparsity measure on the ridge coefficients a and, without sparsity, can grow proportionally to √d. The authors further give an example in which preintegration produces a new ridge function whose mean dimension is O(1), provided at least one coordinate of a stays bounded away from zero.
Significance. The results supply precise asymptotic control on effective dimension for a widely used function class under the isotropic Gaussian measure. The Lipschitz and preintegration statements are parameter-free and rest on the standard definition of mean dimension, which enhances their utility for quasi-Monte Carlo methods and dimension-reduction techniques. The sparsity-dependent scaling for discontinuous ridges supplies a concrete, falsifiable prediction that can be checked numerically.
minor comments (3)
- [§2] The definition of mean dimension is invoked without recalling its integral expression; adding the formula (even if standard) would improve readability for readers outside the immediate subfield.
- [Theorem 4.1] The statement that preintegration yields a ridge function with bounded mean dimension would benefit from an explicit statement of the new ridge direction after integration.
- Notation: the vector a is sometimes written in bold and sometimes not; consistent vector notation throughout would reduce minor confusion.
Simulated Author's Rebuttal
We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. No specific major comments were provided in the report.
Circularity Check
No significant circularity
full rationale
The paper derives asymptotic bounds on mean dimension for exact ridge functions f(x)=g(a·x) under the spherical Gaussian measure, using the standard definition of mean dimension. The Lipschitz case yields bounded mean dimension, the discontinuous case yields √d growth without sparsity, and preintegration reduces scaling when a coordinate of a is bounded away from zero. These are direct consequences of the ridge structure, the measure, and the mean-dimension definition; no equations reduce to self-definition, no fitted parameters are renamed as predictions, and no load-bearing claims rest on self-citations. The derivation chain is self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Mean dimension is defined via the standard ANOVA decomposition or Sobol' indices with respect to the spherical Gaussian measure.
- domain assumption: Ridge functions are exactly functions of one linear combination of the coordinates.
Appendix. 8.1. Upper bound for jumps. Proof. Here we prove Theorem 4.1. If $\theta_k = 0$ then $\tau_k^2 = 0$ too. We may suppose that any such $x_k$ have been removed from the model. Then $\tau_k^2 = \frac{1}{2} E\big((1\{y+x>t\} - 1\{y+z>t\})^2\big) = \frac{1}{2} E\big(|1\{y+x>t\} - 1\{y+z>t\}|\big)$ where $y \sim N(0, 1-\theta_k^2)$ and $x, z \sim N(0, \theta_k^2)$ are all independent. Next, for any $\epsilon > 0$, $2\tau_k^2 \leqslant \Pr(|y+x-t| <$ ...