pith. sign in

arxiv: 1906.09255 · v1 · pith:5VRSMBX5new · submitted 2019-06-21 · 📊 stat.ML · cs.IT· cs.LG· math.IT· math.ST· stat.TH

Max-Affine Regression: Provable, Tractable, and Near-Optimal Statistical Estimation

Pith reviewed 2026-05-25 18:24 UTC · model grok-4.3

classification 📊 stat.ML cs.ITcs.LGmath.ITmath.STstat.TH
keywords max-affine regressionalternating minimizationstatistical estimationphase retrievalconvex regressionhigh-dimensional statisticsnon-convex optimizationspectral initialization
0
0 comments X

The pith

Alternating minimization with a spectral initializer recovers the coefficients of a max-affine regression model at near-parametric rates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies estimation of a regression function that is the pointwise maximum of k fixed affine functions, a model that includes linear regression and phase retrieval as special cases. It analyzes a non-convex least-squares formulation solved by alternating minimization, showing that a spectral-plus-random-search initializer places the iterates inside a basin where the algorithm converges geometrically fast to a small ball around the true coefficients. The resulting error rates scale near-optimally with dimension, sample size, and noise variance under random sub-Gaussian designs, offering a concrete algorithmic route to avoid the dimensionality curse that appears in unrestricted convex regression.

Core claim

When the design is random and sub-Gaussian and k is fixed, the alternating minimization algorithm, started from a spectral method combined with random search in a low-dimensional space, converges with high probability at a geometric rate to a small neighborhood of the optimal coefficients and attains a near-parametric, minimax-optimal estimation rate up to poly-log factors.

What carries the argument

Alternating minimization on the non-convex least-squares objective, initialized by a spectral method plus random search over a low-dimensional subspace.

If this is right

  • The same procedure yields new guarantees for classical phase retrieval under weaker assumptions on the measurement distribution than previously required.
  • The approach supplies an explicit, implementable way to regularize convex-regression-type problems and thereby escape the curse of dimensionality.
  • Estimation remains tractable and statistically efficient even when the unknown function is piecewise affine with a constant number of pieces.
  • The analysis directly produces high-probability bounds that depend on dimension, sample size, and noise variance in the expected minimax-optimal way.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The initializer construction may extend to other non-convex regression problems that admit a low-dimensional spectral signal.
  • The geometric convergence result suggests that similar alternating schemes could be analyzed for piecewise-linear models with slowly growing k.
  • The near-optimal rates indicate that max-affine regression can serve as a computationally friendly surrogate for general convex regression in moderate dimensions.

Load-bearing premise

The covariates follow a random sub-Gaussian design and the number of affine pieces k is treated as a fixed constant; the geometric convergence further requires the initializer to land inside the basin of attraction of the global optimum.

What would settle it

A dataset drawn from the stated random design in which the alternating-minimization iterates either fail to contract geometrically or produce estimation error that exceeds the claimed near-parametric bound by more than a poly-log factor in the dimension.

Figures

Figures reproduced from arXiv: 1906.09255 by Adityanand Guntuboyina, Ashwin Pananjady, Avishek Ghosh, Kannan Ramchandran.

Figure 1
Figure 1. Figure 1: Convergence of the AM with Gaussian covariates—in [PITH_FULL_IMAGE:figures/full_fig_p013_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Convergence of AM when the covariates are drawn i.i [PITH_FULL_IMAGE:figures/full_fig_p016_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Simulation of the PCA and overall guarantees for Ga [PITH_FULL_IMAGE:figures/full_fig_p020_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Convergence of the alternating minimization in th [PITH_FULL_IMAGE:figures/full_fig_p083_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: AM with Rademacher and √ 1 2.4 (Bin(10, 0.4) − 4) covariates in log-log scale— the best-fit line to plot (a) has slope 1.94 and the best-fit line to plot (b) has slope 1.96, and hence the sample complexity scales linearly with k but quadratically with dimension d. Theorem 5. Suppose that the covariates xi are drawn i.i.d. from a standard Gaussian distribution, and that n ≥ Cσ2 k 2 πmin . Then the parameter… view at source ↗
read the original abstract

Max-affine regression refers to a model where the unknown regression function is modeled as a maximum of $k$ unknown affine functions for a fixed $k \geq 1$. This generalizes linear regression and (real) phase retrieval, and is closely related to convex regression. Working within a non-asymptotic framework, we study this problem in the high-dimensional setting assuming that $k$ is a fixed constant, and focus on estimation of the unknown coefficients of the affine functions underlying the model. We analyze a natural alternating minimization (AM) algorithm for the non-convex least squares objective when the design is random. We show that the AM algorithm, when initialized suitably, converges with high probability and at a geometric rate to a small ball around the optimal coefficients. In order to initialize the algorithm, we propose and analyze a combination of a spectral method and a random search scheme in a low-dimensional space, which may be of independent interest. The final rate that we obtain is near-parametric and minimax optimal (up to a poly-logarithmic factor) as a function of the dimension, sample size, and noise variance. In that sense, our approach should be viewed as a direct and implementable method of enforcing regularization to alleviate the curse of dimensionality in problems of the convex regression type. As a by-product of our analysis, we also obtain guarantees on a classical algorithm for the phase retrieval problem under considerably weaker assumptions on the design distribution than was previously known. Numerical experiments illustrate the sharpness of our bounds in the various problem parameters.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper studies max-affine regression (regression function is the pointwise maximum of k fixed affine functions) in the high-dimensional regime under random design. It analyzes a natural alternating minimization (AM) algorithm on the non-convex least-squares objective and shows that, when initialized by a combination of a spectral method and random search in a low-dimensional space, the iterates converge geometrically with high probability to a small ball around the global optimum. The resulting estimator achieves near-parametric, minimax-optimal rates (up to poly-log factors) in dimension, sample size, and noise variance. A byproduct is improved guarantees for phase retrieval under weaker design assumptions.

Significance. If the claims hold, the work supplies a direct, implementable algorithm with non-asymptotic guarantees for a natural non-convex generalization of linear regression that includes real phase retrieval and is closely related to convex regression. The near-optimal rates for fixed k demonstrate that the method enforces regularization sufficient to escape the curse of dimensionality. The phase-retrieval corollary under weaker assumptions is a concrete additional contribution.

major comments (2)
  1. [§4 and Theorem 1] §4 (initializer analysis) and the statement of Theorem 1: the claimed high-probability success of the spectral-plus-random-search initializer rests on a covering or union-bound argument whose failure probability is controlled only up to poly(k) factors. Because the random-search component operates in a space whose effective dimension scales with the ambient dimension d of each affine piece, the overall success probability can fall below 1-o(1) as d grows even for fixed k. This directly undermines the geometric-convergence guarantee for the subsequent AM iterates, which is load-bearing for the central claim.
  2. [Theorem 2] Theorem 2 (final statistical rate): the near-minimax optimality is stated to hold up to a poly-logarithmic factor, but the precise dependence of the poly-log term on k and the failure probability of the initializer is not made explicit; without this, it is impossible to verify that the rate remains near-optimal once the initializer success probability is properly bounded.
minor comments (2)
  1. [Eq. (1)] Notation for the max-affine model (Eq. (1) or (2)) should explicitly state whether the k affine pieces share a common intercept or are allowed distinct intercepts; the current presentation leaves this ambiguous for readers.
  2. [Numerical experiments] The numerical experiments section would benefit from a plot or table that isolates the effect of the random-search radius on initialization success probability as d increases.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on the initializer analysis and statistical rates. We address the two major comments point by point below and will revise the manuscript to incorporate explicit probability bounds and dependence on k.

read point-by-point responses
  1. Referee: [§4 and Theorem 1] §4 (initializer analysis) and the statement of Theorem 1: the claimed high-probability success of the spectral-plus-random-search initializer rests on a covering or union-bound argument whose failure probability is controlled only up to poly(k) factors. Because the random-search component operates in a space whose effective dimension scales with the ambient dimension d of each affine piece, the overall success probability can fall below 1-o(1) as d grows even for fixed k. This directly undermines the geometric-convergence guarantee for the subsequent AM iterates, which is load-bearing for the central claim.

    Authors: We thank the referee for highlighting the need for explicit control on the failure probability. The random-search component is performed in a low-dimensional space whose dimension depends only on k (specifically, a net over a k-dimensional parameter space after the spectral step reduces the problem), not on the ambient d. The union bound therefore incurs only a poly(k) factor, and the per-point success probability of the random search yields an overall initializer success probability of 1 - O(poly(k) exp(-c d)) for a positive constant c independent of k and d. For any fixed k this tends to 1 as d grows, preserving the high-probability geometric convergence of the subsequent AM iterates. We will revise §4 and Theorem 1 to state this bound explicitly. revision: yes

  2. Referee: [Theorem 2] Theorem 2 (final statistical rate): the near-minimax optimality is stated to hold up to a poly-logarithmic factor, but the precise dependence of the poly-log term on k and the failure probability of the initializer is not made explicit; without this, it is impossible to verify that the rate remains near-optimal once the initializer success probability is properly bounded.

    Authors: We agree that the dependence should be stated explicitly. The poly-logarithmic factor in Theorem 2 arises from the covering numbers and concentration inequalities and is of the form (log(d n / δ))^{O(1)}, where δ is the initializer failure probability. Because δ = O(poly(k) exp(-c d)) with k fixed, the extra factors remain polylogarithmic in d and n (with polynomial dependence on k only). The resulting rate is therefore still near-minimax optimal up to these explicit factors. We will revise Theorem 2 and the surrounding text to display the precise dependence on k and δ. revision: yes

Circularity Check

0 steps flagged

No circularity: direct non-asymptotic analysis of explicit algorithm under random-design assumptions

full rationale

The paper states a direct analysis of the alternating minimization algorithm (initialized via spectral method plus random search in low-dimensional space) under random sub-Gaussian design with fixed k. Convergence at geometric rate to a small ball around the optimum, and the final near-parametric minimax rate (up to poly-log factors), are derived from the stated assumptions without any reduction of the claimed quantities to fitted parameters, self-defined objects, or load-bearing self-citations. The initializer success is part of the analysis rather than presupposed by the target result. No equations or steps reduce the prediction to the input by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, invented entities, or non-standard axioms can be identified beyond the modeling assumptions stated in the abstract.

axioms (2)
  • domain assumption Design matrix entries are drawn from a random distribution (sub-Gaussian or comparable).
    Explicitly invoked for the convergence analysis of the alternating-minimization iterates.
  • domain assumption k is a fixed constant independent of dimension and sample size.
    Stated as the regime in which the near-parametric rates are derived.

pith-pipeline@v0.9.0 · 5839 in / 1472 out tokens · 35601 ms · 2026-05-25T18:24:34.859110+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Locally Near Optimal Piecewise Linear Regression in High Dimensions via Difference of Max-Affine Functions

    stat.ML 2026-05 unverdicted novelty 7.0

    ABGD parametrizes piecewise linear functions as difference of max-affine functions and converges linearly to an epsilon-accurate solution with O(d max(sigma/epsilon,1)^2) samples under sub-Gaussian noise, which is min...

  2. Expectation Maximization (EM) Converges for General Agnostic Mixtures

    cs.LG 2026-04 conditional novelty 7.0

    Gradient EM converges exponentially to optimal population loss minimizers for agnostic fitting of k parametric functions under strong convexity and smoothness of the loss, proper initialization, and separation conditions.

Reference graph

Works this paper leans on

91 extracted references · 91 canonical work pages · cited by 2 Pith papers · 8 internal anchors

  1. [1]

    Convex regression: theory, practice, and applications

    G \'a bor Bal \'a zs. Convex regression: theory, practice, and applications . PhD thesis, University of Alberta, 2016

  2. [2]

    Slice inverse regression with score functions

    Dmitry Babichev and Francis Bach. Slice inverse regression with score functions. Electronic Journal of Statistics , 12(1):1507--1543, 2018

  3. [3]

    Missing values in multivariate analysis

    Evelyn ML Beale and Roderick JA Little. Missing values in multivariate analysis. Journal of the Royal Statistical Society: Series B (Methodological) , 37(1):129--145, 1975

  4. [4]

    -entropy of convex sets and functions

    EM Bronshtein. -entropy of convex sets and functions. Siberian Mathematical Journal , 17(3):393--398, 1976

  5. [5]

    The truncated normal distribution

    John Burkardt. The truncated normal distribution. 2014

  6. [6]

    Statistical guarantees for the EM algorithm: From population to sample-based analysis

    Sivaraman Balakrishnan, Martin J Wainwright, and Bin Yu. Statistical guarantees for the EM algorithm: From population to sample-based analysis. The Annals of Statistics , 45(1):77--120, 2017

  7. [7]

    Solving random quadratic systems of equations is nearly as easy as solving linear systems

    Yuxin Chen and Emmanuel Candes. Solving random quadratic systems of equations is nearly as easy as solving linear systems. In Advances in Neural Information Processing Systems , pages 739--747, 2015

  8. [8]

    On EM algorithms and their proximal generalizations

    St \'e phane Chr \'e tien and Alfred O Hero. On EM algorithms and their proximal generalizations. ESAIM: Probability and Statistics , 12:308--326, 2008

  9. [9]

    Optimal Bounds on the VC-dimension

    Monika Csikos, Andrey Kupavskii, and Nabil H Mustafa. Optimal bounds on the VC -dimension. arXiv preprint arXiv:1807.07924 , 2018

  10. [10]

    Spectral experts for estimating mixtures of linear regressions

    Arun Tejasvi Chaganty and Percy Liang. Spectral experts for estimating mixtures of linear regressions. In International Conference on Machine Learning , pages 1040--1048, 2013

  11. [11]

    Optimal rates of convergence for noisy sparse phase retrieval via thresholded W irtinger flow

    T Tony Cai, Xiaodong Li, and Zongming Ma. Optimal rates of convergence for noisy sparse phase retrieval via thresholded W irtinger flow. The Annals of Statistics , 44(5):2221--2251, 2016

  12. [12]

    Phase retrieval via wirtinger flow: Theory and algorithms

    Emmanuel J Candes, Xiaodong Li, and Mahdi Soltanolkotabi. Phase retrieval via wirtinger flow: Theory and algorithms. IEEE Transactions on Information Theory , 61(4):1985--2007, 2015

  13. [13]

    Array imaging using intensity-only measurements

    Anwei Chai, Miguel Moscoso, and George Papanicolaou. Array imaging using intensity-only measurements. Inverse Problems , 27(1):015005, 2010

  14. [14]

    Distributional and l^q norm inequalities for polynomials over convex bodies in R ^n

    Anthony Carbery and James Wright. Distributional and l^q norm inequalities for polynomials over convex bodies in R ^n . Mathematical research letters , 8(3):233--248, 2001

  15. [15]

    Solving (most) of a set of quadratic equalities: Composite optimization for robust phase retrieval

    John C Duchi and Feng Ruan. Solving (most) of a set of quadratic equalities: Composite optimization for robust phase retrieval. arXiv preprint arXiv:1705.02356 , 2017

  16. [16]

    Multiclass learning approaches: A theoretical comparison with implications

    Amit Daniely, Sivan Sabato, and Shai S Shwartz. Multiclass learning approaches: A theoretical comparison with implications. In Advances in Neural Information Processing Systems , pages 485--493, 2012

  17. [17]

    Ten steps of EM suffice for mixtures of two G aussians

    Constantinos Daskalakis, Christos Tzamos, and Manolis Zampetakis. Ten steps of EM suffice for mixtures of two G aussians. In 30th Annual Conference on Learning Theory , 2017

  18. [18]

    Phase retrieval: Stability and recovery guarantees

    Yonina C Eldar and Shahar Mendelson. Phase retrieval: Stability and recovery guarantees. Applied and Computational Harmonic Analysis , 36(3):473--494, 2014

  19. [19]

    Phase retrieval and image reconstruction for astronomy

    C Fienup and J Dainty. Phase retrieval and image reconstruction for astronomy. Image Recovery: Theory and Application , 231:275, 1987

  20. [20]

    Phase retrieval algorithms: a comparison

    James R Fienup. Phase retrieval algorithms: a comparison. Applied optics , 21(15):2758--2769, 1982

  21. [21]

    Phase retrieval from very few measurements

    Matthew Fickus, Dustin G Mixon, Aaron A Nelson, and Yang Wang. Phase retrieval from very few measurements. Linear Algebra and its Applications , 449:475--499, 2014

  22. [22]

    Phase retrieval for imaging problems

    Fajwel Fogel, Ir \`e ne Waldspurger, and Alexandre d'Aspremont. Phase retrieval for imaging problems. Mathematical programming computation , 8(3):311--335, 2016

  23. [23]

    Richard J. Gardner. Geometric Tomography . Encyclopedia of Mathematics and its Applications. Cambridge University Press, 2 edition, 2006

  24. [24]

    Three-dimensional support function estimation and application for projection magnetic resonance imaging

    Jens Gregor and Fernando R Rannou. Three-dimensional support function estimation and application for projection magnetic resonance imaging. International journal of imaging systems and technology , 12(1):43--50, 2002

  25. [25]

    A practical algorithm for the determination of phase from image and diffraction plane pictures

    R Gerchberg and W Saxton. A practical algorithm for the determination of phase from image and diffraction plane pictures. Optik , 35:237--246, 1972

  26. [26]

    Covering numbers for convex functions

    Adityanand Guntuboyina and Bodhisattva Sen. Covering numbers for convex functions. IEEE Transactions on Information Theory , 59(4):1957--1965, 2013

  27. [27]

    Optimal rates of convergence for convex set estimation from support functions

    Adityanand Guntuboyina. Optimal rates of convergence for convex set estimation from support functions. The Annals of Statistics , 40(1):385--411, 2012

  28. [28]

    Entropy of convex functions on R ^d

    Fuchang Gao and Jon A Wellner. Entropy of convex functions on R ^d . Constructive approximation , 46(3):565--592, 2017

  29. [29]

    Maximum likelihood estimation from incomplete data

    Herman O Hartley. Maximum likelihood estimation from incomplete data. Biometrics , 14(2):174--194, 1958

  30. [30]

    Phase problem in crystallography

    Robert W Harrison. Phase problem in crystallography. JOSA a , 10(5):1046--1055, 1993

  31. [31]

    A convex/log-concave correlation inequality for G aussian measure and an application to abstract wiener spaces

    Gilles Harg \'e . A convex/log-concave correlation inequality for G aussian measure and an application to abstract wiener spaces. Probability theory and related fields , 130(3):415--440, 2004

  32. [32]

    Multivariate convex regression with adaptive partitioning

    Lauren A Hannah and David B Dunson. Multivariate convex regression with adaptive partitioning. The Journal of Machine Learning Research , 14(1):3261--3294, 2013

  33. [33]

    Learning mixtures of spherical G aussians: moment methods and spectral decompositions

    Daniel Hsu and Sham M Kakade. Learning mixtures of spherical G aussians: moment methods and spectral decompositions. In Proceedings of the 4th conference on Innovations in Theoretical Computer Science , pages 11--20. ACM, 2013

  34. [34]

    Semiparametric and nonparametric methods in econometrics , volume 12

    Joel L Horowitz. Semiparametric and nonparametric methods in econometrics , volume 12. Springer, 2009

  35. [35]

    It \^o -wiener chaos expansion with exact residual and correlation, variance inequalities

    Yaozhong Hu. It \^o -wiener chaos expansion with exact residual and correlation, variance inequalities. Journal of Theoretical Probability , 10(4):835--848, 1997

  36. [36]

    Multivariate convex regression: global risk bounds and adaptation

    Qiyang Han and Jon A Wellner. Multivariate convex regression: global risk bounds and adaptation. arXiv preprint arXiv:1601.06844 , 2016

  37. [37]

    Hierarchical mixtures of experts and the EM algorithm

    Michael I Jordan and Robert A Jacobs. Hierarchical mixtures of experts and the EM algorithm. Neural computation , 6(2):181--214, 1994

  38. [38]

    Adaptive mixtures of local experts

    Robert A Jacobs, Michael I Jordan, Steven J Nowlan, and Geoffrey E Hinton. Adaptive mixtures of local experts. Neural computation , 3(1):79--87, 1991

  39. [39]

    Limit distribution of the sum and maximum from multivariate G aussian sequences

    Barry James, Kang James, and Yongcheng Qi. Limit distribution of the sum and maximum from multivariate G aussian sequences. Journal of multivariate analysis , 98(3):517--532, 2007

  40. [40]

    Local maxima in the likelihood of G aussian mixture models: Structural results and algorithmic consequences

    Chi Jin, Yuchen Zhang, Sivaraman Balakrishnan, Martin J Wainwright, and Michael I Jordan. Local maxima in the likelihood of G aussian mixture models: Structural results and algorithmic consequences. In Advances in neural information processing systems , pages 4116--4124, 2016

  41. [41]

    Matrix completion from a few entries

    Raghunandan H Keshavan, Andrea Montanari, and Sewoong Oh. Matrix completion from a few entries. IEEE transactions on information theory , 56(6):2980--2998, 2010

  42. [42]

    Reduction of variance for G aussian densities via restriction to convex sets

    Marek Kanter and Harold Proppe. Reduction of variance for G aussian densities via restriction to convex sets. Journal of Multivariate Analysis , 7(1):74--81, 1977

  43. [43]

    Global Convergence of EM Algorithm for Mixtures of Two Component Linear Regression

    Jeongyeol Kwon, Wei Qian, Constantine Caramanis, Yudong Chen, and Damek Davis. Global convergence of EM algorithm for mixtures of two component linear regression. arXiv preprint arXiv:1810.05752v3 , 2018

  44. [44]

    The concentration of measure phenomenon

    Michel Ledoux. The concentration of measure phenomenon . Number 89. American Mathematical Soc., 2001

  45. [45]

    Consistency of multidimensional convex regression

    Eunji Lim and Peter W Glynn. Consistency of multidimensional convex regression. Operations Research , 60(1):196--208, 2012

  46. [46]

    Sliced inverse regression for dimension reduction

    Ker-Chau Li. Sliced inverse regression for dimension reduction. Journal of the American Statistical Association , 86(414):316--327, 1991

  47. [47]

    Siegel's formula via S tein's identities

    Jun S Liu. Siegel's formula via S tein's identities. Statistics & Probability Letters , 21(3):247--251, 1994

  48. [48]

    Minimax rate of convergence and the performance of ERM in phase recovery

    Guillaume Lecu \'e and Shahar Mendelson. Minimax rate of convergence and the performance of ERM in phase recovery. arXiv preprint arXiv:1311.5024 , 2013

  49. [49]

    Convex piecewise-linear fitting

    Alessandro Magnani and Stephen P Boyd. Convex piecewise-linear fitting. Optimization and Engineering , 10(1):1--17, 2009

  50. [50]

    A computational framework for multivariate convex regression and its variants

    Rahul Mazumder, Arkopal Choudhury, Garud Iyengar, and Bodhisattva Sen. A computational framework for multivariate convex regression and its variants. Journal of the American Statistical Association , 114(525):318--331, 2019

  51. [51]

    Learning theory and algorithms for revenue optimization in second price auctions with reserve

    Andres M Medina and Mehryar Mohri. Learning theory and algorithms for revenue optimization in second price auctions with reserve. In Proceedings of the 31st International Conference on Machine Learning (ICML-14) , pages 262--270, 2014

  52. [52]

    Learning simple auctions

    Jamie Morgenstern and Tim Roughgarden. Learning simple auctions. In Conference on Learning Theory , pages 1298--1318, 2016

  53. [53]

    Foundations of machine learning

    Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar. Foundations of machine learning . MIT press, 2018

  54. [54]

    Moments calculation for the double truncated multivariate normal density, 2009

    BG Manjunath and Stefan Wilhelm. Moments calculation for the double truncated multivariate normal density, 2009

  55. [55]

    Phase retrieval using alternating minimization

    Praneeth Netrapalli, Prateek Jain, and Sujay Sanghavi. Phase retrieval using alternating minimization. In Advances in Neural Information Processing Systems , pages 2796--2804, 2013

  56. [56]

    Small ball probability estimates for log-concave measures

    Grigoris Paouris. Small ball probability estimates for log-concave measures. Transactions of the American Mathematical Society , 364(1):287--308, 2012

  57. [57]

    Reconstructing convex sets from support line measurements

    Jerry Ladd Prince and Alan S Willsky. Reconstructing convex sets from support line measurements. IEEE Transactions on Pattern Analysis and Machine Intelligence , 12(4):377--389, 1990

  58. [58]

    The L ittlewood-- O fford problem and invertibility of random matrices

    Mark Rudelson and Roman Vershynin. The L ittlewood-- O fford problem and invertibility of random matrices. Advances in Mathematics , 218(2):600--633, 2008

  59. [59]

    Hanson-wright inequality and sub- G aussian concentration

    Mark Rudelson and Roman Vershynin. Hanson-wright inequality and sub- G aussian concentration. Electronic Communications in Probability , 18, 2013

  60. [60]

    Small ball probabilities for linear images of high-dimensional distributions

    Mark Rudelson and Roman Vershynin. Small ball probabilities for linear images of high-dimensional distributions. International Mathematics Research Notices , 2015(19):9594--9617, 2014

  61. [61]

    Fitting tractable convex sets to support function evaluations

    Yong Sheng Soh and Venkat Chandrasekaran. Fitting tractable convex sets to support function evaluations. arXiv preprint arXiv:1903.04194 , 2019

  62. [62]

    A surprising covariance involving the minimum of multivariate normal variables

    Andrew F Siegel. A surprising covariance involving the minimum of multivariate normal variables. Journal of the American Statistical Association , 88(421):77--80, 1993

  63. [63]

    Provable tensor methods for learning mixtures of generalized linear models

    Hanie Sedghi, Majid Janzamin, and Anima Anandkumar. Provable tensor methods for learning mixtures of generalized linear models. Proceedings of Machine Learning Research , 51:1223--1231, 2014

  64. [64]

    Fitting Convex Sets to Data: Algorithms and Applications

    Yong Sheng Soh. Fitting Convex Sets to Data: Algorithms and Applications . PhD thesis, California Institute of Technology, 2019

  65. [65]

    Nonparametric least squares estimation of a multivariate convex regression function

    Emilio Seijo and Bodhisattva Sen. Nonparametric least squares estimation of a multivariate convex regression function. The Annals of Statistics , 39(3):1633--1657, 2011

  66. [66]

    Iterative least trimmed squares for mixed linear regression

    Yanyao Shen and Sujay Sanghavi. Iterative least trimmed squares for mixed linear regression. arXiv preprint arXiv:1902.03653 , 2019

  67. [67]

    Phase Retrieval via Randomized Kaczmarz: Theoretical Guarantees

    Yan Shuo Tan and Roman Vershynin . Phase retrieval via randomized K aczmarz: Theoretical guarantees. arXiv e-prints , page arXiv:1706.09993, Jun 2017

  68. [68]

    The moment generating function of the truncated multi-normal distribution

    Georges M Tallis. The moment generating function of the truncated multi-normal distribution. Journal of the Royal Statistical Society: Series B (Methodological) , 23(1):223--229, 1961

  69. [69]

    An analysis of the EM algorithm and entropy-like proximal point methods

    Paul Tseng. An analysis of the EM algorithm and entropy-like proximal point methods. Mathematics of Operations Research , 29(1):27--44, 2004

  70. [70]

    Tsybakov

    Alexandre B. Tsybakov. Introduction to Nonparametric Estimation . Springer Publishing Company, Incorporated, 1st edition, 2008

  71. [71]

    The uniform convergence of frequencies of the appearance of events to their probabilities

    Vladimir N Vapnik and Aleksei Yakovlevich Chervonenkis. The uniform convergence of frequencies of the appearance of events to their probabilities. In Doklady Akademii Nauk , volume 181, pages 781--783. Russian Academy of Sciences, 1968

  72. [72]

    Regression analysis and empirical processes , volume 45 of CWI Tract

    Sara A van de Geer. Regression analysis and empirical processes , volume 45 of CWI Tract . Stichting Mathematisch Centrum, Centrum voor Wiskunde en Informatica, Amsterdam, 1988

  73. [73]

    Aad W van der Vaart and Jon A. Wellner. Weak convergence and empirical processes . Springer Series in Statistics. Springer-Verlag, New York, 1996. With applications to statistics

  74. [74]

    Learning convex concepts from G aussian distributions with PCA

    Santosh S Vempala. Learning convex concepts from G aussian distributions with PCA . In 2010 IEEE 51st Annual Symposium on Foundations of Computer Science , pages 124--130. IEEE, 2010

  75. [75]

    Introduction to the non-asymptotic analysis of random matrices

    Roman Vershynin. Introduction to the non-asymptotic analysis of random matrices. arXiv preprint arXiv:1011.3027 , 2010

  76. [76]

    High-dimensional probability: An introduction with applications in data science , volume 47

    Roman Vershynin. High-dimensional probability: An introduction with applications in data science , volume 47. Cambridge University Press, 2018

  77. [77]

    High-dimensional statistics: A non-asymptotic viewpoint , volume 48

    Martin J Wainwright. High-dimensional statistics: A non-asymptotic viewpoint , volume 48. Cambridge University Press, 2019

  78. [78]

    Phase retrieval with random G aussian sensing vectors by alternating projections

    Ir \`e ne Waldspurger . Phase retrieval with random G aussian sensing vectors by alternating projections. IEEE Transactions on Information Theory , 64(5):3301--3312, May 2018

  79. [79]

    All of nonparametric statistics

    Larry Wasserman. All of nonparametric statistics . Springer Science & Business Media, 2006

  80. [80]

    On the convergence properties of the EM algorithm

    CF Jeff Wu. On the convergence properties of the EM algorithm. The Annals of statistics , 11(1):95--103, 1983

Showing first 80 references.