pith. machine review for the scientific record. sign in

arxiv: 2411.02225 · v3 · submitted 2024-11-04 · 📊 stat.ML · cs.IT· cs.LG· math.IT· math.ST· stat.TH

Recognition: unknown

Sparse Max-Affine Regression

Haitham Kanj, Kiryung Lee, Seonho Kim

classification 📊 stat.ML cs.ITcs.LGmath.ITmath.STstat.TH
keywords modelsp-gdsparsestarepsilonestimateinitializationnoise
0
0 comments X
read the original abstract

This paper presents Sparse Gradient Descent as a solution for variable selection in convex piecewise linear regression, where the model is given as the maximum of $k$-affine functions $ x \mapsto \max_{j \in [k]} \langle a_j^\star, x \rangle + b_j^\star$ for $j = 1,\dots,k$. Here, $\{ a_j^\star\}_{j=1}^k$ and $\{b_j^\star\}_{j=1}^k$ denote the ground-truth weight vectors and intercepts. A non-asymptotic local convergence analysis is provided for Sp-GD under sub-Gaussian noise when the covariate distribution satisfies the sub-Gaussianity and anti-concentration properties. When the model order and parameters are fixed, Sp-GD provides an $\epsilon$-accurate estimate given $\mathcal{O}(\max(\epsilon^{-2}\sigma_z^2,1)s\log(d/s))$ observations where $\sigma_z^2$ denotes the noise variance. This also implies the exact parameter recovery by Sp-GD from $\mathcal{O}(s\log(d/s))$ noise-free observations. The proposed initialization scheme uses sparse principal component analysis to estimate the subspace spanned by $\{ a_j^\star\}_{j=1}^k$, then applies an $r$-covering search to estimate the model parameters. A non-asymptotic analysis is presented for this initialization scheme when the covariates and noise samples follow Gaussian distributions. When the model order and parameters are fixed, this initialization scheme provides an $\epsilon$-accurate estimate given $\mathcal{O}(\epsilon^{-2}\max(\sigma_z^4,\sigma_z^2,1)s^2\log^4(d))$ observations. A new transformation named Real Maslov Dequantization (RMD) is proposed to transform sparse generalized polynomials into sparse max-affine models. The error decay rate of RMD is shown to be exponentially small in its temperature parameter. Furthermore, theoretical guarantees for Sp-GD are extended to the bounded noise model induced by RMD. Numerical Monte Carlo results corroborate theoretical findings for Sp-GD and the initialization scheme.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Locally Near Optimal Piecewise Linear Regression in High Dimensions via Difference of Max-Affine Functions

    stat.ML 2026-05 unverdicted novelty 7.0

    ABGD parametrizes piecewise linear functions as difference of max-affine functions and converges linearly to an epsilon-accurate solution with O(d max(sigma/epsilon,1)^2) samples under sub-Gaussian noise, which is min...