Low-distortion Subspace Embeddings in Input-sparsity Time and Applications to Robust Linear Regression

Michael W. Mahoney; Xiangrui Meng

arxiv: 1210.3135 · v3 · pith:3GGXUI3Mnew · submitted 2012-10-11 · 💻 cs.DS

Low-distortion Subspace Embeddings in Input-sparsity Time and Applications to Robust Linear Regression

Xiangrui Meng , Michael W. Mahoney This is my paper

classification 💻 cs.DS

keywords epsilontimeembeddingpolyembeddingsinput-sparsityregressionsubspace

0 comments

read the original abstract

Low-distortion embeddings are critical building blocks for developing random sampling and random projection algorithms for linear algebra problems. We show that, given a matrix $A \in \R^{n \times d}$ with $n \gg d$ and a $p \in [1, 2)$, with a constant probability, we can construct a low-distortion embedding matrix $\Pi \in \R^{O(\poly(d)) \times n}$ that embeds $\A_p$, the $\ell_p$ subspace spanned by $A$'s columns, into $(\R^{O(\poly(d))}, \| \cdot \|_p)$; the distortion of our embeddings is only $O(\poly(d))$, and we can compute $\Pi A$ in $O(\nnz(A))$ time, i.e., input-sparsity time. Our result generalizes the input-sparsity time $\ell_2$ subspace embedding by Clarkson and Woodruff [STOC'13]; and for completeness, we present a simpler and improved analysis of their construction for $\ell_2$. These input-sparsity time $\ell_p$ embeddings are optimal, up to constants, in terms of their running time; and the improved running time propagates to applications such as $(1\pm \epsilon)$-distortion $\ell_p$ subspace embedding and relative-error $\ell_p$ regression. For $\ell_2$, we show that a $(1+\epsilon)$-approximate solution to the $\ell_2$ regression problem specified by the matrix $A$ and a vector $b \in \R^n$ can be computed in $O(\nnz(A) + d^3 \log(d/\epsilon) /\epsilon^2)$ time; and for $\ell_p$, via a subspace-preserving sampling procedure, we show that a $(1\pm \epsilon)$-distortion embedding of $\A_p$ into $\R^{O(\poly(d))}$ can be computed in $O(\nnz(A) \cdot \log n)$ time, and we also show that a $(1+\epsilon)$-approximate solution to the $\ell_p$ regression problem $\min_{x \in \R^d} \|A x - b\|_p$ can be computed in $O(\nnz(A) \cdot \log n + \poly(d) \log(1/\epsilon)/\epsilon^2)$ time. Moreover, we can improve the embedding dimension or equivalently the sample size to $O(d^{3+p/2} \log(1/\epsilon) / \epsilon^2)$ without increasing the complexity.

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Beyond Averaging in John Ellipsoid Approximation: High-Accuracy Algorithms in the Leverage-Score Model
math.OC 2026-06 unverdicted novelty 7.0

John ellipsoid approximation in the leverage-score model achieves doubly logarithmic accuracy cost after setup by using last-iterate acceleration and Newton steps instead of averaging.
Randomized Sketching is Robust to Low-Precision Rounding on GPUs
cs.PF 2026-06 unverdicted novelty 5.0

FP16 SparseStack on GPUs shows embedding quality insensitive to rounding method, with sketch distribution as the primary accuracy driver across tested inputs.