Two-Stage Robust Sparse Gradient Methods for Regression Under Heavy-Tailed Designs

Di Wang; Kaiyuan Zhou; Wenyang Zhang; Xiaoyu Zhang

arxiv: 2601.05669 · v2 · pith:EOYUJXSZnew · submitted 2026-01-09 · 📊 stat.ME


classification 📊 stat.ME

keywords gradient · heavy-tailed · localization · sparse · data · during · controls · covariates


We study high-dimensional sparse regression under simultaneous heavy-tailed covariates and noise. Heavy-tailed data affect sparse optimization in two different ways: extreme covariates can destabilize the gradient field during global localization, while heavy-tailed noise limits the final statistical accuracy during local refinement. Motivated by this two-phase structure, we propose two-stage RIGHT, a robust sparse first-order method based on coordinate-wise median-of-means (MoM) gradient estimation and delayed sample splitting. The MoM gradient estimator is computationally simple, compatible with hard-thresholded updates, and admits phase-adaptive concentration bounds whose rates depend on the current localization radius. Delayed splitting reuses data during global localization and reserves fresh batches for the shorter refinement stage, reducing the sample-splitting cost. The theoretical results reveal a decoupled rate structure: the design-tail index controls gradient stability and sample complexity, whereas the noise-tail index controls the final statistical rate. We also provide phase-wise lower-bound benchmarks showing that the design-driven localization barrier is intrinsic. Extensive simulation experiments and real data analysis showcase the efficacy of the proposed method over existing competitors.
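The abstract's two main ingredients, a coordinate-wise median-of-means (MoM) gradient estimate and a hard-thresholded update, can be illustrated with a minimal sketch. This is not the authors' two-stage RIGHT procedure: it assumes a squared-error loss, uses a fixed step size, and omits delayed sample splitting and any phase-adaptive tuning. The function names `mom_gradient`, `hard_threshold`, and `mom_iht`, and the parameters `num_blocks`, `step`, and `iters`, are hypothetical and chosen only for illustration.

```python
from statistics import median


def mom_gradient(X, y, beta, num_blocks):
    """Coordinate-wise median-of-means estimate of the least-squares gradient.

    Assumption: loss is 0.5 * (x_i . beta - y_i)^2, so the per-sample
    gradient is (x_i . beta - y_i) * x_i.
    """
    n, d = len(X), len(beta)
    if not 1 <= num_blocks <= n:
        raise ValueError("num_blocks must be between 1 and the sample size")
    block_means = []
    for b in range(num_blocks):
        sums = [0.0] * d
        count = 0
        # Interleaved blocks: samples b, b + num_blocks, b + 2 * num_blocks, ...
        for i in range(b, n, num_blocks):
            resid = sum(X[i][j] * beta[j] for j in range(d)) - y[i]
            for j in range(d):
                sums[j] += resid * X[i][j]
            count += 1
        block_means.append([s / count for s in sums])
    # Take the median across blocks separately for each coordinate.
    return [median(bm[j] for bm in block_means) for j in range(d)]


def hard_threshold(v, s):
    """Keep the s entries of v with largest magnitude; set the rest to zero."""
    order = sorted(range(len(v)), key=lambda j: abs(v[j]), reverse=True)
    keep = set(order[:s])
    return [v[j] if j in keep else 0.0 for j in range(len(v))]


def mom_iht(X, y, s, num_blocks=5, step=0.5, iters=50):
    """Iterative hard thresholding driven by the MoM gradient (illustrative)."""
    beta = [0.0] * len(X[0])
    for _ in range(iters):
        g = mom_gradient(X, y, beta, num_blocks)
        beta = hard_threshold([b - step * gj for b, gj in zip(beta, g)], s)
    return beta
```

In this sketch a single extreme observation can corrupt at most one block mean, so the per-coordinate median is unaffected as long as most blocks are clean. For example, `mom_gradient([[1.0], [1.0], [1.0]], [1.0, 1.0, 1000.0], [0.0], 3)` ignores the block containing the outlying response.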


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Statistical Inference on Gradient Flows
math.ST 2026-05 unverdicted novelty 7.0

Proves uniform CLT for gradient flows in ERM and constructs an algorithm-aware, inversion-free covariance estimator for asymptotically valid time-uniform confidence intervals.