Resilience: A Criterion for Learning in the Presence of Arbitrary Outliers

Gregory Valiant; Jacob Steinhardt; Moses Charikar

arxiv: 1703.04940 · v3 · pith:UL3EMV6Unew · submitted 2017-03-15 · 💻 cs.LG · cs.AI· cs.CC· cs.CR· stat.ML

Resilience: A Criterion for Learning in the Presence of Arbitrary Outliers

Jacob Steinhardt , Moses Charikar , Gregory Valiant This is my paper

classification 💻 cs.LG cs.AIcs.CCcs.CRstat.ML

keywords robustestimationboundeddistributionlearningmeanmomentsresilience

0 comments

read the original abstract

We introduce a criterion, resilience, which allows properties of a dataset (such as its mean or best low rank approximation) to be robustly computed, even in the presence of a large fraction of arbitrary additional data. Resilience is a weaker condition than most other properties considered so far in the literature, and yet enables robust estimation in a broader variety of settings. We provide new information-theoretic results on robust distribution learning, robust estimation of stochastic block models, and robust mean estimation under bounded $k$th moments. We also provide new algorithmic results on robust distribution learning, as well as robust mean estimation in $\ell_p$-norms. Among our proof techniques is a method for pruning a high-dimensional distribution with bounded $1$st moments to a stable "core" with bounded $2$nd moments, which may be of independent interest.

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

A Unified Approach to Robust Mean Estimation
stat.ML 2019-07 unverdicted novelty 7.0

A connection between Huber's contamination and heavy-tailed models yields unified robust mean estimators that are both computationally efficient and statistically optimal under certain conditions.
On efficient robust regression with subquadratic samples
cs.DS 2026-05 unverdicted novelty 6.0

Near-linear time algorithm for robust regression under Gaussian covariates achieves O(sqrt(ε κ)) error with Õ(d/ε⁴) samples when ε κ ≲ 1, plus SQ and low-degree lower bounds.
Generalized Rank Regression
stat.ME 2026-05 unverdicted novelty 5.0

Generalized Rank Regression extends rank methods to non-monotonic scores, derives Bahadur representation and asymptotic normality, proposes a two-stage sub-gradient algorithm, and shows variance equivalence to composi...