Byzantine-Tolerant Machine Learning

El Mahdi El Mhamdi, Julien Stainer, Peva Blanchard, Rachid Guerraoui

Authors on Pith no claims yet

classification 💻 cs.DC cs.LGcs.NEmath.OCstat.ML

keywords byzantinelearningmachineruleupdateworkerscomplexitydescent

read the original abstract

The growth of data, the need for scalability and the complexity of models used in modern machine learning calls for distributed implementations. Yet, as of today, distributed machine learning frameworks have largely ignored the possibility of arbitrary (i.e., Byzantine) failures. In this paper, we study the robustness to Byzantine failures at the fundamental level of stochastic gradient descent (SGD), the heart of most machine learning algorithms. Assuming a set of $n$ workers, up to $f$ of them being Byzantine, we ask how robust can SGD be, without limiting the dimension, nor the size of the parameter space. We first show that no gradient descent update rule based on a linear combination of the vectors proposed by the workers (i.e, current approaches) tolerates a single Byzantine failure. We then formulate a resilience property of the update rule capturing the basic requirements to guarantee convergence despite $f$ Byzantine workers. We finally propose Krum, an update rule that satisfies the resilience property aforementioned. For a $d$-dimensional learning problem, the time complexity of Krum is $O(n^2 \cdot (d + \log n))$.

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

FedDetox: Robust Federated SLM Alignment via On-Device Data Sanitization
cs.CR 2026-04 unverdicted novelty 5.0

FedDetox uses on-device knowledge-distilled classifiers to sanitize toxic data in federated SLM training, preserving safety alignment comparable to centralized baselines.