
Federated Optimization: Distributed Machine Learning for On-Device Intelligence

· 2016 · cs.LG · arXiv 1610.02527

21 Pith papers cite this work. Polarity classification is still indexing.


abstract

We introduce a new and increasingly relevant setting for distributed optimization in machine learning, where the data defining the optimization are unevenly distributed over an extremely large number of nodes. The goal is to train a high-quality centralized model. We refer to this setting as Federated Optimization. In this setting, communication efficiency is of the utmost importance and minimizing the number of rounds of communication is the principal goal. A motivating example arises when we keep the training data locally on users' mobile devices instead of logging it to a data center for training. In federated optimization, the devices are used as compute nodes performing computation on their local data in order to update a global model. We suppose that we have an extremely large number of devices in the network, as many as the number of users of a given service, each of which has only a tiny fraction of the total data available. In particular, we expect the number of data points available locally to be much smaller than the number of devices. Additionally, since different users generate data with different patterns, it is reasonable to assume that no device has a representative sample of the overall distribution. We show that existing algorithms are not suitable for this setting, and propose a new algorithm which shows encouraging experimental results for sparse convex problems. This work also sets a path for future research needed in the context of federated optimization.


citation-role summary

background 4

citation-polarity summary

background 4

representative citing papers

Federated Learning: Strategies for Improving Communication Efficiency

cs.LG · 2016-10-18 · conditional · novelty 8.0

Structured updates (low-rank or masked) and sketched updates (quantized, rotated, subsampled) reduce uplink communication in federated learning by up to two orders of magnitude on convolutional and recurrent networks.

LOSCAR-SGD: Local SGD with Communication-Computation Overlap and Delay-Corrected Sparse Model Averaging

cs.LG · 2026-05-20 · unverdicted · novelty 7.0

LOSCAR-SGD combines local updates, sparse model averaging, and communication-computation overlap with a delay-corrected merge rule, providing convergence rates for smooth non-convex objectives under worker heterogeneity.

Ringmaster LMO: Asynchronous Linear Minimization Oracle Momentum Method

cs.LG · 2026-05-18 · unverdicted · novelty 7.0

Ringmaster LMO extends delay-thresholding from ASGD to LMO-based momentum updates, providing convergence guarantees under (L0, L1)-smoothness and time-complexity bounds that recover optimal rates in the Euclidean case.

Self-Distillation is Optimal Among Spectral Shrinkage Estimators in Spiked Covariance Models

math.ST · 2026-05-18 · unverdicted · novelty 7.0

s-step self-distillation is optimal among spectral shrinkage estimators for s-spiked covariance matrices and necessary for optimality.

FedBCGD: Communication-Efficient Accelerated Block Coordinate Gradient Descent for Federated Learning

cs.LG · 2026-03-05 · unverdicted · novelty 7.0

FedBCGD reduces communication in federated learning by a factor of 1/N through block-wise parameter updates with accelerated convergence guarantees.

Byzantine-Robust Distributed SGD: A Unified Analysis and Tight Error Bounds

math.OC · 2026-04-11 · unverdicted · novelty 7.0

Unified convergence rates and tight lower bounds for Byzantine-robust distributed SGD under stochasticity and general data heterogeneity, showing local momentum reduces stochastic error floors.

XFED: Non-Collusive Model Poisoning Attack Against Byzantine-Robust Federated Classifiers

cs.CR · 2026-04-10 · unverdicted · novelty 7.0

XFED is the first aggregation-agnostic non-collusive model poisoning attack that bypasses eight state-of-the-art defenses on six benchmark datasets without attacker coordination.

Rescaled Asynchronous SGD: Optimal Distributed Optimization under Data and System Heterogeneity

cs.LG · 2026-05-13 · unverdicted · novelty 6.0

Rescaled ASGD recovers convergence to the true global objective by rescaling worker stepsizes proportional to computation times, matching the known time lower bound in the leading term under non-convex smoothness and bounded heterogeneity.

Multi-user Pufferfish Privacy

cs.CR · 2025-12-21 · unverdicted · novelty 6.0

Sufficient conditions using the Wasserstein metric of order 1 are derived to calibrate Laplace noise for pufferfish privacy in multi-user aggregated queries, with relaxations for binary data that reduce noise while preserving indistinguishability.

FedOptima: Optimizing Resource Utilization in Federated Learning

cs.DC · 2025-03-10 · unverdicted · novelty 6.0

FedOptima reduces both straggler and dependency idle times in federated learning via layer offloading, asynchronous aggregation, auxiliary networks, and server scheduling, delivering up to 21.8x faster training.

Federated Learning with Nonvacuous Generalisation Bounds

cs.LG · 2023-10-17 · unverdicted · novelty 6.0

Federated learning trains private local randomised predictors whose aggregation yields a global predictor with nonvacuous PAC-Bayesian generalisation bounds and near-centralized accuracy.

Response Time Enhances Alignment with Heterogeneous Preferences

cs.LG · 2026-05-07 · unverdicted · novelty 6.0

Response times modeled as drift-diffusion processes enable consistent estimation of population-average preferences from heterogeneous anonymous binary choices.

Who Trains Matters: Federated Learning under Enrollment and Participation Selection Biases

cs.LG · 2026-04-29 · unverdicted · novelty 6.0

A two-stage selection model for federated learning permits inverse probability weighting to recover the target-population mean update under ignorability and positivity.

Optimized Federated Knowledge Distillation with Distributed Neural Architecture Search

cs.LG · 2026-05-20 · unverdicted · novelty 5.0

FedKDNAS combines client-side neural architecture search with knowledge distillation from aggregated server predictions to improve accuracy and efficiency in heterogeneous federated learning.

Semantic-Effectiveness Filtering and Control for Post-5G Wireless Connectivity

cs.NI · 2019-07-04 · unverdicted · novelty 5.0

Introduces a semantic-effectiveness (SE) plane to augment protocol stacks with standardized interfaces for semantic filtering and cross-layer control in post-5G wireless systems.

Rennala MVR: Improved Time Complexity for Parallel Stochastic Optimization via Momentum-Based Variance Reduction

math.OC · 2026-05-09 · unverdicted · novelty 5.0

Rennala MVR improves time complexity over Rennala SGD for smooth nonconvex stochastic optimization in heterogeneous parallel systems under a mean-squared smoothness assumption.

Evaluating Federated Learning approaches for mammography under breast density heterogeneity

cs.LG · 2026-05-09 · unverdicted · novelty 4.0

FedAvg matches centralized training accuracy on mammography data split by breast density heterogeneity, showing standard FL can handle this clinical variation without special fixes.

AICCE: AI Driven Compliance Checker Engine

cs.CR · 2026-04-03 · unverdicted · novelty 4.0

AICCE combines RAG-based retrieval of protocol specs with dual LLM pipelines for debate-driven explanations or fast script execution, reporting up to 99% accuracy on IPv6 samples.

Active Learning Solution on Distributed Edge Computing

cs.DC · 2019-06-25 · unverdicted · novelty 3.0

A hybrid approach applies active learning at edge devices and federated learning at fog nodes to reduce training data volume and communication cost for image classification in distributed edge-fog setups.

Split and Aggregation Learning for Foundation Models Over Mobile Embodied AI Network (MEAN): A Comprehensive Survey

cs.IT · 2026-05-01 · unverdicted · novelty 3.0

The paper surveys split and aggregation learning for foundation models in 6G networks to improve efficiency, resource use, and data privacy in distributed AI.

A Survey on AI for 6G: Challenges and Opportunities

cs.NI · 2026-03-30 · accept · novelty 1.0

AI techniques including deep learning, reinforcement learning, and federated learning are positioned to enable high data rates, low latency, and massive connectivity in 6G networks while addressing scalability, security, and energy challenges.

citing papers explorer

Showing 1 of 1 citing paper after filters.

Self-Distillation is Optimal Among Spectral Shrinkage Estimators in Spiked Covariance Models

math.ST · 2026-05-18 · unverdicted · none · ref 13 · internal anchor

s-step self-distillation is optimal among spectral shrinkage estimators for s-spiked covariance matrices and necessary for optimality.
