Federated Learning: Strategies for Improving Communication Efficiency

2016 · cs.LG · arXiv:1610.05492

Canonical reference. 90% of citing Pith papers cite this work as background.

60 Pith papers citing it

Background: 90% of classified citations

abstract

Federated Learning is a machine learning setting where the goal is to train a high-quality centralized model while training data remains distributed over a large number of clients each with unreliable and relatively slow network connections. We consider learning algorithms for this setting where on each round, each client independently computes an update to the current model based on its local data, and communicates this update to a central server, where the client-side updates are aggregated to compute a new global model. The typical clients in this setting are mobile phones, and communication efficiency is of the utmost importance. In this paper, we propose two ways to reduce the uplink communication costs: structured updates, where we directly learn an update from a restricted space parametrized using a smaller number of variables, e.g. either low-rank or a random mask; and sketched updates, where we learn a full model update and then compress it using a combination of quantization, random rotations, and subsampling before sending it to the server. Experiments on both convolutional and recurrent networks show that the proposed methods can reduce the communication cost by two orders of magnitude.
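The sketched-update idea in the abstract (compress a full update before upload, then aggregate on the server) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `subsample`, `quantize_1bit`, and `server_average` are hypothetical, the min/max rounding rule and the 1/keep_prob rescaling are one common way to keep the compression unbiased, and the random-rotation step named in the abstract is omitted for brevity. It is not the paper's implementation.

```python
import random


def subsample(update, keep_prob, rng):
    """Random-mask subsampling: keep each coordinate with probability
    keep_prob and rescale it by 1/keep_prob, so the expected value of
    every output coordinate equals the input coordinate."""
    return [x / keep_prob if rng.random() < keep_prob else 0.0 for x in update]


def quantize_1bit(update, rng):
    """Probabilistic 1-bit quantization (illustrative scheme): every
    coordinate is rounded to the vector's min or max, with probabilities
    chosen so the rounding is unbiased."""
    lo, hi = min(update), max(update)
    if hi == lo:  # constant vector: nothing to quantize
        return list(update)
    return [hi if rng.random() < (x - lo) / (hi - lo) else lo for x in update]


def server_average(client_updates):
    """Coordinate-wise mean of the (compressed) client updates."""
    n = len(client_updates)
    return [sum(col) / n for col in zip(*client_updates)]


if __name__ == "__main__":
    rng = random.Random(0)
    true_update = [-1.0, 0.3, 0.5, 1.0]
    # Simulate many clients holding the same update; each uploads a
    # subsampled, then quantized, version of it.
    uploads = [
        quantize_1bit(subsample(true_update, 0.5, rng), rng)
        for _ in range(20000)
    ]
    # Each upload is very coarse, but the server-side average should land
    # close to the true update because both steps are unbiased.
    print(server_average(uploads))
```

Each individual upload carries far less information than the full update (many zeros, and only two distinct values after quantization), which is where the uplink saving would come from; the averaging over many clients is what recovers accuracy in this toy setting.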

citation-role summary

background: 9 · method: 1

citation-polarity summary

background: 9 · use method: 1

representative citing papers

High-Probability Convergence Guarantees of Decentralized SGD

cs.LG · 2025-10-07 · unverdicted · novelty 8.0 · 2 refs

Decentralized SGD achieves high-probability convergence with order-optimal rates and linear speedup under standard cost assumptions matching those for MSE convergence.

TIGER: Inverting Transformer Gradients via Embedding-Subspace Distance Optimization

cs.CR · 2026-06-16 · unverdicted · novelty 7.0

TIGER turns the low-rank attention gradient subspace into a differentiable objective for continuous embedding optimization, improving reconstruction quality and robustness over prior discrete token tests especially under noise or DP.

The Capacity of Information-Theoretic Secure Aggregation in Federated Learning

cs.IT · 2026-06-05 · unverdicted · novelty 7.0

The capacity region among randomness for security, key-distribution communication, and aggregation communication is completely characterized for T-colluding secure aggregation with N users under a general two-phase user-to-user key distribution framework.

Information-Theoretic Decentralized Secure Aggregation with User Dropouts

cs.IT · 2026-05-21 · accept · novelty 7.0

For decentralized secure aggregation with at least U surviving users and at most T colluders, the optimal two-round rates are R1 ≥ 1 and R2 ≥ 1/(U-T-1) when U > T+1, and the task is impossible otherwise.

Scalable Distributed Stochastic Optimization via Bidirectional Compression: Beyond Pessimistic Limits

math.OC · 2026-05-08 · unverdicted · novelty 7.0

Inkheart SGD and M4 use bidirectional compression to achieve time complexities in distributed SGD that improve with worker count n and surpass prior lower bounds under a necessary structural assumption.

Quantizing With Randomized Hadamard Transforms: Efficient Heuristic Now Proven

cs.LG · 2026-05-07 · unverdicted · novelty 7.0

Two randomized Hadamard transforms suffice to make coordinate marginals O(d^{-1/2})-close to Gaussian for most quantization methods, with three needed for vector quantization to match uniform random rotations asymptotically.

Scaling Federated Linear Contextual Bandits via Sketching

cs.LG · 2026-05-01 · unverdicted · novelty 7.0

FSCLB scales federated linear contextual bandits with sketching to achieve over 90% lower computation and communication costs while preserving a near-optimal regret bound of O(sqrt(l d T)).

XFED: Non-Collusive Model Poisoning Attack Against Byzantine-Robust Federated Classifiers

cs.CR · 2026-04-10 · unverdicted · novelty 7.0

XFED is the first aggregation-agnostic non-collusive model poisoning attack that bypasses eight state-of-the-art defenses on six benchmark datasets without attacker coordination.

Scalar Federated Learning for Linear Quadratic Regulator

eess.SY · 2026-04-06 · unverdicted · novelty 7.0

A scalar-projection federated zeroth-order method for model-free LQR policy learning that reduces per-agent communication from O(d) to O(1) with convergence rate improving in the number of agents.

SketchGuard: Scaling Byzantine-Robust Decentralized Federated Learning via Sketch-Based Screening

cs.LG · 2025-10-09 · accept · novelty 7.0

SketchGuard decouples Byzantine filtering from aggregation in decentralized federated learning by exchanging k-dimensional Count Sketches for screening and full models only from accepted neighbors, achieving up to 50-70% communication savings while proving convergence and matching SOTA robustness.

Act in Collusion: Distributed Multi-Target Backdoor Attacks in Federated Learning

cs.CV · 2024-11-06 · unverdicted · novelty 7.0

DMBA maintains attack success rates above 80% for all backdoors in a distributed multi-target FL setting where baselines drop below 50%.

Tighter Performance Theory of FedExProx

math.OC · 2024-10-20 · unverdicted · novelty 7.0

New analysis framework yields tighter linear convergence for FedExProx on non-strongly convex quadratics and PL functions, proving outperformance over GD once communication costs are counted.

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

cs.LG · 2019-10-23 · unverdicted · novelty 7.0

T5 casts all NLP tasks as text-to-text generation, systematically explores pre-training choices, and reaches strong performance on summarization, QA, classification and other tasks via large-scale training on the Colossal Clean Crawled Corpus.

Tuning-Free Efficient Estimation for Multi-Source Data via Covariance-Aware Shrinkage

stat.ME · 2026-06-29 · unverdicted · novelty 6.0

Proposes a covariance-aware tuning-free shrinkage framework and sequential algorithm for multi-source estimation that attains oracle risk asymptotically and improves on single-step methods.

Unifying Local Communications and Local Updates for LLM Pretraining

cs.LG · 2026-06-09 · unverdicted · novelty 6.0

GASLoC generalizes communication acceleration to the outer optimizer to enable gossip-based decentralized LLM pretraining that supports adaptive optimizers, local steps, and outperforms prior decentralized methods on standard tasks while matching DiLoCo in multi-step regimes.

Privacy-Preserving Credit Risk Prediction with Alternative Data

cs.LG · 2026-06-09 · unverdicted · novelty 6.0

PrivacyCredit is a machine learning method that combines traditional and alternative data for credit risk prediction while satisfying privacy-preserving, model-confidential, and lossless properties.

Enhanced localized conformal prediction with imperfect auxiliary information

stat.ME · 2026-06-07 · unverdicted · novelty 6.0

ELCP integrates auxiliary data with a density-ratio-weighted kernel to enhance localized conformal prediction sets, maintaining marginal coverage and improving asymptotic local coverage.

DIST-FL: Enhancing Security for TEE-based Aggregation in Federated Learning

cs.CR · 2026-06-03 · unverdicted · novelty 6.0

DIST-FL distributes TEE-guarded servers into an append-only ledger to ensure linearizable FL aggregation and counter rollback plus I/O attacks while matching single-TEE speed.

FlashbackCL: Mitigating Temporal Forgetting in Federated Learning

cs.LG · 2026-06-02 · unverdicted · novelty 6.0

FlashbackCL adds time-decayed label counts, class-balanced replay, and coreset curation to Flashback, yielding 6.9-10% gains and up to 68% less temporal forgetting on CIFAR-10 under controlled shifts.

Dimensionality Reduction for Robust Federated Learning: A Theoretical Analysis and Convergence Guarantee

cs.LG · 2026-05-27 · unverdicted · novelty 6.0

PDR uses sparse random projection to reduce server computation for Byzantine-robust FL aggregation to O(Mp) while preserving near-optimal convergence rates up to a tunable error inflation factor of (1+ε)/(1-ε).

Nonlinear Data Integration via Kernel Methods for Data Collaboration Analysis

cs.LG · 2026-05-26 · unverdicted · novelty 6.0

Nonlinear kernel integration (NKI) with graph regularization enables accurate alignment of nonlinearly obfuscated decentralized data representations, outperforming linear methods on image classification.

Statistical Limits and Efficient Algorithms for Differentially Private Federated Learning

stat.ML · 2026-05-18 · unverdicted · novelty 6.0

Introduces FedHybrid and FedNewton for DP federated M-estimation, with finite-sample MSE bounds, minimax lower bound, and evaluations on vision datasets.

Provable Sparse Inversion and Token Relabel Enhanced One-shot Federated Learning with ViTs

cs.LG · 2026-05-11 · unverdicted · novelty 6.0

FedMITR uses sparse model inversion and token relabeling to improve one-shot federated learning with ViTs under non-IID conditions, delivering a tighter generalization bound via algorithmic stability analysis and better empirical performance.

Adversary-Robust Learning from Fully Asynchronous Directional Derivative Estimates

cs.LG · 2026-05-10 · unverdicted · novelty 6.0

FAR-SIGN achieves adversary-resilient fully asynchronous optimization via signed directional projections and two-timescale correction, with almost-sure convergence to stationary points at rates O(n^{-1/4+ε}) first-order and O(n^{-1/6+ε}) zeroth-order.

citing papers explorer

Showing 31 of 31 citing papers after filters.

High-Probability Convergence Guarantees of Decentralized SGD
cs.LG · 2025-10-07 · unverdicted · none · ref 6 · 2 links · internal anchor
Decentralized SGD achieves high-probability convergence with order-optimal rates and linear speedup under standard cost assumptions matching those for MSE convergence.

Quantizing With Randomized Hadamard Transforms: Efficient Heuristic Now Proven
cs.LG · 2026-05-07 · unverdicted · none · ref 24 · internal anchor
Two randomized Hadamard transforms suffice to make coordinate marginals O(d^{-1/2})-close to Gaussian for most quantization methods, with three needed for vector quantization to match uniform random rotations asymptotically.

Scaling Federated Linear Contextual Bandits via Sketching
cs.LG · 2026-05-01 · unverdicted · none · ref 23 · internal anchor
FSCLB scales federated linear contextual bandits with sketching to achieve over 90% lower computation and communication costs while preserving a near-optimal regret bound of O(sqrt(l d T)).

SketchGuard: Scaling Byzantine-Robust Decentralized Federated Learning via Sketch-Based Screening
cs.LG · 2025-10-09 · accept · none · ref 21 · internal anchor
SketchGuard decouples Byzantine filtering from aggregation in decentralized federated learning by exchanging k-dimensional Count Sketches for screening and full models only from accepted neighbors, achieving up to 50-70% communication savings while proving convergence and matching SOTA robustness.

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
cs.LG · 2019-10-23 · unverdicted · none · ref 37 · internal anchor
T5 casts all NLP tasks as text-to-text generation, systematically explores pre-training choices, and reaches strong performance on summarization, QA, classification and other tasks via large-scale training on the Colossal Clean Crawled Corpus.

Unifying Local Communications and Local Updates for LLM Pretraining
cs.LG · 2026-06-09 · unverdicted · none · ref 18 · internal anchor
GASLoC generalizes communication acceleration to the outer optimizer to enable gossip-based decentralized LLM pretraining that supports adaptive optimizers, local steps, and outperforms prior decentralized methods on standard tasks while matching DiLoCo in multi-step regimes.

Privacy-Preserving Credit Risk Prediction with Alternative Data
cs.LG · 2026-06-09 · unverdicted · none · ref 70 · internal anchor
PrivacyCredit is a machine learning method that combines traditional and alternative data for credit risk prediction while satisfying privacy-preserving, model-confidential, and lossless properties.

FlashbackCL: Mitigating Temporal Forgetting in Federated Learning
cs.LG · 2026-06-02 · unverdicted · none · ref 2 · internal anchor
FlashbackCL adds time-decayed label counts, class-balanced replay, and coreset curation to Flashback, yielding 6.9-10% gains and up to 68% less temporal forgetting on CIFAR-10 under controlled shifts.

Dimensionality Reduction for Robust Federated Learning: A Theoretical Analysis and Convergence Guarantee
cs.LG · 2026-05-27 · unverdicted · none · ref 16 · internal anchor
PDR uses sparse random projection to reduce server computation for Byzantine-robust FL aggregation to O(Mp) while preserving near-optimal convergence rates up to a tunable error inflation factor of (1+ε)/(1-ε).

Nonlinear Data Integration via Kernel Methods for Data Collaboration Analysis
cs.LG · 2026-05-26 · unverdicted · none · ref 12 · internal anchor
Nonlinear kernel integration (NKI) with graph regularization enables accurate alignment of nonlinearly obfuscated decentralized data representations, outperforming linear methods on image classification.

Provable Sparse Inversion and Token Relabel Enhanced One-shot Federated Learning with ViTs
cs.LG · 2026-05-11 · unverdicted · none · ref 2 · internal anchor
FedMITR uses sparse model inversion and token relabeling to improve one-shot federated learning with ViTs under non-IID conditions, delivering a tighter generalization bound via algorithmic stability analysis and better empirical performance.

Adversary-Robust Learning from Fully Asynchronous Directional Derivative Estimates
cs.LG · 2026-05-10 · unverdicted · none · ref 194 · internal anchor
FAR-SIGN achieves adversary-resilient fully asynchronous optimization via signed directional projections and two-timescale correction, with almost-sure convergence to stationary points at rates O(n^{-1/4+ε}) first-order and O(n^{-1/6+ε}) zeroth-order.

Modulated learning for private and distributed regression with just a single sample per client device
cs.LG · 2026-05-08 · unverdicted · none · ref 2 · 2 links · internal anchor
Introduces modulated learning for private distributed regression allowing one sample per client via calibrated noise injection on samples and aggregation of transformed representations to achieve unbiased gradients in expectation.

Response Time Enhances Alignment with Heterogeneous Preferences
cs.LG · 2026-05-07 · unverdicted · none · ref 221 · internal anchor
Response times modeled as drift-diffusion processes enable consistent estimation of population-average preferences from heterogeneous anonymous binary choices.

DeepFedNAS: Efficient Hardware-Aware Architecture Adaptation for Heterogeneous IoT Federations via Pareto-Guided Supernet Training
cs.LG · 2026-01-21 · unverdicted · none · ref 16 · internal anchor
DeepFedNAS delivers up to 1.21% higher accuracy and 61x faster architecture search for federated learning on heterogeneous IoT by replacing random supernet sampling with Pareto-optimal elite architectures and using a multi-objective fitness function as a zero-cost proxy.

DFedReweighting: A Unified Framework for Objective-Oriented Reweighting in Decentralized Federated Learning
cs.LG · 2025-12-12 · unverdicted · none · ref 20 · internal anchor
DFedReweighting is a unified reweighting method for decentralized federated learning that customizes aggregation via target metrics and strategies to improve fairness, Byzantine robustness, and other objectives while proving linear convergence under standard assumptions.

Federated Learning with Nonvacuous Generalisation Bounds
cs.LG · 2023-10-17 · unverdicted · none · ref 31 · internal anchor
Federated learning trains private local randomised predictors whose aggregation yields a global predictor with nonvacuous PAC-Bayesian generalisation bounds and near-centralized accuracy.

Federated Learning with Non-IID Data
cs.LG · 2018-06-02 · conditional · none · ref 10 · internal anchor
Non-IID data causes up to 55% accuracy loss in federated learning due to weight divergence measured by earth mover's distance; 5% globally shared data recovers 30% accuracy on CIFAR-10.

Self-Play Enhancement via Advantage-Weighted Refinement in Online Federated LLM Fine-Tuning with Real-Time Feedback
cs.LG · 2026-05-08 · unverdicted · none · ref 17 · internal anchor
SPEAR enables online federated LLM fine-tuning by using feedback-guided self-play to create contrastive pairs trained with maximum likelihood on correct completions and confidence-weighted unlikelihood on incorrect ones, outperforming baselines without ground-truth contexts.

Subspace Optimization for Efficient Federated Learning under Heterogeneous Data
cs.LG · 2026-04-28 · unverdicted · none · ref 11 · internal anchor
SSF enables efficient federated learning under heterogeneous data by optimizing in a low-dimensional subspace with projected corrections and backfill updates, achieving a non-asymptotic convergence rate of order O~(1/T + 1/sqrt(NKT)).

FED-FSTQ: Fisher-Guided Token Quantization for Communication-Efficient Federated Fine-Tuning of LLMs on Edge Devices
cs.LG · 2026-04-28 · unverdicted · none · ref 45 · 3 links · internal anchor
Fed-FSTQ uses Fisher-guided token quantization to cut uplink traffic 46-fold and improve straggler-limited time-to-accuracy by 52% versus Fed-LoRA in non-IID multilingual and medical QA tasks.

PubSwap: Public-Data Off-Policy Coordination for Federated RLVR
cs.LG · 2026-04-14 · unverdicted · none · ref 9 · internal anchor
PubSwap uses a small public dataset for selective off-policy response swapping in federated RLVR to improve coordination and performance over standard baselines on math and medical reasoning tasks.

Representation-Aligned Multi-Scale Personalization for Federated Learning
cs.LG · 2026-04-13 · unverdicted · none · ref 12 · internal anchor
FRAMP generates client-specific models from compact descriptors in federated learning, trains tailored submodels, and aligns representations to balance personalization with global consistency.

Communication-Efficient Gluon in Federated Learning
cs.LG · 2026-04-12 · unverdicted · none · ref 18 · internal anchor
Compressed Gluon variants using unbiased/contraction compressors and SARAH-style variance reduction achieve convergence guarantees and lower communication costs in federated learning under layer-wise smoothness.

Forgetting to Witness: Efficient Federated Unlearning and Its Visible Evaluation
cs.LG · 2026-04-06 · unverdicted · none · ref 1 · internal anchor
A complete pipeline for federated unlearning via knowledge distillation for efficient removal and a GAN-integrated classifier for visual evaluation of forgetting capacity.

Decentralized Rank Scheduling for Energy-Constrained Multi-Task Federated Fine-Tuning in Edge-Assisted IoV Networks
cs.LG · 2025-08-13 · unverdicted · none · ref 10 · internal anchor
Presents a hierarchical energy-aware framework with UCB-DUAL bandit for decentralized rank scheduling in multi-task federated fine-tuning for IoV networks.

BoBa: Boosting Backdoor Detection through Data Distribution Inference in Federated Learning
cs.LG · 2024-07-12 · unverdicted · none · ref 1 · internal anchor
BoBa uses data distribution inference and overlapping clustering with voting to detect backdoor attacks in non-IID federated learning, claiming attack success rates below 0.001.

FoggyTrust: Robust Federated Learning with Hierarchical Trust Networks
cs.LG · 2026-06-26 · unverdicted · none · ref 1 · internal anchor
FoggyTrust is a hierarchical extension of FLTrust that localizes trust computation to fog nodes and combines it with heterogeneity-aware optimizers, reporting over 50% gains on CIFAR-10 under Krum and Trim attacks.

Centralized vs Decentralized Federated Learning: A trade-off performance analysis
cs.LG · 2026-05-15 · unverdicted · none · ref 13 · internal anchor
Experimental analysis of performance trade-offs across CFL, DFL, and SDFL using Fedstellar simulator, MNIST, and MLP.

Federated Learning for Global Carbon Emission Forecasting: A Hybrid Time-Series Approach with Statistical and Neural Models
cs.LG · 2026-06-21 · unverdicted · none · ref 24 · internal anchor
A hybrid federated framework using ARIMA-GARCH, LSTM-Attention, and XGBoost reports average R² of 0.73, RMSE of 1.21, and MAPE of 6.5% for carbon emission forecasting across 14 clients.

Knowledge Distillation in Federated Learning: a Survey on Long Lasting Challenges and New Solutions
cs.LG · 2024-06-16 · unverdicted · none · ref 73 · internal anchor
A survey organizing knowledge distillation techniques for addressing privacy, heterogeneity, communication, and personalization challenges in federated learning.
