Fedmuon: Accelerating federated learning with matrix orthogonalization
7 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (7)
Verdicts: UNVERDICTED (7)
citing papers explorer
- FedBCGD: Communication-Efficient Accelerated Block Coordinate Gradient Descent for Federated Learning
  FedBCGD reduces communication in federated learning by a factor of 1/N through block-wise parameter updates with accelerated convergence guarantees.
- DP-FedAdamW: An Efficient Optimizer for Differentially Private Federated Large Models
  DP-FedAdamW delivers an unbiased second-moment estimator for AdamW in DPFL, proving linear convergence acceleration without heterogeneity assumptions and outperforming SOTA by 5.83% on Tiny-ImageNet with Swin-Base at ε=1.
- SUDA-Muon: Structural Design Principles and Boundaries for Fully Decentralized Muon
  SUDA-Muon modularizes decentralized Muon via the SUDA template, proving a topology-separated convergence rate of O((1+σ/√N)K^{-1/4}) in nuclear-norm geometry while establishing that tracking-before-polarization is required to avoid non-stationary fixed points and that local-polarize-then-average is
- Rethinking LoRA for Privacy-Preserving Federated Learning in Large Models
  LA-LoRA decouples LoRA matrix updates in DPFL settings to improve robustness to privacy noise, delivering up to 16.83% higher accuracy than prior LoRA variants on Swin-B under strict ε=1.
- Subspace Optimization for Efficient Federated Learning under Heterogeneous Data
  SSF enables efficient federated learning under heterogeneous data by optimizing in a low-dimensional subspace with projected corrections and backfill updates, achieving a non-asymptotic convergence rate of order O~(1/T + 1/sqrt(NKT)).
- FedSLoP: Memory-Efficient Federated Learning with Low-Rank Gradient Projection
  FedSLoP reduces communication and memory costs in federated learning through stochastic low-rank gradient projections, with a nonconvex convergence rate of O(1/sqrt(NT)) and competitive accuracy on heterogeneous MNIST data.
- FedNSAM: Consistency of Local and Global Flatness for Federated Learning
  FedNSAM uses global Nesterov momentum to make local flatness consistent with global flatness in federated learning, yielding tighter convergence than FedSAM and better empirical performance.