Fedmuon: Accelerating federated learning with matrix orthogonalization

Junkang Liu, Yuanyuan Liu, Fanhua Shang, Hongying Liu, Jin Liu, Wei Feng · 2025 · arXiv 2510.27403

7 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

FedBCD: Communication-Efficient Accelerated Block Coordinate Gradient Descent for Federated Learning

cs.LG · 2026-03-05 · unverdicted · novelty 7.0

FedBCGD reduces communication in federated learning by a factor of 1/N through block-wise parameter updates with accelerated convergence guarantees.

DP-FedAdamW: An Efficient Optimizer for Differentially Private Federated Large Models

cs.LG · 2026-02-23 · unverdicted · novelty 7.0

DP-FedAdamW delivers an unbiased second-moment estimator for AdamW in DPFL, proving linear convergence acceleration without heterogeneity assumptions and outperforming SOTA by 5.83% on Tiny-ImageNet with Swin-Base at ε=1.

SUDA-Muon: Structural Design Principles and Boundaries for Fully Decentralized Muon

math.OC · 2026-04-27 · unverdicted · novelty 6.0

SUDA-Muon modularizes decentralized Muon via the SUDA template, proving a topology-separated convergence rate of O((1+σ/√N)K^{-1/4}) in nuclear-norm geometry while establishing that tracking-before-polarization is required to avoid non-stationary fixed points and that local-polarize-then-average is

Rethinking LoRA for Privacy-Preserving Federated Learning in Large Models

cs.LG · 2026-02-23 · unverdicted · novelty 6.0

LA-LoRA decouples LoRA matrix updates in DPFL settings to improve robustness to privacy noise, delivering up to 16.83% higher accuracy than prior LoRA variants on Swin-B under a strict privacy budget of ε=1.

Subspace Optimization for Efficient Federated Learning under Heterogeneous Data

cs.LG · 2026-04-28 · unverdicted · novelty 5.0

SSF enables efficient federated learning under heterogeneous data by optimizing in a low-dimensional subspace with projected corrections and backfill updates, achieving a non-asymptotic convergence rate of order Õ(1/T + 1/√(NKT)).

FedSLoP: Memory-Efficient Federated Learning with Low-Rank Gradient Projection

cs.LG · 2026-04-27 · unverdicted · novelty 5.0

FedSLoP reduces communication and memory costs in federated learning through stochastic low-rank gradient projections, with a nonconvex convergence rate of O(1/√(NT)) and competitive accuracy on heterogeneous MNIST data.

FedNSAM: Consistency of Local and Global Flatness for Federated Learning

cs.LG · 2026-02-27 · unverdicted · novelty 4.0

FedNSAM uses global Nesterov momentum to make local flatness consistent with global flatness in federated learning, yielding tighter convergence than FedSAM and better empirical performance.

citing papers explorer

FedBCD: Communication-Efficient Accelerated Block Coordinate Gradient Descent for Federated Learning · cs.LG · 2026-03-05 · unverdicted · none · ref 33
DP-FedAdamW: An Efficient Optimizer for Differentially Private Federated Large Models · cs.LG · 2026-02-23 · unverdicted · none · ref 42
SUDA-Muon: Structural Design Principles and Boundaries for Fully Decentralized Muon · math.OC · 2026-04-27 · unverdicted · none · ref 14
Rethinking LoRA for Privacy-Preserving Federated Learning in Large Models · cs.LG · 2026-02-23 · unverdicted · none · ref 14
Subspace Optimization for Efficient Federated Learning under Heterogeneous Data · cs.LG · 2026-04-28 · unverdicted · none · ref 14
FedSLoP: Memory-Efficient Federated Learning with Low-Rank Gradient Projection · cs.LG · 2026-04-27 · unverdicted · none · ref 21
FedNSAM: Consistency of Local and Global Flatness for Federated Learning · cs.LG · 2026-02-27 · unverdicted · none · ref 24
