arXiv preprint arXiv:2502.08206, 2025.
3 Pith papers cite this work.
Representative citing papers:
-
Rescaled Asynchronous SGD: Optimal Distributed Optimization under Data and System Heterogeneity
Rescaled ASGD recovers convergence to the true global objective by rescaling each worker's stepsize in proportion to its computation time, matching the known lower bound on time complexity in the leading term under non-convex smoothness and bounded heterogeneity.
-
FedQueue: Queue-Aware Federated Learning for Cross-Facility HPC Training
FedQueue predicts per-facility queue delays, buffers late arrivals via cutoffs, and uses staleness-aware aggregation to achieve an O(1/sqrt(R)) convergence rate and a 20.5% improvement in real-world cross-facility HPC federated learning.
-
Rennala MVR: Improved Time Complexity for Parallel Stochastic Optimization via Momentum-Based Variance Reduction
Rennala MVR improves on the time complexity of Rennala SGD for smooth non-convex stochastic optimization in heterogeneous parallel systems, under a mean-squared smoothness assumption.
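
The stepsize-rescaling idea behind the first entry can be illustrated with a toy event-driven simulation. This is a minimal sketch, not the paper's algorithm: the quadratic local objectives, the fixed per-worker computation times `taus`, and all parameter values are assumptions chosen for illustration. Each worker i produces gradients at rate 1/tau_i, so applying its updates with stepsize proportional to tau_i keeps slow workers from being implicitly down-weighted in the effective objective.

```python
import heapq
import random

def grad(x, shift):
    # Stochastic gradient of the local objective f_i(x) = 0.5 * (x - shift)^2,
    # with additive Gaussian noise (hypothetical noise model).
    return (x - shift) + random.gauss(0.0, 0.1)

def rescaled_asgd(taus, shifts, gamma=0.05, horizon=2000.0):
    """Toy asynchronous SGD where worker i's update is applied with
    stepsize gamma * tau_i, i.e. proportional to its computation time.
    `taus` are per-worker compute times, `shifts` the local optima."""
    x = 10.0
    # Each pending event: (finish_time, worker_id, gradient taken at dispatch).
    events = [(tau, i, grad(x, shifts[i])) for i, tau in enumerate(taus)]
    heapq.heapify(events)
    t = 0.0
    while t < horizon:
        t, i, g = heapq.heappop(events)
        x -= gamma * taus[i] * g  # rescaled stepsize: gamma * tau_i
        # Worker i immediately starts its next (possibly stale) gradient.
        heapq.heappush(events, (t + taus[i], i, grad(x, shifts[i])))
    return x

random.seed(0)
taus = [1.0, 2.0, 4.0]     # heterogeneous computation times
shifts = [-1.0, 0.0, 1.0]  # heterogeneous local objectives; global optimum is 0
x_final = rescaled_asgd(taus, shifts)
print(abs(x_final))
```

With rescaling, each worker contributes an average drift of gamma per unit time regardless of its speed, so the iterate settles near the true global optimum 0. Dropping the `taus[i]` factor would instead bias the limit toward the fast worker's optimum, since updates would arrive more often from it at the same stepsize.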