arXiv preprint arXiv:2502.08206, 2025.
3 Pith papers cite this work.
Representative citing papers:
-
Rescaled Asynchronous SGD: Optimal Distributed Optimization under Data and System Heterogeneity
Rescaled ASGD recovers convergence to the true global objective by rescaling each worker's stepsize in proportion to its computation time, matching the known lower bound on time complexity in the leading term under non-convex smoothness and bounded heterogeneity.
-
FedQueue: Queue-Aware Federated Learning for Cross-Facility HPC Training
FedQueue predicts per-facility queue delays, buffers late arrivals via cutoffs, and uses staleness-aware aggregation to achieve an O(1/sqrt(R)) convergence rate and a 20.5% improvement in real-world cross-facility HPC federated learning.
-
Rennala MVR: Improved Time Complexity for Parallel Stochastic Optimization via Momentum-Based Variance Reduction
Rennala MVR improves on the time complexity of Rennala SGD for smooth non-convex stochastic optimization in heterogeneous parallel systems, under a mean-squared smoothness assumption.
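
The stepsize-rescaling idea behind the first entry can be illustrated with a toy event-driven simulation. This is a minimal sketch, not the paper's algorithm: the quadratic local objectives, the fixed per-worker computation times `taus`, and all parameter values are assumptions chosen for illustration. Each worker i produces gradients at rate 1/tau_i, so applying its updates with stepsize proportional to tau_i keeps slow workers from being implicitly down-weighted in the effective objective.

```python
import heapq
import random

def grad(x, shift):
    # Stochastic gradient of the local objective f_i(x) = 0.5 * (x - shift)^2,
    # with additive Gaussian noise (hypothetical noise model).
    return (x - shift) + random.gauss(0.0, 0.1)

def rescaled_asgd(taus, shifts, gamma=0.05, horizon=2000.0):
    """Toy asynchronous SGD where worker i's update is applied with
    stepsize gamma * tau_i, i.e. proportional to its computation time.
    `taus` are per-worker compute times, `shifts` the local optima."""
    x = 10.0
    # Each pending event: (finish_time, worker_id, gradient taken at dispatch).
    events = [(tau, i, grad(x, shifts[i])) for i, tau in enumerate(taus)]
    heapq.heapify(events)
    t = 0.0
    while t < horizon:
        t, i, g = heapq.heappop(events)
        x -= gamma * taus[i] * g  # rescaled stepsize: gamma * tau_i
        # Worker i immediately starts its next (possibly stale) gradient.
        heapq.heappush(events, (t + taus[i], i, grad(x, shifts[i])))
    return x

random.seed(0)
taus = [1.0, 2.0, 4.0]     # heterogeneous computation times
shifts = [-1.0, 0.0, 1.0]  # heterogeneous local objectives; global optimum is 0
x_final = rescaled_asgd(taus, shifts)
print(abs(x_final))
```

With rescaling, each worker contributes an average drift of gamma per unit time regardless of its speed, so the iterate settles near the true global optimum 0. Dropping the `taus[i]` factor would instead bias the limit toward the fast worker's optimum, since updates would arrive more often from it at the same stepsize.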