pith. sign in

arxiv: 2311.11342 · v5 · pith:WFKQITJ7new · submitted 2023-11-19 · 💻 cs.LG · cs.DC· math.OC

On the Communication Complexity of Decentralized Stochastic Bilevel Optimization

classification 💻 cs.LG cs.DCmath.OC
keywords bilevelconvergenceoptimizationstochasticalgorithmscommunicationdecentralizedheterogeneous
0
0 comments X
read the original abstract

Stochastic bilevel optimization finds widespread applications in machine learning, including meta-learning, hyperparameter optimization, and neural architecture search. To extend stochastic bilevel optimization to distributed data, several decentralized stochastic bilevel optimization algorithms have been developed. However, existing methods often suffer from slow convergence rates and high communication costs in heterogeneous settings, limiting their applicability to real-world tasks. To address these issues, we propose two novel decentralized stochastic bilevel gradient descent algorithms based on \textit{simultaneous} and \textit{alternating} update strategies. Our algorithms can achieve faster convergence rates and lower communication costs than existing methods. Importantly, our convergence analyses do not rely on strong assumptions regarding heterogeneity. More importantly, our theoretical analyses clearly disclose how the computation and communication regarding the Hessian-inverse-vector product under the heterogeneous setting affects the convergence rate. To the best of our knowledge, this is the first time such favorable theoretical results have been achieved with mild assumptions in the heterogeneous setting. Furthermore, we demonstrate how to establish the convergence rate for the alternating update strategy when combined with the variance-reduced gradient. Finally, experimental results confirm the efficacy of our algorithms.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. DUET: Decentralized Bilevel Optimization without Lower-Level Strong Convexity

    math.OC 2026-06 unverdicted novelty 7.0

    DUET achieves O(1/T^{1-5p-11/4 τ}) iteration complexity for approximate KKT-stationary points in decentralized bilevel optimization without lower-level strong convexity, using gradient tracking for data heterogeneity.

  2. S$^3$LDBO: A Snapshot Single-Loop Algorithm for Decentralized Bilevel Optimization

    math.OC 2026-05 unverdicted novelty 6.0

    S³LDBO is a snapshot single-loop algorithm for decentralized bilevel optimization that reduces computational cost via intermittent derivative skipping and provides ergodic and high-probability nonergodic iteration com...