On the Communication Complexity of Decentralized Stochastic Bilevel Optimization

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

Stochastic bilevel optimization is widely used in machine learning, including meta-learning, hyperparameter optimization, and neural architecture search. To extend stochastic bilevel optimization to distributed data, several decentralized stochastic bilevel optimization algorithms have been developed. However, existing methods often suffer from slow convergence rates and high communication costs in heterogeneous settings, limiting their applicability to real-world tasks. To address these issues, we propose two novel decentralized stochastic bilevel gradient descent algorithms, one based on a simultaneous update strategy and one on an alternating update strategy. Our algorithms can achieve faster convergence rates and lower communication costs than existing methods. Importantly, our convergence analyses do not rely on strong assumptions regarding heterogeneity. Moreover, our theoretical analyses show explicitly how the computation and communication of the Hessian-inverse-vector product in the heterogeneous setting affect the convergence rate. To the best of our knowledge, this is the first time such favorable theoretical results have been achieved under mild assumptions in the heterogeneous setting. Furthermore, we demonstrate how to establish the convergence rate for the alternating update strategy when it is combined with variance-reduced gradients. Finally, experimental results confirm the efficacy of our algorithms.

citing papers explorer

Showing 1 of 1 citing paper after filters.

  • S³LDBO: A Snapshot Single-Loop Algorithm for Decentralized Bilevel Optimization
    math.OC · 2026-05-29 · unverdicted · none · ref 30 · internal anchor

    S³LDBO is a snapshot single-loop algorithm for decentralized bilevel optimization that reduces computational cost via intermittent derivative skipping and provides ergodic and high-probability nonergodic iteration complexity bounds in deterministic settings.