On the Stability and Generalization of First-order Bilevel Minimax Optimization
Pith reviewed 2026-05-10 01:21 UTC · model grok-4.3
The pith
Algorithmic stability arguments yield the first generalization bounds for first-order bilevel minimax solvers
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that algorithmic stability yields fine-grained generalization bounds for three representative first-order gradient-based solvers of bilevel minimax problems with lower-level minimax structure: single-timescale stochastic gradient descent-ascent (SGDA) and two variants of two-timescale SGDA. These bounds exhibit a precise trade-off among algorithmic stability, the generalization gap, and practical algorithmic settings such as step sizes and timescales.
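The paper's pseudocode is not reproduced here; as a rough illustration of what separates the variants, a minimal sketch of the update structure, in which the oracle, the parameter names, and the toy instance are all our assumptions:

import numpy as np

def sgda_step(params, grad_oracle, eta_upper, eta_lower):
    # One SGDA iteration on min_x max_y f(x, y, u*, v*) with lower level
    # min_u max_v g(x, y, u, v). Setting eta_upper == eta_lower recovers the
    # single-timescale variant; eta_lower >> eta_upper gives two-timescale.
    x, y, u, v = params
    gx, gy, gu, gv = grad_oracle(x, y, u, v)  # stochastic minibatch gradients
    return (x - eta_upper * gx,  # upper-level descent
            y + eta_upper * gy,  # upper-level ascent
            u - eta_lower * gu,  # lower-level descent
            v + eta_lower * gv)  # lower-level ascent

def toy_oracle(x, y, u, v, rng=np.random.default_rng(0)):
    # Noisy gradients of a strongly-convex-concave toy instance.
    noise = rng.normal(scale=0.01, size=4)
    return (x + u + noise[0], -y + v + noise[1],
            u - x + noise[2], -v + y + noise[3])

params = (1.0, 1.0, 0.5, 0.5)
for _ in range(200):
    params = sgda_step(params, toy_oracle, eta_upper=0.01, eta_lower=0.05)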
What carries the argument
Algorithmic stability arguments that quantify how much the learned solution changes when a single training example is replaced, applied in turn to each of the three bilevel minimax solvers; a sketch of the measurement follows.
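A minimal sketch of how that replace-one perturbation can be measured in practice; the routine train and the choice of norm are our assumptions, not the paper's formal definition:

import numpy as np

def argument_stability(train, S, i, z_new):
    # ||A(S) - A(S^(i))||: retrain after replacing the i-th training example
    # and measure how far the learned parameters move. `train` is any routine
    # returning a parameter vector, deterministic given its inputs (fix seeds
    # so the only change is the replaced example).
    S_prime = list(S)
    S_prime[i] = z_new
    w, w_prime = np.asarray(train(S)), np.asarray(train(S_prime))
    return float(np.linalg.norm(w - w_prime))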
If this is right
- The generalization gap of these solvers is provably controlled by their stability, which in turn depends on step-size and timescale choices.
- Algorithm designers can trade stability against convergence speed by adjusting the single- versus two-timescale parameters.
- The same stability lens applies to any bilevel minimax instance arising in hyperparameter optimization or reinforcement learning.
- Bounds become tighter when the algorithm is run with smaller step sizes that increase stability; the standard template for such bounds is sketched after this list.
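For context, analyses of this kind instantiate the classical uniform-stability template (Bousquet and Elisseeff, 2002; Hardt et al., 2016). The paper's exact constants and exponents are not reproduced here, so the second display below is the familiar smooth convex SGD rate, offered only as an illustration:

\[
\bigl|\mathbb{E}_{S,A}\bigl[R(A(S)) - R_S(A(S))\bigr]\bigr| \le \epsilon_{\mathrm{stab}},
\qquad
\epsilon_{\mathrm{stab}} \lesssim \frac{L^2}{n}\sum_{t=1}^{T}\eta_t,
\]

where R and R_S are the population and empirical risks, L the Lipschitz constant of the loss, n the sample size, and \eta_t the step sizes; smaller or decaying steps tighten the bound at the cost of slower optimization.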
Where Pith is reading between the lines
- The same stability technique could be applied to bilevel problems whose lower level is not itself minimax.
- The identified stability-generalization trade-off supplies a concrete criterion for selecting step sizes when generalization rather than training loss is the goal.
- Extending the analysis beyond smoothness assumptions would allow the bounds to cover a wider range of practical non-smooth bilevel tasks.
Load-bearing premise
The bilevel objective functions satisfy smoothness together with appropriate convexity or concavity properties that make stability bounds possible.
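In symbols, the standard rendering of such conditions (our notation; the paper's exact assumption set is not reproduced here) is \ell-smoothness of each objective together with convexity-concavity of the lower level:

\[
\|\nabla h(w) - \nabla h(w')\| \le \ell\,\|w - w'\| \quad \text{for all } w, w',
\qquad
g(x,y,\cdot,v) \text{ convex}, \quad g(x,y,u,\cdot) \text{ concave}.
\]

Smoothness makes each gradient update Lipschitz in the iterate, and convexity-concavity keeps the resulting stability recursion non-expansive; together these are what make replace-one perturbations provably small.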
What would settle it
An experiment on a bilevel minimax task in which the measured generalization gap exceeds the stability-derived bound by more than the allowed slack under the stated regularity conditions.
Original abstract
Bilevel optimization and bilevel minimax optimization have recently emerged as unifying frameworks for a range of machine-learning tasks, including hyperparameter optimization and reinforcement learning. The existing literature focuses on empirical efficiency and convergence guarantees, leaving a critical theoretical gap in understanding how well these algorithms generalize. To bridge this gap, we provide the first systematic generalization analysis for first-order gradient-based bilevel minimax solvers with lower-level minimax problems. Specifically, by leveraging algorithmic stability arguments, we derive fine-grained generalization bounds for three representative algorithms, including single-timescale stochastic gradient descent-ascent, and two variants of two-timescale stochastic gradient descent-ascent. Our results reveal a precise trade-off among algorithmic stability, generalization gaps, and practical settings. Furthermore, extensive empirical evaluations corroborate our theoretical insights on realistic optimization tasks with bilevel minimax structures.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript provides the first systematic generalization analysis for first-order gradient-based bilevel minimax solvers. Leveraging algorithmic stability, it derives fine-grained generalization bounds for single-timescale SGDA and two variants of two-timescale SGDA, highlights the trade-offs with practical settings, and supports the theory with empirical evaluations on bilevel minimax tasks.
Significance. This work is significant in bridging the gap between empirical success and theoretical understanding of generalization in bilevel minimax optimization, relevant to hyperparameter optimization and RL. The stability-based bounds offer insights into algorithm design, and the empirical corroboration adds credibility. Credit is given for addressing a new problem class with standard tools without circularity.
Major comments (1)
- [Assumptions and theoretical analysis (Sections 3-4)] The central stability argument for generalization may not control the implicit gradient error when lower-level saddle points are non-unique (i.e., only convex-concave rather than strongly convex-concave). The manuscript uses standard smoothness and convexity/concavity assumptions without an explicit calmness or uniform Lipschitz condition on the solution mapping; a sufficient condition of this kind is formalized below. This is load-bearing for the bounds in the main theorems on the three algorithms, as the effective gradient could be set-valued, making the derived stability parameter potentially unbounded for problems like GAN training.
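One standard way to formalize the missing condition (our illustration, not the manuscript's stated assumption) is a uniform Lipschitz bound on the lower-level solution map w^*(x,y) = (u^*(x,y), v^*(x,y)):

\[
\bigl\|w^*(x,y) - w^*(x',y')\bigr\| \le \kappa\,\bigl\|(x,y) - (x',y')\bigr\|.
\]

This holds with \kappa = \ell/\mu when g is \ell-smooth, \mu-strongly convex in u, and \mu-strongly concave in v, but can fail in the merely convex-concave case, where the saddle set may be set-valued and the implicit gradient ill-defined.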
Minor comments (2)
- [Abstract] The abstract claims 'fine-grained' bounds but provides no explicit form or dependence; while the full paper likely details them, a high-level indication would improve accessibility.
- [Empirical evaluations section] The empirical evaluations corroborate insights, but more details on error bars or statistical significance would strengthen the presentation.
Simulated Author's Rebuttal
We thank the referee for the careful and constructive review of our manuscript. The feedback on the assumptions underlying our stability analysis is particularly valuable, and we address it in detail below. We believe incorporating this input will strengthen the theoretical contributions.
Point-by-point responses
Referee: [Assumptions and theoretical analysis (Sections 3-4)] The central stability argument for generalization may not control the implicit gradient error when lower-level saddle points are non-unique (i.e., only convex-concave rather than strongly convex-concave). The manuscript uses standard smoothness and convexity/concavity assumptions without an explicit calmness or uniform Lipschitz condition on the solution mapping. This is load-bearing for the bounds in the main theorems on the three algorithms, as the effective gradient could be set-valued, making the derived stability parameter potentially unbounded for problems like GAN training.
Authors: We thank the referee for identifying this important subtlety in our analysis. Our manuscript indeed relies on standard smoothness and convex-concave assumptions for the lower-level problem without explicitly stating a calmness or uniform Lipschitz condition on the solution mapping. This omission means that when saddle points are non-unique, the implicit gradient may not be well-defined as a single-valued Lipschitz map, potentially rendering the stability parameter unbounded in the generalization bounds for the three algorithms considered. We agree this is a load-bearing assumption for the main theorems. To resolve the issue, we will revise the manuscript by adding an explicit uniform calmness assumption on the lower-level solution mapping (a standard condition in bilevel optimization to ensure Lipschitz continuity of the implicit gradient). Under this strengthened assumption, the stability arguments and derived bounds remain valid. We will also include a new remark discussing that strong convexity-concavity implies calmness automatically, while for applications such as GAN training, additional regularization may be required to satisfy the condition. This is a partial revision focused on clarifying and strengthening the assumptions without changing the core algorithmic results or empirical findings.
Revision: partial
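One concrete instance of the regularization the response alludes to for GAN-like problems (our illustration, not the authors' prescription) is a pair of Tikhonov terms on the lower level:

\[
g_\lambda(x,y,u,v) = g(x,y,u,v) + \tfrac{\lambda}{2}\|u\|^2 - \tfrac{\lambda}{2}\|v\|^2.
\]

This makes the lower level \lambda-strongly convex-strongly concave, so its saddle point is unique and the solution map is Lipschitz with constant on the order of \ell/\lambda, restoring a bounded stability parameter at the price of an O(\lambda) bias in the implicit gradient.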
Circularity Check
No significant circularity in stability-based generalization analysis
Full rationale
The paper applies established algorithmic stability results to derive generalization bounds for three first-order bilevel minimax algorithms. The abstract and stated claims show a direct use of standard stability arguments under smoothness and convexity/concavity assumptions, with no reduction of the bounds to fitted parameters, no self-referential definitions, and no load-bearing self-citations that would collapse the central claim. The derivation is independent of the target result and rests on established external results for algorithmic stability rather than on the paper's own conclusions.