Recognition: 2 theorem links
· Lean Theorem · On the Nature of Regularity Assumptions in Bilevel Optimization with Constrained Lower-level Problem
Pith reviewed 2026-05-15 01:53 UTC · model grok-4.3
The pith
Requiring the lower-level regularity conditions at every upper-level point in bilevel optimization is a non-prevalent (strong) assumption: the conditions force structural invariants that no sufficiently small perturbation can make consistent across the whole domain.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
When the regularity conditions are required at every upper-level variable x, rigidity theorems show that structural quantities of the lower-level problem, such as active-set signatures, must remain invariant over the entire upper-level domain. Counterexamples are constructed in which these invariants take different values at two distinct points x, so no sufficiently small perturbation of the lower-level data can enforce the conditions everywhere. In contrast, random perturbations of the lower-level objective and constraints make each condition hold at almost every x with probability one. Although the everywhere and almost-everywhere versions differ only on a measure-zero set, the gap between them introduces fundamental difficulties in both theory and computation.
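In symbols (our paraphrase of the claim, with X the upper-level domain and mu Lebesgue measure; the paper's precise formulation may differ):

```latex
% Everywhere version (shown non-prevalent):
\[
  \forall x \in X:\ \mathrm{LICQ}(x)\ \wedge\ \mathrm{SCS}(x)\ \wedge\ \mathrm{SOSC}(x).
\]
% Almost-everywhere version (shown prevalent): the failure set is null,
\[
  \mu\bigl(\{\,x \in X : \neg\,(\mathrm{LICQ}(x) \wedge \mathrm{SCS}(x) \wedge \mathrm{SOSC}(x))\,\}\bigr) = 0.
\]
```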
What carries the argument
Rigidity theorems establishing that active-set signatures and related structural quantities of the lower-level problem must be invariant across all upper-level variables whenever the regularity conditions hold at every x.
If this is right
- If regularity conditions hold at every x, then active-set structures and multiplier signs must be identical at all upper-level points.
- Counterexamples exist where these structural invariants differ at distinct values of x, so the everywhere requirement cannot be met by small perturbations.
- The almost-everywhere versions of the conditions hold with probability one after random perturbation of the lower-level data.
- The measure-zero difference between the two requirements produces essential obstacles for theoretical analysis and for algorithm design in bilevel optimization.
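The prevalence claim in the third bullet can be illustrated on a toy problem of our own (not the paper's construction): for the lower level min_y 0.5*y^2 - (x + c)*y subject to y >= 0, strict complementarity fails exactly at the single point x = -c, so a random perturbation c makes SCS hold at almost every x with probability one, yet never at every x.

```python
import random

def scs_holds(x, c):
    # Toy lower level (illustrative assumption, not the paper's example):
    #   min_y 0.5*y**2 - (x + c)*y  s.t.  y >= 0,  c = objective perturbation.
    # The solution hits the boundary with a zero multiplier only at x = -c,
    # which is exactly where strict complementarity (SCS) fails.
    return x != -c

random.seed(0)
for _ in range(5):
    c = random.uniform(-1.0, 1.0)                  # random perturbation
    xs = [random.uniform(-2.0, 2.0) for _ in range(1000)]
    assert all(scs_holds(x, c) for x in xs)        # holds at a.e. x (w.p. 1)
    assert not scs_holds(-c, c)                    # but never at *every* x
print("each random perturbation yields SCS a.e., never everywhere")
```

The perturbation translates the failure point but cannot remove it, which is the shape of the everywhere-vs-a.e. gap.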
Where Pith is reading between the lines
- Bilevel algorithms that assume everywhere regularity may need reformulation to accommodate generic problems where violations occur only on null sets.
- Prevalence results indicate that bilevel problems can often be replaced by nearby ones satisfying the conditions almost everywhere for practical purposes.
- Similar prevalence arguments may apply in other nested optimization problems that impose pointwise regularity on inner problems.
Load-bearing premise
The lower-level objective and constraint functions are smooth enough for active-set signatures and multiplier properties to be well-defined and constant when the regularity conditions hold at every upper-level point.
What would settle it
An explicit lower-level problem in which the active-set pattern or the sign pattern of multipliers changes between two upper-level points x1 and x2, such that no small perturbation of the objective and constraints can make the three regularity conditions hold simultaneously at both points.
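A minimal numerical sketch of the kind of evidence asked for here, using a toy lower level of our own (min_y 0.5*y^2 - x*y subject to y >= 0, not the paper's counterexample): the active set differs between x1 = -1 and x2 = +1, and SCS fails at the crossing point x = 0, where y* = 0 with zero multiplier.

```python
def lower_level_kkt(x):
    # Toy lower level: min_y 0.5*y**2 - x*y  s.t.  y >= 0.
    # KKT system: y - x - lam = 0, lam*y = 0, lam >= 0, y >= 0.
    if x > 0:
        return x, 0.0, False      # y* = x, constraint inactive
    return 0.0, -x, True          # y* = 0, multiplier lam = -x >= 0

for x in (-1.0, 1.0):
    y, lam, active = lower_level_kkt(x)
    print(f"x={x:+.0f}: y*={y}, lam={lam}, active={active}")
```

The active-set signature changes between the two points, so by the rigidity argument no small perturbation of the data can make all three conditions hold at both simultaneously; the perturbation only moves the crossing.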
Original abstract
In this paper, we study the regularity assumptions commonly adopted in bilevel optimization with constrained lower-level problems, including the linear independence constraint qualification, the strict complementary slackness condition, and the second-order sufficient condition. These conditions are typically required to hold for the lower-level problem at every upper-level variable $x$. We first show that the requirement that these conditions hold at every upper-level variable $x$ is strong, in the sense that it is non-prevalent: there exist problems for which no sufficiently small perturbation of the lower-level objective and constraints can make the conditions hold at every $x$. To establish the result, we prove rigidity theorems showing that certain structural quantities of the lower-level problem must remain invariant across all $x$ whenever these conditions hold everywhere. We then construct explicit counterexamples in which these invariants differ between two values of $x$. In contrast, we show that the weaker requirement, that these conditions hold at almost every $x$, is a weak assumption, in the sense that it is prevalent: with probability one over a random perturbation of the lower-level objective and constraints, each condition holds at almost every $x$. We further analyze the gap between the two requirements. Although the ``every $x$'' and ``almost every $x$'' versions differ only on a measure-zero set, we show that this difference introduces fundamental difficulties in both theory and computation for bilevel optimization.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper studies regularity conditions (LICQ, SCS, SOSC) for the lower-level problem in bilevel optimization, typically imposed at every upper-level x. It proves rigidity theorems establishing that these conditions force structural invariants (active-set signatures, multiplier signs) to be constant across all x. Counterexamples demonstrate problems in which these invariants differ between two values of x, showing the everywhere requirement is non-prevalent (no small perturbation of the lower-level data achieves it). In contrast, the almost-everywhere version is prevalent: random perturbations make each condition hold a.e. with probability 1. The manuscript further examines theoretical and computational difficulties arising from the measure-zero gap between the two requirements.
Significance. If the central claims hold, the work provides a precise measure-theoretic and structural characterization of common assumptions in bilevel optimization. The rigidity theorems and explicit counterexamples rigorously separate the everywhere and a.e. cases, while the probabilistic prevalence result supplies a generic positive counterpart. This has direct implications for the scope of existing theory and algorithms that rely on global regularity, and the analysis of the gap between the two notions is a substantive contribution.
Major comments (1)
- [Rigidity theorems] Rigidity theorems (as described in the abstract and introduction): the argument that LICQ/SCS/SOSC everywhere implies global constancy of active-set signatures and related invariants relies on extending local constancy (via implicit-function theorem on the KKT system) to the full domain of x. If the proof supplies only local patches without a global topological or continuation argument (e.g., when the domain of x is disconnected), then the non-prevalence claim is at risk, since a perturbation could still achieve everywhere-regularity by allowing jumps on a measure-zero set.
Minor comments (2)
- [Abstract] The abstract states that the gap between every-x and a.e.-x versions 'introduces fundamental difficulties in both theory and computation,' but the manuscript should explicitly reference the section(s) containing this analysis so readers can locate the concrete examples or theorems.
- [Introduction] Notation for the lower-level problem and the perturbation measure should be introduced with a brief reminder of the ambient function space (e.g., C^2 or Sobolev) to make the prevalence statements fully precise.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments on our manuscript. The feedback highlights an important point regarding the global extension in our rigidity theorems, which we address below.
Point-by-point responses
Referee: [Rigidity theorems] Rigidity theorems (as described in the abstract and introduction): the argument that LICQ/SCS/SOSC everywhere implies global constancy of active-set signatures and related invariants relies on extending local constancy (via implicit-function theorem on the KKT system) to the full domain of x. If the proof supplies only local patches without a global topological or continuation argument (e.g., when the domain of x is disconnected), then the non-prevalence claim is at risk, since a perturbation could still achieve everywhere-regularity by allowing jumps on a measure-zero set.
Authors: We thank the referee for this observation. Our rigidity proofs begin with local constancy of active-set signatures and multiplier signs via the implicit-function theorem on the KKT system. Global constancy then follows because the set of x where the regularity conditions hold is both open (by the implicit-function theorem and continuity of the data) and closed relative to the upper-level domain (by continuation along paths). We explicitly assume the upper-level domain is connected, which is standard in bilevel optimization (e.g., convex or interval domains). On each connected component the invariants are therefore constant. Our counterexamples are constructed precisely on connected domains where the invariants differ between two points, so no small perturbation can enforce the conditions everywhere. For disconnected domains the invariants remain constant per component, but this does not affect the non-prevalence result on connected domains. We will add an explicit statement of the connectedness assumption and a brief remark on the disconnected case in the revision.
Revision status: partial
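The open-and-closed continuation argument sketched in the response can be written out as follows (our notation, not the paper's):

```latex
% For an active-set signature \sigma, let
%   S_\sigma = \{\, x \in X : \text{the lower-level signature at } x \text{ is } \sigma \,\}.
% Under everywhere-regularity:
%   (i)  S_\sigma is open: the implicit-function theorem on the KKT system
%        keeps the signature locally constant around each x \in S_\sigma;
%   (ii) S_\sigma is closed in X: SCS and SOSC exclude signature changes
%        along convergent sequences x_k \to x.
\[
  \emptyset \neq S_\sigma \subseteq X,\quad S_\sigma\ \text{open and closed in}\ X,\quad
  X\ \text{connected}\ \Longrightarrow\ S_\sigma = X.
\]
```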
Circularity Check
No circularity; derivation uses original proofs and counterexamples
Full rationale
The paper proves new rigidity theorems establishing invariance of structural quantities (active-set signatures, multiplier signs) under the everywhere-regularity assumption (LICQ/SCS/SOSC at all x), then constructs explicit counterexamples where these invariants differ at distinct x values to show non-prevalence. The almost-everywhere prevalence follows from standard measure-theoretic perturbation arguments on the lower-level data. All steps rely on direct mathematical arguments from standard NLP constraint qualifications rather than any reduction to fitted parameters, self-referential definitions, or load-bearing self-citations. The central claims remain independent of the paper's own inputs.
Axiom & Free-Parameter Ledger
Axioms (1)
- standard math: Standard definitions of the linear independence constraint qualification (LICQ), strict complementary slackness condition (SCS), and second-order sufficient condition (SOSC) from nonlinear programming.
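For reference, the standard NLP form these three conditions take for the lower-level problem at a fixed x (a sketch; the paper's precise statements may differ):

```latex
% Lower-level problem at fixed x:
%   \min_y f(x,y) \quad \text{s.t.} \quad g_i(x,y) \le 0,\ i = 1,\dots,m,
% with Lagrangian L(x,y,\lambda) = f(x,y) + \sum_i \lambda_i g_i(x,y)
% and active set A(x) = \{\, i : g_i(x, y^*(x)) = 0 \,\}.
\begin{align*}
\text{LICQ:}\quad & \{\nabla_y g_i(x,y^*)\}_{i \in A(x)}\ \text{are linearly independent},\\
\text{SCS:}\quad  & \lambda_i^* > 0\ \text{for every}\ i \in A(x),\\
\text{SOSC:}\quad & d^\top \nabla^2_{yy} L(x,y^*,\lambda^*)\, d > 0
                    \ \text{for all}\ d \ne 0\ \text{in the critical cone}.
\end{align*}
```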
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean, theorem reality_from_one_distinction (tag: unclear)
  Unclear: relation between the paper passage and the cited Recognition theorem.
  Linked passage: "rigidity theorems showing that certain structural quantities of the lower-level problem must remain invariant across all x whenever these conditions hold everywhere"
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel (tag: unclear)
  Unclear: relation between the paper passage and the cited Recognition theorem.
  Linked passage: "LICQ, SCSC, and uniform SOSC together imply the number of local minimizers on each stratum is constant"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.