pith. machine review for the scientific record.

arxiv: 2605.14409 · v1 · submitted 2026-05-14 · 🧮 math.OC

Recognition: 2 theorem links

· Lean Theorem

On the Nature of Regularity Assumptions in Bilevel Optimization with Constrained Lower-level Problem

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 01:53 UTC · model grok-4.3

classification 🧮 math.OC
keywords bilevel optimization · regularity conditions · constraint qualifications · prevalence · rigidity theorems · active sets · bilevel programming

The pith

The requirement that lower-level regularity conditions hold at every upper-level point in bilevel optimization is non-prevalent: structural invariants that differ across points cannot be made consistent by small perturbations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines the common requirement that the linear independence constraint qualification, strict complementary slackness, and second-order sufficient conditions hold for the lower-level problem at every upper-level variable x. This global demand is strong because it forces certain structural features of the lower-level problem to stay fixed for all x, and explicit constructions demonstrate that these features can differ at distinct points x, so no small perturbation can enforce the conditions everywhere. In contrast, the same conditions hold at almost every x after a generic random perturbation of the lower-level objective and constraints. The paper establishes that the difference between the two requirements, although confined to a measure-zero set, creates basic obstacles for both the theoretical development and the numerical solution of bilevel problems.
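As a reference point, the standard constrained bilevel setup and the three conditions can be written as follows (the notation here is ours; the paper's exact statements may differ):

```latex
\min_{x}\; F\bigl(x, y^{*}(x)\bigr)
\quad \text{s.t.} \quad
y^{*}(x) \in \arg\min_{y} \bigl\{\, g(x,y) : h_i(x,y) \le 0,\ i = 1, \dots, m \,\bigr\}.

\text{At a lower-level minimizer } y^{*}(x) \text{ with active set }
A(x) = \{\, i : h_i(x, y^{*}(x)) = 0 \,\}:

\text{LICQ: } \{\nabla_y h_i(x, y^{*}(x))\}_{i \in A(x)} \text{ linearly independent;} \qquad
\text{SCS: } \lambda_i > 0 \text{ for all } i \in A(x); \qquad
\text{SOSC: } \nabla^2_{yy} \mathcal{L}\bigl(x, y^{*}(x), \lambda\bigr) \succ 0
\text{ on the critical cone.}
```

The "everywhere" requirement asks these to hold at every x in the upper-level domain; the "almost everywhere" version allows a measure-zero exceptional set.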

Core claim

When the regularity conditions are required at every upper-level variable x, rigidity theorems prove that structural quantities of the lower-level problem, such as active-set signatures, must remain invariant over the entire upper-level domain. Counterexamples are constructed in which these invariants take different values at two distinct points x, showing that no sufficiently small perturbation of the lower-level data can enforce the conditions everywhere. In comparison, random perturbations of the lower-level objective and constraints make each condition hold at almost every x with probability one. The gap between the everywhere and almost-everywhere versions introduces fundamental theoretical and computational difficulties for bilevel optimization.

What carries the argument

Rigidity theorems establishing that active-set signatures and related structural quantities of the lower-level problem must be invariant across all upper-level variables whenever the regularity conditions hold at every x.

If this is right

  • If regularity conditions hold at every x, then active-set structures and multiplier signs must be identical at all upper-level points.
  • Counterexamples exist where these structural invariants differ at distinct values of x, so the everywhere requirement cannot be met by small perturbations.
  • The almost-everywhere versions of the conditions hold with probability one after random perturbation of the lower-level data.
  • The measure-zero difference between the two requirements produces essential obstacles for theoretical analysis and for algorithm design in bilevel optimization.
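For readers unfamiliar with the measure-theoretic term: "prevalent" is standard in the Hunt–Sauer–Yorke sense, and the following sketch is the definition we assume the paper uses (or an equivalent one), not a quotation from it:

```latex
\text{Let } V \text{ be a completely metrizable topological vector space (e.g., a space of } C^2 \text{ data).}

S \subseteq V \text{ (Borel) is \emph{shy} if there exists a compactly supported Borel probability}
\text{measure } \mu \text{ with } \mu(S + v) = 0 \text{ for every } v \in V.

S \text{ is \emph{prevalent} if its complement is shy. Prevalent sets are dense, closed under}
\text{countable intersection, and in } \mathbb{R}^n \text{ prevalence coincides with full Lebesgue measure.}
```

So "non-prevalent" for the everywhere requirement means the set of lower-level data satisfying it is not generic in this sense.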

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Bilevel algorithms that assume everywhere regularity may need reformulation to accommodate generic problems where violations occur only on null sets.
  • Prevalence results indicate that bilevel problems can often be replaced by nearby ones satisfying the conditions almost everywhere for practical purposes.
  • Similar prevalence arguments may apply in other nested optimization problems that impose pointwise regularity on inner problems.

Load-bearing premise

The lower-level objective and constraint functions are smooth enough for active-set signatures and multiplier properties to be well-defined and constant when the regularity conditions hold at every upper-level point.

What would settle it

An explicit lower-level problem in which the active-set pattern or the sign pattern of multipliers changes between two upper-level points x1 and x2, such that no small perturbation of the objective and constraints can make the three regularity conditions hold simultaneously at both points.
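A minimal illustration of the flavor of example being asked for (this toy problem is ours, not one of the paper's counterexamples): take the lower-level problem min_y ½(y − x)² subject to y ≥ 0. LICQ holds everywhere, but the active set switches as x crosses 0, and strict complementarity fails exactly at the switch point:

```python
def lower_level(x):
    """Closed-form solution of min_y 0.5*(y - x)**2 s.t. y >= 0.

    Returns (y_star, lam, active): the minimizer, the KKT multiplier
    of the constraint -y <= 0, and whether the constraint is active.
    Stationarity reads (y - x) - lam = 0, so lam = y_star - x.
    """
    y_star = max(x, 0.0)
    lam = y_star - x
    return y_star, lam, y_star == 0.0

for x in (-1.0, 0.0, 1.0):
    y_star, lam, active = lower_level(x)
    scs = (not active) or lam > 0  # strict complementary slackness
    print(f"x = {x:+.0f}: y* = {y_star}, lambda = {lam}, SCS holds: {scs}")
```

SCS fails only at x = 0, a measure-zero set: the "almost every x" version holds while the "every x" version does not, and a small perturbation of the data can only move the switch point, not remove it.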

Figures

Figures reproduced from arXiv: 2605.14409 by Chang He, Mingyi Hong, Shuzhong Zhang, Xiaotian Jiang.

Figure 1
Figure 1: A two-dimensional feasible set Y defined by four inequality constraints h1, h2, h3, h4 ≤ 0. The curves h_i^{-1}(0) are the constraint boundaries. Under LICQ, the feasible set is an MGB whose natural stratification consists of three types of strata: the interior (points where no constraint is active), the boundary arcs (one-dimensional pieces where exactly one constraint is active), and the vertices (zero-di… view at source ↗
Figure 2
Figure 2: Left: a feasible set Y(x) defined by four inequality constraints, where H_i := (h_i(x, ·))^{-1}(0) denotes the boundary of the i-th constraint. The feasible set is bounded by four arcs meeting at four corners. Right: the perturbed feasible set Ỹ(x) obtained by a small C² perturbation of the constraints, with perturbed boundaries H̃_i. Although the constraint boundaries are deformed, the stratification is pre… view at source ↗
Figure 3
Figure 3: The feasible set Y(x) (shaded) in Counterexample 2.3 at x = −1 (left) and x = 1 (right), where H_i := (h_i(x, ·))^{-1}(0) denotes the boundary of the i-th constraint. At x = −1, H_5 does not intersect the rectangle bounded by H_1, …, H_4, so the feasible set is that rectangle, with four vertices and four boundary edges. At x = 1, H_5 cuts off a corner, producing a pentagon with five vertices and five boundary… view at source ↗
Figure 4
Figure 4: The feasible set Y(x) (shaded) in Counterexample 2.4 at x = 1 (left) and x = 2 (right), where H_i := (h_i(x, ·))^{-1}(0) denotes the boundary of the i-th constraint. At x = 1, the ellipse H_2 lies entirely inside the circle H_1, so the feasible set is a disk with no vertices. At x = 2, H_2 protrudes beyond H_1, and the two constraint boundaries intersect transversally at four points (marked), producing a feasible … view at source ↗
Figure 5
Figure 5: The feasible set Y(x) (shaded) and the unique minimizer y*(x) (marked) in Counterexample 2.7 at x = 0 (left) and x = 2 (right). At x = 0, the minimizer lies in the interior with no active constraint. At x = 2, the minimizer lies on the constraint boundary. The minimizer migrates across strata as x varies. SOSC yields a further rigidity: if LICQ, SCSC, and uniform SOSC all hold at every x, then the numbe… view at source ↗
Figure 6
Figure 6: The objective function g(x, y) restricted to the constraint boundary {y1 = 0}. At x = 0 (left), there is a unique local minimizer at y2 = 0 (filled circle). At x = 1 (right), the point y2 = 0 has become a local maximizer (open square), and two new local minimizers have appeared at y2 = ±5 (filled circles). The number of local minimizers on the constraint boundary stratum changes from one to two as x varies… view at source ↗
Figure 7
Figure 7: The constraint boundaries and feasible set … view at source ↗
Figure 8
Figure 8: The feasible set (shaded), unconstrained minimizer, and constrained minimizer in … view at source ↗
Figure 9
Figure 9: The objective function g(x, y) over the feasible set at x = 1, x = 0, and x = −1. For x > 0 (left), (0, 0) is a local minimizer. At x = 0 (center), it degenerates into a saddle point. For x < 0 (right), (0, 0) is no longer a KKT point. The dashed curve on each surface shows the restriction of g to y1 = 0. Recall that uniform SOSC only needs to be verified at valid local minimizers. For x ∈ [−1, 1], the set… view at source ↗
Figure 10
Figure 10: The reduced Lagrangian ℓ(y2) = (1/3)y2³ − x·y2 on the boundary stratum {y1 = 0} in Example 4.4. At x = 1 (left), there is a local minimizer at y2 = 1 (filled circle) and a local maximizer at y2 = −1 (open circle). At x = 0 (center), the two critical points merge into a single degenerate point at y2 = 0 (open square) with ℓ′′(0) = 0. At x = −1 (right), neither critical point exists. The local minimizer an… view at source ↗
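The critical-point count in the Figure 10 caption can be checked directly: for ℓ(y2) = (1/3)y2³ − x·y2, the derivative ℓ′(y2) = y2² − x has roots ±√x for x > 0, one degenerate root at x = 0, and none for x < 0. A sketch of that check (ours, not code from the paper):

```python
import math

def critical_points(x):
    """Critical points of l(y) = y**3 / 3 - x * y, classified via l''(y) = 2*y."""
    if x < 0:
        return []                      # y**2 = x has no real root
    if x == 0:
        return [(0.0, "degenerate")]   # l''(0) = 0: the two critical points merge
    r = math.sqrt(x)
    return [(r, "min"), (-r, "max")]   # l''(r) > 0 and l''(-r) < 0

for x in (1.0, 0.0, -1.0):
    print(x, critical_points(x))
```

The local minimizer and maximizer annihilate each other as x decreases through 0, which is exactly the stratum-level change the caption describes.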
read the original abstract

In this paper, we study the regularity assumptions commonly adopted in bilevel optimization with constrained lower-level problems, including the linear independence constraint qualification, the strict complementary slackness condition, and the second-order sufficient condition. These conditions are typically required to hold for the lower-level problem at every upper-level variable $x$. We first show that the requirement that these conditions hold at every upper-level variable $x$ is strong, in the sense that it is non-prevalent: there exist problems for which no sufficiently small perturbation of the lower-level objective and constraints can make the conditions hold at every $x$. To establish the result, we prove rigidity theorems showing that certain structural quantities of the lower-level problem must remain invariant across all $x$ whenever these conditions hold everywhere. We then construct explicit counterexamples in which these invariants differ between two values of $x$. In contrast, we show that the weaker requirement, that these conditions hold at almost every $x$, is a weak assumption, in the sense that it is prevalent: with probability one over a random perturbation of the lower-level objective and constraints, each condition holds at almost every $x$. We further analyze the gap between the two requirements. Although the ``every $x$'' and ``almost every $x$'' versions differ only on a measure-zero set, we show that this difference introduces fundamental difficulties in both theory and computation for bilevel optimization.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper studies regularity conditions (LICQ, SCS, SOSC) for the lower-level problem in bilevel optimization, typically imposed at every upper-level x. It proves rigidity theorems establishing that these conditions force structural invariants (active-set signatures, multiplier signs) to be constant across all x. Counterexamples exhibit problems in which these invariants differ between two values of x, so the everywhere requirement is non-prevalent: no small perturbation of the lower-level data can achieve it. In contrast, the almost-everywhere version is prevalent: random perturbations make each condition hold a.e. with probability 1. The manuscript further examines theoretical and computational difficulties arising from the measure-zero gap between the two requirements.

Significance. If the central claims hold, the work provides a precise measure-theoretic and structural characterization of common assumptions in bilevel optimization. The rigidity theorems and explicit counterexamples rigorously separate the everywhere and a.e. cases, while the probabilistic prevalence result supplies a generic positive counterpart. This has direct implications for the scope of existing theory and algorithms that rely on global regularity, and the analysis of the gap between the two notions is a substantive contribution.

major comments (1)
  1. [Rigidity theorems] Rigidity theorems (as described in the abstract and introduction): the argument that LICQ/SCS/SOSC everywhere implies global constancy of active-set signatures and related invariants relies on extending local constancy (via implicit-function theorem on the KKT system) to the full domain of x. If the proof supplies only local patches without a global topological or continuation argument (e.g., when the domain of x is disconnected), then the non-prevalence claim is at risk, since a perturbation could still achieve everywhere-regularity by allowing jumps on a measure-zero set.
minor comments (2)
  1. [Abstract] The abstract states that the gap between every-x and a.e.-x versions 'introduces fundamental difficulties in both theory and computation,' but the manuscript should explicitly reference the section(s) containing this analysis so readers can locate the concrete examples or theorems.
  2. [Introduction] Notation for the lower-level problem and the perturbation measure should be introduced with a brief reminder of the ambient function space (e.g., C^2 or Sobolev) to make the prevalence statements fully precise.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript. The feedback highlights an important point regarding the global extension in our rigidity theorems, which we address below.

read point-by-point responses
  1. Referee: [Rigidity theorems] Rigidity theorems (as described in the abstract and introduction): the argument that LICQ/SCS/SOSC everywhere implies global constancy of active-set signatures and related invariants relies on extending local constancy (via implicit-function theorem on the KKT system) to the full domain of x. If the proof supplies only local patches without a global topological or continuation argument (e.g., when the domain of x is disconnected), then the non-prevalence claim is at risk, since a perturbation could still achieve everywhere-regularity by allowing jumps on a measure-zero set.

    Authors: We thank the referee for this observation. Our rigidity proofs begin with local constancy of active-set signatures and multiplier signs via the implicit-function theorem on the KKT system. Global constancy then follows because the set of x where the regularity conditions hold is both open (by the implicit-function theorem and continuity of the data) and closed relative to the upper-level domain (by continuation along paths). We explicitly assume the upper-level domain is connected, which is standard in bilevel optimization (e.g., convex or interval domains). On each connected component the invariants are therefore constant. Our counterexamples are constructed precisely on connected domains where the invariants differ between two points, so no small perturbation can enforce the conditions everywhere. For disconnected domains the invariants remain constant per component, but this does not affect the non-prevalence result on connected domains. We will add an explicit statement of the connectedness assumption and a brief remark on the disconnected case in the revision. revision: partial
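The open-and-closed argument in the response is the standard connectedness device; a sketch (our paraphrase, not the paper's proof):

```latex
\text{Fix } x_0 \text{ in the connected upper-level domain } D \text{ and let } A(x)
\text{ be the active-set signature. Set}

S := \{\, x \in D : A(x) = A(x_0) \,\}.

\text{Open: the implicit function theorem applied to the KKT system gives local constancy of } A. \\
\text{Closed: under LICQ/SCS/SOSC, limits of points with signature } A(x_0) \text{ retain that signature.} \\
\text{Hence } S \text{ is nonempty, open, and closed in } D, \text{ so } S = D.
```

On a disconnected domain the same argument yields constancy per connected component, which is the caveat the rebuttal concedes.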

Circularity Check

0 steps flagged

No circularity; derivation uses original proofs and counterexamples

full rationale

The paper proves new rigidity theorems establishing invariance of structural quantities (active-set signatures, multiplier signs) under the everywhere-regularity assumption (LICQ/SCS/SOSC at all x), then constructs explicit counterexamples where these invariants differ at distinct x values to show non-prevalence. The almost-everywhere prevalence follows from standard measure-theoretic perturbation arguments on the lower-level data. All steps rely on direct mathematical arguments from standard NLP constraint qualifications rather than any reduction to fitted parameters, self-referential definitions, or load-bearing self-citations. The central claims remain independent of the paper's own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The paper rests on the standard definitions of LICQ, SCS, and SOSC from nonlinear programming theory and on basic measure-theoretic notions of prevalence; no new free parameters, ad-hoc axioms, or invented entities are introduced.

axioms (1)
  • standard math Standard definitions of the linear independence constraint qualification (LICQ), strict complementary slackness (SCS), and the second-order sufficient condition (SOSC) from nonlinear programming.
    These are classical constraint qualifications invoked throughout the analysis.

pith-pipeline@v0.9.0 · 5561 in / 1559 out tokens · 38118 ms · 2026-05-15T01:53:42.973453+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

91 extracted references · 91 canonical work pages
