{"total":18,"items":[{"citing_arxiv_id":"2605.21341","ref_index":21,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Semiparametric Efficient Bilevel Gradient Estimation","primary_cat":"stat.ML","submitted_at":"2026-05-20T16:07:55+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Introduces a cross-fitted orthogonal hypergradient estimator derived from the efficient influence function that achieves asymptotic normality and uniform control for bilevel gradient estimation under quadratic losses.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.20898","ref_index":10,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Continuous-Time Analysis for Minimax and Bilevel Problems","primary_cat":"math.OC","submitted_at":"2026-05-20T08:38:04+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Introduces a modular unified Lyapunov template for continuous-time analysis of minimax, bilevel (via penalty), and min-min-max problems with explicit time-scale thresholds.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.12718","ref_index":57,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"CHAL: Council of Hierarchical Agentic Language","primary_cat":"cs.AI","submitted_at":"2026-05-12T20:26:41+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"CHAL is a multi-agent dialectic system that performs structured belief optimization over defeasible domains using Bayesian-inspired graph representations and configurable meta-cognitive value system hyperparameters.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Soricut, J. Stanway, A. Burak Pol, A. Mensch, et al. Gemini: A family of highly capable multimodal models.arXiv preprint arXiv:2312.11805, 2023. [55] E. L. Gettier. Is justified true belief knowledge?Analysis, 23:121-123, 1963. [56] S. Ghadimi and M. Wang. Approximation methods for bilevel programming.arXiv e-prints, art. arXiv:1802.02246, 2018. 12 [57] C. Gilligan.In a Different Voice: Psychological Theory and Women's Development. Harvard University Press, Cambridge, MA, 1982. [58] T. Giovannelli, G. D. Kent, and L. N. Vicente. Inexact bilevel stochastic gradient methods for constrained and unconstrained lower-level problems.Journal of Global Optimization, 92: 569-614, 2025. [59] Tommaso Giovannelli, Griffin Dean Kent, and Luis Nunes Vicente."},{"citing_arxiv_id":"2605.12693","ref_index":18,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"IGT-OMD: Implicit Gradient Transport for Decision-Focused Learning under Delayed Feedback","primary_cat":"cs.LG","submitted_at":"2026-05-12T19:43:49+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"IGT-OMD reduces gradient transport error from quadratic to linear in delay length for delayed bilevel optimization and achieves sublinear regret with adaptive steps.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.11476","ref_index":44,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"A Barrier-Metric First-Order Method for Linearly Constrained Bilevel Optimization","primary_cat":"math.OC","submitted_at":"2026-05-12T03:44:08+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"A barrier-smoothed first-order method achieves stationarity rates of tilde O(K to the -2/3) deterministic and tilde O(K to the -2/5) stochastic for linearly constrained bilevel optimization.","context_count":1,"top_context_role":"background","top_context_polarity":"unclear","context_text":"Lemma 8(Moving-anchor recursion for the exact tracker).Fix an outer iterationk. Define Jk+1 := zk+1 −y ⋆ µ(xk+1) 2 y⋆µ(xk+1). Let κz,k := \u0010 1− y⋆ µ(xk+1)−y ⋆ µ(xk) y⋆µ(xk) \u0011−1 . Then, for anyj k >0, Jk+1 ≤κ 2 z,k \u0010 1 + 2ξαkjk + (ℓη ∗,1)2ξ2 ℓ2 f,0α2 k + 4ℓ2 g,0β2 k \u0001\u0011 zk+1 −y ⋆ µ(xk) 2 y⋆µ(xk) +κ 2 z,k \u0010 (ℓη ∗,0)2ξ2α2 k + ξαk 2jk (ℓη ∗,0)2 + ξ2 2 α2 k \u0011 ∥qx k ∥2 2.(44) Proof.Anchor change.Whenever y⋆ µ(xk+1)−y ⋆ µ(xk) y⋆µ(xk) <1, Lemma 1, applied to the two exact anchors, gives ∥u∥2 y⋆µ(xk+1) ≤κ 2 z,k∥u∥2 y⋆µ(xk) ∀u. Hence Jk+1 =∥z k+1 −y ⋆ µ(xk+1)∥2 y⋆µ(xk+1) ≤κ 2 z,k∥zk+1 −y ⋆ µ(xk+1)∥2 y⋆µ(xk).(45) Expanding the squared norm at the old anchory ⋆ µ(xk), we obtain ∥zk+1 −y ⋆ µ(xk+1)∥2 y⋆µ(xk) =∥z k+1 −y ⋆ µ(xk)∥2"},{"citing_arxiv_id":"2605.10288","ref_index":15,"ref_count":2,"confidence":0.9,"is_internal_anchor":false,"paper_title":"BROS: Bias-Corrected Randomized Subspaces for Memory-Efficient Single-Loop Bilevel Optimization","primary_cat":"cs.LG","submitted_at":"2026-05-11T09:50:10+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"BROS achieves memory-efficient single-loop stochastic bilevel optimization with O(ε^{-2}) sample complexity by performing updates in randomized subspaces and using Rademacher bi-probe correction for unbiased estimation.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"and the algorithm accesses these objectives through stochastic oracles. Full-space hypergradient methods.A large body of algorithms has been developed for SBO. A common route is to estimate the hypergradient∇Φ(x)either by approximating the inverse Hessian or by maintaining an auxiliary linear-system variable updated through Hessian-vector products (HVPs) and Jacobian-vector products (JVPs) [15, 29]. This route includes early stochastic bilevel methods [15, 24, 28] and recent single-loop methods such as SOBA/SABA [10] and MA-SOBA [7], which serve as the main baselines for our theoretical arXiv:2605.10288v2 [cs.LG] 12 May 2026 2 convergence rate comparison. Other variance-reduced or momentum-based baselines such as SPABA [9], SUSTAIN [29], and MRBO [59] improve sample complexity but require finite-sum structure or stronger oracle"},{"citing_arxiv_id":"2605.08006","ref_index":18,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Penalty-Based First-Order Methods for Bilevel Optimization with Minimax and Constrained Lower-Level Problems","primary_cat":"math.OC","submitted_at":"2026-05-08T16:59:29+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Penalty-based first-order methods find ε-KKT points in bilevel minimax problems with Õ(ε^{-4}) deterministic and Õ(ε^{-9}) stochastic oracle complexity, improving prior bounds for constrained lower-level cases via Lagrangian duality.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Bilevel programming for hyperparameter optimization and meta-learning. InInternational Conference on Machine Learning, pages 1568-1577, 2018. [17] Lucy L Gao, Jane Ye, Haian Yin, Shangzhi Zeng, and Jin Zhang. Value function based difference-of-convex algorithm for bilevel hyperparameter selection problems. InInternational conference on machine learning, pages 7164-7182. PMLR, 2022. 11 [18] Saeed Ghadimi and Mengdi Wang. Approximation methods for bilevel programming.arXiv preprint arXiv:1802.02246, 2018. [19] Ian J Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets.Advances in neural information processing systems, 27, 2014. [20] Riccardo Grazzi, Luca Franceschi, Massimiliano Pontil, and Saverio Salzo."},{"citing_arxiv_id":"2605.06431","ref_index":6,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Second-Order Bilevel Optimization with Accelerated Convergence Rates","primary_cat":"math.OC","submitted_at":"2026-05-07T15:35:15+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Second-order bilevel methods achieve Õ(ε^{-1.5}) iteration complexity for second-order stationary points, faster than first-order approaches, with a lazy variant improving computational efficiency by √d.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.08131","ref_index":9,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Interactive Inverse Reinforcement Learning of Interaction Scenarios via Bi-level Optimization","primary_cat":"cs.LG","submitted_at":"2026-05-01T05:01:10+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Interactive IRL is cast as bi-level optimization with an inner loop learning expert rewards and an outer loop learning interaction policies, solved by the convergent BISIRL algorithm.","context_count":1,"top_context_role":"method","top_context_polarity":"use_method","context_text":"Suppose Assumption 6.1 holds, with the choices of parameters given by p(k) = 1 k , αk = 1 Lf √ K , tk =⌈ 4√k+1 2 ⌉, the following convergence guarantee holds: 1 K PK−1 k=0 E[∥∇f(θ l(k), θ∗ e(θl(k)))∥2]≤ O( 1√ K ). Theorem 6.3 indicates that the expected hypergradient decays at a rate of O( 1√ K ), matching the convergence rate of standard bilevel optimization [9]. This implies that the bias introduced by SPSA does not slow convergence compared to standard bilevel optimization. Corollary 6.4 shows that the linear reward function rθe is a sufficient condition for the convergence of the policyπθe. Convergence of the cumulative reward difference has been widely used to infer convergence of the learned expert policy [29, 30]."},{"citing_arxiv_id":"2604.20115","ref_index":104,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"On the Stability and Generalization of First-order Bilevel Minimax Optimization","primary_cat":"cs.LG","submitted_at":"2026-04-22T02:27:24+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Provides the first systematic generalization analysis via algorithmic stability for single-timescale and two-timescale stochastic gradient descent-ascent in bilevel minimax problems.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.19072","ref_index":278,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"S2MAM: Semi-supervised Meta Additive Model for Robust Estimation and Variable Selection","primary_cat":"cs.LG","submitted_at":"2026-04-21T04:27:12+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.17396","ref_index":25,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Representation-Guided Parameter-Efficient LLM Unlearning","primary_cat":"cs.CL","submitted_at":"2026-04-19T11:59:58+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"REGLU guides LoRA-based unlearning via representation subspaces and orthogonal regularization to outperform prior methods on forget-retain trade-off in LLM benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.11712","ref_index":23,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"A Distributed Bilevel Framework for the Macroscopic Optimization of Multi-Agent Systems","primary_cat":"math.OC","submitted_at":"2026-04-13T16:46:37+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"A distributed bilevel algorithm optimizes emergent macroscopic behavior in multi-agent systems by combining local exponential-family state estimation with hypergradient microscopic updates and proves convergence via timescale separation.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.04090","ref_index":2,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Fine-grained Analysis of Stability and Generalization for Stochastic Bilevel Optimization","primary_cat":"cs.LG","submitted_at":"2026-04-05T12:12:58+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Derives upper bounds on on-average argument stability for single- and two-timescale SGD in bilevel optimization under NC-NC, C-C, and SC-SC regimes, linking stability directly to generalization gaps.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2602.06457","ref_index":5,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Achieving Better Local Regret Bound for Online Non-Convex Bilevel Optimization","primary_cat":"cs.LG","submitted_at":"2026-02-06T07:40:02+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Algorithms achieve optimal regret bounds of Ω(1+V_T) for standard bilevel local regret with O(T log T) inner gradients and Ω(T/W²) for window-averaged regret using adaptive and window-based analyses.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2601.16399","ref_index":2,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"A Hessian-Free Actor-Critic Algorithm for Bi-Level Reinforcement Learning with Applications to LLM Fine-Tuning","primary_cat":"cs.LG","submitted_at":"2026-01-23T02:12:24+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"A Hessian-free single-loop actor-critic algorithm achieves finite-time convergence to the unregularized bi-level RL optimum using attenuating entropy regularization under a special Polyak-Lojasiewicz condition.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2511.01126","ref_index":28,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Stochastic Regret Guarantees for Online Zeroth- and First-Order Bilevel Optimization","primary_cat":"cs.LG","submitted_at":"2025-11-03T00:29:36+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Introduces a novel search direction enabling sublinear stochastic bilevel regret guarantees for first- and zeroth-order online bilevel optimization algorithms without relying on window smoothing.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2401.03893","ref_index":6,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Finite-Time Decoupled Convergence in Nonlinear Two-Time-Scale Stochastic Approximation","primary_cat":"math.OC","submitted_at":"2024-01-08T13:44:35+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Under nested local linearity, nonlinear two-time-scale SA achieves finite-time decoupled convergence; nonlinearity in the slow update alone can destroy it.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}