pith. machine review for the scientific record.

arxiv: 2605.11627 · v1 · submitted 2026-05-12 · 🧮 math.OC

Recognition: 2 theorem links


Proximal Limited-Memory Quasi-Newton Methods for Nonsmooth Nonconvex Optimization

Alberto De Marchi, Christian Kanzow, Simeon vom Dahl

Pith reviewed 2026-05-13 01:25 UTC · model grok-4.3

classification 🧮 math.OC
keywords proximal quasi-Newton · limited-memory methods · nonsmooth nonconvex optimization · global convergence · Kurdyka-Łojasiewicz property · proximal operators · compact representation

The pith

A proximal limited-memory quasi-Newton scheme converges globally for nonsmooth nonconvex problems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces an algorithm that minimizes the sum of a continuously differentiable function and a proper, lower semicontinuous, prox-bounded, possibly nonsmooth function, where both may be nonconvex. It combines proximal operators with limited-memory quasi-Newton updates and adapts a regularization parameter to enforce sufficient decrease at each step. Global convergence to stationary points is shown under mild conditions on the two functions, followed by convergence of the full sequence with rates when the objective satisfies the Kurdyka-Łojasiewicz property. The subproblems are solved efficiently by using compact representations of the limited-memory updates, and numerical tests indicate practical speedups over existing approaches.
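The adaptive-regularization loop described above can be sketched in a few lines. This is an illustration, not the paper's algorithm: it takes a plain gradient step in place of the limited-memory quasi-Newton metric, specializes to the ℓ1-regularized least-squares objective, and the names (`prox_descent`, `rho`, `sigma0`) are ours.

```python
import numpy as np

def soft_threshold(v, t):
    # Closed-form proximal operator of t * ||.||_1.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_descent(A, b, lam, x0, sigma0=1.0, rho=0.1, max_iter=2000, tol=1e-9):
    # Minimize 0.5*||Ax - b||^2 + lam*||x||_1 with a proximal step whose
    # regularization parameter sigma is raised until a sufficient-decrease
    # test holds, then relaxed again after each accepted step.
    f = lambda x: 0.5 * np.sum((A @ x - b) ** 2)
    F = lambda x: f(x) + lam * np.sum(np.abs(x))
    x, sigma = x0.astype(float).copy(), sigma0
    for _ in range(max_iter):
        g = A.T @ (A @ x - b)
        while True:
            x_new = soft_threshold(x - g / sigma, lam / sigma)
            step = x_new - x
            # Sufficient decrease in the composite objective F.
            if F(x_new) <= F(x) - 0.5 * rho * sigma * (step @ step):
                break
            sigma *= 2.0  # step rejected: tighten the regularization
        x = x_new
        if np.sqrt(step @ step) < tol:
            break
        sigma = max(sigma / 2.0, 1e-12)  # step accepted: relax again
    return x
```

The inner loop mirrors the sufficient-decrease globalization: no global Lipschitz constant is needed, since sigma is doubled only when the decrease test fails.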

Core claim

We introduce a proximal limited-memory quasi-Newton scheme for minimizing the sum of a continuously differentiable function and a proper, lower semicontinuous and prox-bounded, possibly nonsmooth, function. Both functions might be nonconvex. The method builds upon the computation of scaled proximal operators and is globalized by adaptively updating a regularization parameter based on a criterion of sufficient decrease. We prove global convergence under mild assumptions and then establish convergence of the entire sequence (with rates) under the Kurdyka-Łojasiewicz property. To efficiently solve the subproblems, we exploit the compact representation of limited-memory quasi-Newton updates. We also derive a compact representation of the limited-memory Kleinmichel formula, a rank-one quasi-Newton scheme that preserves positive definiteness under the same condition as the BFGS update.

What carries the argument

The proximal limited-memory quasi-Newton update that uses a compact representation to approximate the inverse Hessian of the smooth part and compute the scaled proximal operator at each iteration.

If this is right

  • Global convergence to stationary points holds under the stated mild assumptions on the smooth and nonsmooth parts.
  • Convergence of the entire sequence together with explicit rates follows when the objective satisfies the Kurdyka-Lojasiewicz property.
  • Subproblems remain tractable because the limited-memory updates admit a compact representation that avoids storing full matrices.
  • A rank-one limited-memory Kleinmichel formula preserves positive definiteness under the same condition required for BFGS.
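The compact-representation claim in the bullets above can be checked numerically. Below is a small sketch of the classical Byrd-Nocedal-Schnabel compact form for BFGS (reference [11] in the paper's bibliography); the paper's Kleinmichel variant is analogous but not reproduced here. Function names are ours.

```python
import numpy as np

def bfgs_recursive(S, Y, gamma):
    # Reference implementation: apply m dense BFGS updates to B0 = gamma*I,
    # one secant pair (S[:, j], Y[:, j]) at a time.
    n, m = S.shape
    B = gamma * np.eye(n)
    for j in range(m):
        s, y = S[:, j], Y[:, j]
        Bs = B @ s
        B = B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)
    return B

def bfgs_compact(S, Y, gamma):
    # Byrd-Nocedal-Schnabel compact form of the same m updates:
    #   B = gamma*I - [gamma*S  Y] M^{-1} [gamma*S  Y]^T,
    #   M = [[gamma*S^T S, L], [L^T, -D]],
    # with L the strictly lower triangle of S^T Y and D its diagonal.
    # Only an n x 2m factor and a 2m x 2m middle matrix are stored,
    # which is what avoids forming full n x n matrices.
    n, m = S.shape
    StY = S.T @ Y
    L = np.tril(StY, k=-1)
    D = np.diag(np.diag(StY))
    Psi = np.hstack([gamma * S, Y])
    M = np.block([[gamma * (S.T @ S), L], [L.T, -D]])
    return gamma * np.eye(n) - Psi @ np.linalg.solve(M, Psi.T)
```

Both routines produce the same matrix whenever every pair has positive curvature (s_j^T y_j > 0), the same condition under which BFGS, and per the paper the Kleinmichel update, preserves positive definiteness.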

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The compact representation may extend the method to problem dimensions where full-matrix storage is impossible.
  • The observed numerical speedups suggest the scheme could replace slower first-order proximal methods on large-scale instances.
  • Similar limited-memory proximal ideas might improve other nonconvex algorithms that rely on proximal mappings.

Load-bearing premise

The nonsmooth function is proper, lower semicontinuous and prox-bounded, the smooth part is continuously differentiable, and the adaptive regularization guarantees sufficient decrease at every step.
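Prox-boundedness is what keeps the scaled proximal subproblems well-posed even though the nonsmooth part may be nonconvex. A brute-force sketch with the capped-ℓ1 regularizer used in the paper's experiments (grid search stands in for the closed-form prox; parameter names and defaults are ours):

```python
import numpy as np

def prox_capped_l1(v, lam=1.0, theta=1.0, sigma=1.0):
    # Brute-force prox: argmin_x lam*min(|x|, theta) + (sigma/2)*(x - v)^2.
    # Capped-l1 is nonconvex but bounded below, hence prox-bounded, so the
    # subproblem has a minimizer for every sigma > 0.
    xs = np.linspace(-8.0, 8.0, 16001)
    vals = lam * np.minimum(np.abs(xs), theta) + 0.5 * sigma * (xs - v) ** 2
    return xs[np.argmin(vals)]
```

Far from the origin the cap makes the penalty flat, so the prox leaves the point essentially unchanged; near the origin it behaves like soft thresholding. Both regimes fall out of the grid search without any convexity assumption.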

What would settle it

A concrete example or counterexample in which the iterates satisfy the sufficient-decrease condition at every iteration yet do not approach a stationary point, or in which the full sequence fails to converge when the objective satisfies the Kurdyka-Lojasiewicz property.

Figures

Figures reproduced from arXiv: 2605.11627 by Alberto De Marchi, Christian Kanzow, Simeon vom Dahl.

Figure 1. Comparison on regularized logistic regression problems, in terms of relative runtimes. Accuracy level: low (left) and high (right). Regularization: ℓ1 (top) and capped-ℓ1 (bottom).

Figure 2. Comparison of solver variants with monotone and nonmonotone globalization, in terms of relative runtimes. Results aggregated for all logistic regression problems. Accuracy level: low (left) and high (right). Sample size: 120 problems for each accuracy level.

Figure 3. Comparison on regularized Student's t-regression problems, in terms of relative runtimes. Accuracy level: low (left) and high (right). Regularization: ℓ1 (top) and capped-ℓ1 (bottom).

Figure 4. Comparison on discretized obstacle problems, in terms of relative runtimes. Accuracy level: low (left) and high (right).
read the original abstract

We introduce a proximal limited-memory quasi-Newton scheme for minimizing the sum of a continuously differentiable function and a proper, lower semicontinuous and prox-bounded, possibly nonsmooth, function. Both functions might be nonconvex. The method builds upon the computation of scaled proximal operators and is globalized by adaptively updating a regularization parameter based on a criterion of sufficient decrease. We prove global convergence under mild assumptions and then establish convergence of the entire sequence (with rates) under the Kurdyka-Łojasiewicz property. To efficiently solve the subproblems, we exploit the compact representation of limited-memory quasi-Newton updates. We also derive a compact representation of the limited-memory Kleinmichel formula, a rank-one quasi-Newton scheme that preserves positive definiteness under the same condition as the BFGS update. Numerical results show a significant speed-up compared to other methods.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces a proximal limited-memory quasi-Newton algorithm for minimizing f(x) + g(x) where f is continuously differentiable (possibly nonconvex) and g is proper, lower semicontinuous, and prox-bounded (possibly nonsmooth and nonconvex). The method employs scaled proximal operators with limited-memory quasi-Newton updates, globalized by an adaptive regularization parameter enforcing sufficient decrease. It claims global convergence to stationary points under mild assumptions, followed by convergence of the full sequence (with rates) under the Kurdyka-Łojasiewicz property. Compact representations of the limited-memory updates are derived, including a new compact form of the Kleinmichel formula that preserves positive definiteness under the same conditions as BFGS. Numerical experiments report significant speedups relative to other methods.

Significance. If the convergence results hold, the work supplies a practical, theoretically supported extension of limited-memory quasi-Newton methods to nonsmooth nonconvex problems, addressing a gap where such approximations are often needed for scalability. The compact representations and Kleinmichel variant are concrete contributions that could aid implementation, and the reported numerical gains indicate potential practical impact in large-scale optimization.

major comments (2)
  1. [convergence analysis] The global convergence argument (abstract and convergence section) rests on the adaptive regularization ensuring sufficient decrease at each step while using the limited-memory quasi-Newton direction; however, the manuscript must explicitly verify that the compact representation preserves the descent property required for the decrease condition, as the curvature approximation could otherwise violate the assumptions used in the proximal-operator analysis.
  2. [KL convergence theorem] The claim of full-sequence convergence with rates under the KL property (abstract) depends on the subproblem solver and the properties of the scaled proximal operator; the interaction between the limited-memory update and the KL inequality needs a self-contained argument showing that the sequence remains in the region where the KL exponent applies, without additional hidden restrictions on the prox-boundedness constant.
minor comments (2)
  1. [abstract] The abstract states that both functions 'might be nonconvex' but the assumptions list only C1 for the smooth part; clarify whether the nonconvexity of f is fully covered by the stated hypotheses or requires an additional local Lipschitz gradient assumption.
  2. [algorithm description] Notation for the limited-memory matrices (e.g., the compact form of the Kleinmichel update) should be introduced with an explicit reference to the standard L-BFGS two-loop recursion to aid readers unfamiliar with the rank-one variant.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading of our manuscript and the constructive comments. We are pleased that the referee recognizes the significance of extending limited-memory quasi-Newton methods to nonsmooth nonconvex problems. We address the major comments below and will revise the manuscript accordingly to incorporate the suggested clarifications.

read point-by-point responses
  1. Referee: [convergence analysis] The global convergence argument (abstract and convergence section) rests on the adaptive regularization ensuring sufficient decrease at each step while using the limited-memory quasi-Newton direction; however, the manuscript must explicitly verify that the compact representation preserves the descent property required for the decrease condition, as the curvature approximation could otherwise violate the assumptions used in the proximal-operator analysis.

    Authors: We agree that an explicit verification is beneficial for clarity. In the revised manuscript, we will add a short lemma (or remark) in the global convergence section demonstrating that the compact representations of the limited-memory updates (including the new Kleinmichel form) are mathematically equivalent to the standard updates and therefore preserve the curvature conditions, positive definiteness, and descent property required for the sufficient decrease condition. This equivalence follows directly from the derivations in Section 3, and the added lemma will make the connection to the proximal-operator analysis explicit without altering the existing proof structure. revision: yes

  2. Referee: [KL convergence theorem] The claim of full-sequence convergence with rates under the KL property (abstract) depends on the subproblem solver and the properties of the scaled proximal operator; the interaction between the limited-memory update and the KL inequality needs a self-contained argument showing that the sequence remains in the region where the KL exponent applies, without additional hidden restrictions on the prox-boundedness constant.

    Authors: We thank the referee for this observation. In the revised version, we will expand the KL convergence section with a self-contained argument that explicitly shows the sequence remains in the relevant neighborhood for the KL inequality. This will use the adaptive regularization and the prox-boundedness assumption to bound the iterates, confirming that the KL exponent applies uniformly without imposing any new or hidden restrictions on the prox-boundedness constant. The argument will also tie the subproblem solver properties back to the scaled proximal operator to support the full-sequence convergence and rates as stated. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The derivation chain consists of a standard proximal quasi-Newton framework globalized via adaptive regularization to enforce sufficient decrease, followed by a separate KL-property argument for full-sequence convergence and rates. These steps rely on established proximal-operator properties, lower semicontinuity, and prox-boundedness assumptions that are stated independently of the algorithm's limited-memory representation or Kleinmichel formula. The compact representations are introduced solely for subproblem efficiency and do not enter the convergence proofs. No equation or claim reduces to a fitted parameter, self-defined quantity, or load-bearing self-citation whose validity depends on the present paper.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Based solely on the abstract, the central claim rests on standard nonsmooth optimization assumptions and the Kurdyka-Lojasiewicz property; no explicit free parameters or invented entities are identifiable.

axioms (2)
  • domain assumption The nonsmooth function is proper, lower semicontinuous, and prox-bounded
    Invoked to ensure the proximal operator is well-defined and the method is applicable.
  • domain assumption Kurdyka-Lojasiewicz property
    Used to obtain convergence rates for the entire sequence.
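For reference, the standard form of the KL inequality behind the second axiom (as in Attouch et al., reference [3]): a function F satisfies the KL property at a point x̄ if there exist η > 0, a neighborhood U of x̄, and a concave desingularizing function φ ∈ C¹(0, η) with φ(0) = 0 and φ′ > 0 such that

```latex
\varphi'\bigl(F(x) - F(\bar{x})\bigr)\,
\operatorname{dist}\bigl(0, \partial F(x)\bigr) \ge 1
\quad \text{for all } x \in U \text{ with }
F(\bar{x}) < F(x) < F(\bar{x}) + \eta .
```

In the standard taxonomy, with φ(t) = c·t^{1−θ} the KL exponent θ ∈ [0, 1) governs the rates: finite-length behavior for θ = 0, linear rates for θ ∈ (0, 1/2], and sublinear rates for θ ∈ (1/2, 1).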

pith-pipeline@v0.9.0 · 5450 in / 1328 out tokens · 38575 ms · 2026-05-13T01:25:17.566912+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Reference graph

Works this paper leans on

43 extracted references · 43 canonical work pages

  1. F. J. Aragón Artacho, R. M. T. Fleming, and P. T. Vuong. Accelerating the DC algorithm for smooth functions. Mathematical Programming, 169(1):95–118, 2018. doi:10.1007/s10107-017-1180-1
  2. A. Y. Aravkin, R. Baraldi, and D. Orban. A proximal quasi-Newton trust-region method for nonsmooth regularized optimization. SIAM Journal on Optimization, 32(2):900–929, 2022. doi:10.1137/21M1409536
  3. H. Attouch, J. Bolte, P. Redont, and A. Soubeyran. Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the Kurdyka-Łojasiewicz inequality. Mathematics of Operations Research, 35:438–457, 2010. doi:10.1287/moor.1100.0449
  4. S. Becker and J. Fadili. A quasi-Newton proximal splitting method. In Advances in Neural Information Processing Systems, volume 25, 2012
  5. S. Becker, J. Fadili, and P. Ochs. On quasi-Newton forward-backward splitting: Proximal calculus and convergence. SIAM Journal on Optimization, 29(4):2445–2481, 2019. doi:10.1137/18M1167152
  6. W. Bian and X. Chen. Linearly constrained non-Lipschitz optimization for image restoration. SIAM Journal on Imaging Sciences, 8(4):2294–2322, 2015. doi:10.1137/140985639
  7. E. G. Birgin, J. M. Martínez, and M. Raydan. Nonmonotone spectral projected gradient methods on convex sets. SIAM Journal on Optimization, 10(4):1196–1211, 2000. doi:10.1137/S1052623497330963
  8. E. G. Birgin, J. M. Martínez, and M. Raydan. Spectral projected gradient methods: Review and perspectives. Journal of Statistical Software, 60(3):1–21, 2014. doi:10.18637/jss.v060.i03
  9. J. Bolte, A. Daniilidis, A. Lewis, and M. Shiota. Clarke subgradients of stratifiable functions. SIAM Journal on Optimization, 18:556–572, 2007. doi:10.1137/060670080
  10. J. Bolte, T. Nguyen, J. Peypouquet, and B. Suter. From error bounds to the complexity of first-order descent methods for convex functions. Mathematical Programming, 165, 2016. doi:10.1007/s10107-016-1091-6
  11. R. H. Byrd, J. Nocedal, and R. B. Schnabel. Representations of quasi-Newton matrices and their use in limited memory methods. Mathematical Programming, 63(1):129–156, 1994. doi:10.1007/BF01582063
  12. A. De Marchi. Proximal gradient methods beyond monotony. Journal of Nonsmooth Analysis and Optimization, 4, 2023. doi:10.46298/jnsao-2023-10290
  13. A. De Marchi and A. Themelis. Proximal gradient algorithms under local Lipschitz gradient continuity. Journal of Optimization Theory and Applications, 194(3):771–794, 2022. doi:10.1007/s10957-022-02048-5
  14. Y. Diouane, M. L. Habiboullah, and D. Orban. A proximal modified quasi-Newton method for nonsmooth regularized optimization. SIAM Journal on Optimization, 36(2):534–563, 2026. doi:10.1137/24M169761X
  15. M. Gollier, M. L. Habiboullah, G. Leconte, R. Baraldi, A. De Marchi, D. Orban, and Y. Diouane. RegularizedOptimization.jl: A Julia framework for regularized and nonsmooth optimization. Journal of Open Source Software, 11(118):9344, 2026. doi:10.21105/joss.09344
  16. L. Grippo, F. Lampariello, and S. Lucidi. A nonmonotone line search technique for Newton's method. SIAM Journal on Numerical Analysis, 23(4):707–716, 1986. doi:10.1137/0723046
  17. X. Jia, C. Kanzow, and P. Mehlitz. Convergence analysis of the proximal gradient method in the presence of the Kurdyka-Łojasiewicz property without global Lipschitz assumptions. SIAM Journal on Optimization, 33(4):3038–3056, 2023. doi:10.1137/23M1548293
  18. Q. Jin. Lectures on Nonsmooth Optimization. Springer, 2025. doi:10.1007/978-3-031-91417-1
  19. C. Kanzow and T. Lechner. Efficient regularized proximal quasi-Newton methods for large-scale nonconvex composite optimization problems. Pacific Journal of Optimization, 20:537–568, 2024. doi:10.61208/pjo-2023-036
  20. C. Kanzow and L. Lehmann. Convergence of nonmonotone proximal gradient methods under the Kurdyka-Łojasiewicz property without a global Lipschitz assumption. Journal of Optimization Theory and Applications, 207(1):4, 2025. doi:10.1007/s10957-025-02762-w
  21. C. Kanzow and P. Mehlitz. Convergence properties of monotone and nonmonotone proximal gradient methods revisited. Journal of Optimization Theory and Applications, 195:1–23, 2022. doi:10.1007/s10957-022-02101-3
  22. C. Kanzow and D. Steck. Regularization of limited memory quasi-Newton methods for large-scale nonconvex minimization. Mathematical Programming Computation, 15(3):417–444, 2023. doi:10.1007/s12532-023-00238-4
  23. H. Kleinmichel. Quasi-Newton-Verfahren vom Rang-Eins-Typ zur Lösung unrestringierter Minimierungsprobleme, Teil 1. Numerische Mathematik, 38(2):219–228, 1981. doi:10.1007/BF01397091
  24. H. Kleinmichel. Quasi-Newton-Verfahren vom Rang-Eins-Typ zur Lösung unrestringierter Minimierungsprobleme, Teil 2. Numerische Mathematik, 38(2):229–244, 1981. doi:10.1007/BF01397092
  25. K. Koh, S.-J. Kim, and S. Boyd. An interior-point method for large-scale l1-regularized logistic regression. Journal of Machine Learning Research, 8(54):1519–1555, 2007
  26. T. Lechner. Proximal Methods for Nonconvex Composite Optimization Problems. PhD Thesis, Institute of Mathematics, University of Würzburg, 2022. doi:10.25972/OPUS-28907
  27. J. Lee, Y. Sun, and M. Saunders. Proximal Newton-type methods for minimizing composite functions. SIAM Journal on Optimization, 24:1420–1443, 2014. doi:10.1137/130921428
  28. H. Li and Z. Lin. Accelerated proximal gradient methods for nonconvex programming. In Advances in Neural Information Processing Systems 28, volume 28, 2015
  29. D. Liu and J. Nocedal. On the limited memory BFGS method for large scale optimization. Mathematical Programming, 45:503–528, 1989. doi:10.1007/BF01589116
  30. R. Liu, S. Pan, Y. Wu, and X. Yang. An inexact regularized proximal Newton method for nonconvex and nonsmooth optimization. Computational Optimization and Applications, 88(2):603–641, 2024. doi:10.1007/s10589-024-00560-0
  31. B. Mordukhovich. Variational Analysis and Applications, volume 30. Springer, 2018. doi:10.1007/978-3-319-92775-6
  32. J. Nocedal. Updating quasi-Newton matrices with limited storage. Mathematics of Computation, 35(151):773–782, 1980. doi:10.1090/S0025-5718-1980-0572855-7
  33. R. Rockafellar and R. J.-B. Wets. Variational Analysis. Springer-Verlag, Heidelberg, Berlin, New York, 1998
  34. L. Rudin, S. Osher, and E. Fatemi. Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena, 60:259–268, 1992. doi:10.1016/0167-2789(92)90242-F
  35. K. Scheinberg and X. Tang. Practical inexact proximal quasi-Newton method with global complexity analysis. Mathematical Programming, 160(1–2):495–529, 2016. doi:10.1007/s10107-016-0997-3
  36. P. Spellucci. A modified rank one update which converges Q-superlinearly. Computational Optimization and Applications, 19:273–296, 2001. doi:10.1023/A:1011259905470
  37. L. Stella, A. Themelis, P. Sopasakis, and P. Patrinos. A simple and efficient algorithm for nonlinear model predictive control. In 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pages 1939–1944, 2017. doi:10.1109/CDC.2017.8263933
  38. A. Themelis, L. Stella, and P. Patrinos. Forward-backward envelope for the sum of two nonconvex functions: Further properties and nonmonotone linesearch algorithms. SIAM Journal on Optimization, 28(3):2274–2303, 2018. doi:10.1137/16M1080240
  39. R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1):267–288, 1996. doi:10.1111/j.2517-6161.1996.tb02080.x
  40. K. Ueda and N. Yamashita. A regularized Newton method without line search for unconstrained optimization. Computational Optimization and Applications, 59(1):321–351, 2014. doi:10.1007/s10589-014-9656-x
  41. S. vom Dahl and C. Kanzow. An inexact regularized proximal Newton method without line search. Computational Optimization and Applications, 89(3):585–624, 2024. doi:10.1007/s10589-024-00600-9
  42. S. J. Wright, R. D. Nowak, and M. A. T. Figueiredo. Sparse reconstruction by separable approximation. IEEE Transactions on Signal Processing, 57(7):2479–2493, 2009. doi:10.1109/TSP.2009.2016892
  43. H. Zhang and W. W. Hager. A nonmonotone line search technique and its application to unconstrained optimization. SIAM Journal on Optimization, 14(4):1043–1056, 2004. doi:10.1137/S1052623403428208