pith. machine review for the scientific record.

arxiv: 2605.10219 · v1 · submitted 2026-05-11 · 🧮 math.OC · cs.CC · cs.LG

Recognition: 2 Lean theorem links

Parameterized Complexity of Stationarity Testing for Piecewise-Affine Functions and Shallow CNN Losses

Yuhan Ye

Pith reviewed 2026-05-12 03:27 UTC · model grok-4.3

classification 🧮 math.OC · cs.CC · cs.LG
keywords parameterized complexity · piecewise-affine functions · stationarity testing · W[1]-hardness · XP algorithms · ReLU CNN losses · nonsmooth optimization · Exponential Time Hypothesis

The pith

Testing approximate stationarity for continuous piecewise-affine functions is XP-tractable in fixed dimension for some problem variants but W[1]-hard for the others, with ETH lower bounds ruling out any runtime of the form ρ(d)·size^{o(d)}.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper analyzes the parameterized complexity of checking approximate first-order stationarity at a given point for continuous piecewise-affine functions, using the ambient dimension d as the parameter. It supplies XP algorithms for the tractable variants of the problem and proves W[1]-hardness for the remaining variants, together with Exponential Time Hypothesis (ETH) lower bounds showing that no algorithm running in time ρ(d)·size^{o(d)} exists for any computable function ρ. The same classification applies to testing local minimality of such functions and carries over to stationarity testing inside the weight space of shallow ReLU convolutional neural network losses. A reader would care because piecewise-affine functions serve as a standard model for the local geometry of ReLU-type training objectives, so the results clarify when basic first-order checks remain computationally feasible as dimension grows.
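To make the tested condition concrete: when a PA function is presented in max-of-affine form f(x) = max_i (a_i·x + b_i), the Clarke subdifferential at x is the convex hull of the gradients a_i of the near-active pieces, so approximate stationarity reduces to a small linear program asking how close that hull comes to the origin. The sketch below is a minimal illustration under those assumptions, not the paper's algorithm (the paper treats general PA representations and several stationarity notions); the function name and the tolerances delta and eps are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

def approx_stationary(A, b, x, delta=1e-6, eps=1e-6):
    """Approximate Clarke stationarity for f(x) = max_i (A[i] @ x + b[i]).

    The Clarke subdifferential at x is conv{A[i] : piece i is delta-active};
    x is eps-stationary iff some convex combination of active gradients has
    sup-norm at most eps, which is a linear program in the weights."""
    vals = A @ x + b
    G = A[vals >= vals.max() - delta]             # gradients of active pieces
    m, d = G.shape
    c = np.r_[np.zeros(m), 1.0]                   # variables (lambda, t); min t
    A_ub = np.block([[G.T, -np.ones((d, 1))],     #  (G.T @ lam) - t <= 0
                     [-G.T, -np.ones((d, 1))]])   # -(G.T @ lam) - t <= 0
    A_eq = np.r_[np.ones(m), 0.0].reshape(1, -1)  # weights sum to 1
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * d),
                  A_eq=A_eq, b_eq=[1.0], bounds=[(0, None)] * (m + 1))
    return res.fun <= eps                         # dist(0, conv G) <= eps

# f(x) = max(x1, -x1, x2, -x2) = ||x||_inf: stationary only at the origin.
A = np.array([[1., 0.], [-1., 0.], [0., 1.], [0., -1.]])
print(approx_stationary(A, np.zeros(4), np.zeros(2)))         # True
print(approx_stationary(A, np.zeros(4), np.array([1., 0.])))  # False
```

The LP view also hints at where dimension enters: the number of pieces that can be simultaneously active, and hence the arrangement structure around x, grows with d.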

Core claim

The central claim is that the parameterized complexity of approximate stationarity testing for continuous piecewise-affine functions, with dimension d as the parameter, splits cleanly: XP algorithms on the tractable sides, W[1]-hardness on the complementary sides, and ETH-based lower bounds ruling out any runtime ρ(d)·size^{o(d)} for computable ρ. These results immediately yield the corresponding parameterized picture for testing local minimality of continuous PA functions and extend directly to stationarity testing for a family of shallow ReLU CNN training losses when stationarity is checked in the trainable weight space.
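Stated symbolically, and hedged to the level of detail the abstract provides (size is the total binary encoding length of the instance, d the ambient dimension, and "hard variants" are the ones the paper designates as such), the two halves of the claim read:

```latex
% Paraphrase of the abstract's classification; not a verbatim theorem statement.
\[
  \textsf{ETH} \;\Longrightarrow\;
  \text{no algorithm decides the hard variants in time }
  \rho(d)\cdot\mathrm{size}^{\,o(d)}
  \text{ for any computable } \rho,
\]
\[
  \text{while the tractable variants admit XP algorithms with runtime }
  \mathrm{size}^{\,f(d)} .
\]
```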

What carries the argument

The choice of the ambient dimension d as the parameter: it simultaneously drives the XP algorithms and the dimension-preserving W[1]-hardness reductions, both for stationarity testing of continuous piecewise-affine functions and for shallow ReLU CNN losses.

If this is right

  • XP algorithms exist for the tractable sides of stationarity testing when dimension is fixed.
  • W[1]-hardness holds for the complementary sides, so no FPT algorithm is expected unless FPT equals W[1].
  • Exponential Time Hypothesis lower bounds rule out algorithms running in time ρ(d)·size^{o(d)}, i.e., whose exponent on the input size is sublinear in the dimension.
  • The identical complexity classification holds for testing local minimality of continuous piecewise-affine functions.
  • The same XP-versus-W[1]-hardness picture applies to stationarity testing for shallow ReLU CNN losses in the weight space.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Practical stationarity checks for neural-network losses may remain feasible only when dimension stays small or when additional structure beyond the piecewise-affine model is exploited.
  • The hardness results suggest that exact first-order verification inside even simple CNN training objectives is unlikely to scale with dimension without further restrictions.
  • Similar parameterized analyses could be applied to other nonsmooth optimization primitives that share the same polyhedral local geometry.
  • Average-case or smoothed versions of the stationarity-testing problem might admit better parameterized runtimes than the worst-case picture obtained here.

Load-bearing premise

That continuous piecewise-affine functions adequately capture the local polyhedral geometry of ReLU-type losses and that the chosen notion of approximate first-order stationarity is the right one for the parameterized analysis.
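That premise is easy to probe numerically in at least one regime. The sketch below is a toy check rather than anything from the paper: it fixes the outer weights of a tiny one-hidden-layer ReLU model and uses an ℓ1 training loss, under which the loss is genuinely piecewise affine in the inner weights; restricted to a random line in weight space, its discrete second differences then vanish away from finitely many breakpoints. All names and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny shallow ReLU model: loss(w) = sum_n |c @ relu(W @ x_n) - t_n|,
# with outer weights c frozen.  Under these assumptions the loss is
# piecewise affine in the flattened inner weights W.
X = rng.normal(size=(8, 3))           # 8 inputs in R^3
t = rng.normal(size=8)                # targets
c = rng.normal(size=4)                # frozen outer weights, 4 hidden units

def loss(w_flat):
    W = w_flat.reshape(4, 3)
    return np.abs(np.maximum(X @ W.T, 0.0) @ c - t).sum()

# Restrict to a line w0 + s * u and inspect discrete second differences:
# for a piecewise-affine function they are ~0 except near breakpoints.
w0, u = rng.normal(size=12), rng.normal(size=12)
s = np.linspace(-2, 2, 2001)
vals = np.array([loss(w0 + si * u) for si in s])
second_diff = np.abs(np.diff(vals, 2))
print("grid points with nonzero curvature:", (second_diff > 1e-8).sum())
# A few dozen hits (the breakpoints) out of ~2000; the rest are zero.
```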

What would settle it

Discovery of an algorithm solving the stationarity-testing problem in time f(d)·size^{o(d)} for some computable f would refute the stated ETH lower bounds, and an FPT algorithm, running in time f(d)·poly(size), for any of the W[1]-hard cases would collapse FPT and W[1].

Figures

Figures reproduced from arXiv: 2605.10219 by Yuhan Ye.

Figure 1. Fréchet stationarity is controlled by first-order local lower supports of the exact local …
Original abstract

We study the parameterized complexity of testing approximate first-order stationarity at a prescribed point for continuous piecewise-affine (PA) functions, a basic task in nonsmooth optimization. PA functions form a canonical model for nonsmooth stationarity testing and capture the local polyhedral geometry that appears in ReLU-type training losses. Recent work by Tian and So (SODA 2025) shows that testing approximate stationarity notions for PA functions is computationally intractable in the worst case, and identifies fixed-dimensional tractability as an open direction. We address this direction from the viewpoint of parameterized complexity, with the ambient dimension $d$ as the parameter. In this paper, we give XP algorithms in fixed dimension for the tractable sides, and prove W[1]-hardness for the complementary sides. Moreover, lower bounds under the Exponential Time Hypothesis rule out algorithms running in time $\rho(d)\cdot\mathrm{size}^{o(d)}$ for any computable function $\rho$, where $\mathrm{size}$ denotes the total binary encoding length of the stationarity-testing instance. As a further consequence, our results yield the corresponding parameterized complexity picture for testing local minimality of continuous PA functions. We further extend our hardness results to a family of shallow ReLU CNN training losses, with stationarity tested in the trainable weight space. Thus, the same parameterized-complexity picture also appears for simple CNN training losses.

Editorial analysis

A structured set of objections, weighed in public.

Referee report, simulated author's rebuttal, circularity audit, and an axiom and free-parameter ledger. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript studies the parameterized complexity of testing approximate first-order stationarity at a given point for continuous piecewise-affine (PA) functions, taking ambient dimension d as the parameter. It establishes XP algorithms (n^{f(d)} time) for the tractable variants via dynamic programming over hyperplane arrangements, W[1]-hardness for the complementary variants via dimension-preserving reductions, and ETH-based lower bounds excluding any algorithm running in time ρ(d)·size^{o(d)} for computable ρ. The same complexity picture is shown to hold for testing local minimality of PA functions and is extended to stationarity testing (in weight space) for a family of shallow ReLU CNN training losses.
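For intuition on how dynamic programming over hyperplane arrangements can yield n^{f(d)}-type runtimes: n hyperplanes in R^d cut space into at most sum_{i=0}^{d} C(n, i) = O(n^d) full-dimensional cells (Buck/Zaslavsky), so per-cell work stays polynomial once d is fixed, yet the same count explodes when d grows with n. The snippet below only illustrates that counting bound; it is not the paper's dynamic program.

```python
from math import comb

def max_cells(n: int, d: int) -> int:
    """Maximum number of full-dimensional cells that n hyperplanes in
    general position cut out of R^d: sum_{i=0}^{d} C(n, i) = O(n^d)."""
    return sum(comb(n, i) for i in range(d + 1))

for d in (1, 2, 3):   # polynomial in n for fixed d: the XP enumeration budget
    print(f"d={d}:", [max_cells(n, d) for n in (2, 4, 8, 16)])
print("d=n:", [max_cells(n, n) for n in (2, 4, 8, 16)])   # 2^n once d ~ n
```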

Significance. If the technical claims hold, the work resolves the fixed-dimensional tractability question left open by Tian and So (SODA 2025) and supplies a complete parameterized-complexity classification for a canonical model of nonsmooth stationarity testing. The extension to shallow CNN losses demonstrates that the same hardness/tractability dichotomy appears in a concrete neural-network setting. The reliance on standard parameterized reductions and explicit dynamic-programming constructions over arrangements is a methodological strength.

minor comments (3)
  1. [Abstract / §1] The abstract and introduction refer to “the tractable sides” and “the complementary sides” without immediately naming the precise stationarity notions (e.g., which subdifferential or directional-derivative conditions) that fall on each side; a short clarifying sentence or table would improve readability.
  2. [§3 (or wherever the ETH statement appears)] In the ETH lower-bound statement, the quantity “size” is defined as the total binary encoding length of the instance; it would be helpful to state explicitly whether this includes the bit length of the rational coefficients of the PA pieces or only the combinatorial description.
  3. [§5] The extension to shallow ReLU CNN losses is described as an “embedding” of the weight-space stationarity test into an equivalent PA instance; a brief paragraph sketching the dimension and size blow-up of this reduction would make the claim easier to verify.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary of our manuscript, for highlighting its significance in resolving the open question from Tian and So (SODA 2025), and for recommending minor revision. We are pleased that the parameterized complexity classification for stationarity testing of PA functions and its extension to shallow ReLU CNN losses is viewed as a methodological strength.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper's core results consist of XP algorithms via dynamic programming on hyperplane arrangements for fixed dimension, W[1]-hardness via standard parameterized reductions, and ETH-based lower bounds excluding subexponential dependence on dimension. These follow directly from established parameterized complexity machinery and the prior intractability result of Tian and So (SODA 2025), with no self-definitional steps, fitted inputs renamed as predictions, or load-bearing self-citations. The CNN extension is a straightforward embedding reduction into an equivalent PA instance. The derivation chain is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claims rest on the standard mathematical definition of continuous piecewise-affine functions, the formal definitions of the XP and W[1] classes in parameterized complexity, and the Exponential Time Hypothesis as a widely accepted assumption for conditional lower bounds. No free parameters are fitted and no new entities are postulated.

axioms (1)
  • domain assumption Exponential Time Hypothesis (ETH)
    Invoked to obtain the fine-grained lower bounds that rule out ρ(d)·size^{o(d)} algorithms.

pith-pipeline@v0.9.0 · 5546 in / 1455 out tokens · 58703 ms · 2026-05-12T03:27:42.489870+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Reference graph

Works this paper leans on

53 extracted references · 53 canonical work pages

  1. Blaine Rister and Daniel L. Rubin. Neural Networks.
  2. M. K. Kozlov, S. P. Tarasov, and L. G. Khachiyan.
  3. Moritz Stargalla, Christoph Hertrich, and Daniel Reichman. Advances in Neural Information Processing Systems.
  4. Hanyang Li and Ying Cui. 2025.
  5. An Online Adaptive Sampling Algorithm for Stochastic Difference-of-convex Optimization with Time-varying Distributions. Proceedings of the 42nd International Conference on Machine Learning, 2025.
  6. Frank H. Clarke.
  7. R. Tyrrell Rockafellar and Roger J.-B. Wets.
  8. Boris S. Mordukhovich.
  9. Stefan Scholtes.
  10. Sergei Ovchinnikov. Beitr.
  11. Katta G. Murty and Santosh N. Kabadi. Mathematical Programming.
  12. Yurii Nesterov. Mathematical Programming.
  13. Amir Ali Ahmadi and Jeffrey Zhang. Advances in Mathematics.
  14. Amir Ali Ahmadi and Jeffrey Zhang. Mathematical Programming.
  15. Jiajin Li, Anthony Man-Cho So, and Wing-Kin Ma.
  16. Damek Davis and Dmitriy Drusvyatskiy.
  17. Damek Davis, Dmitriy Drusvyatskiy, Sham M. Kakade, and Jason D. Lee. Foundations of Computational Mathematics.
  18. Jingzhao Zhang, Hongzhou Lin, Stefanie Jegelka, Suvrit Sra, and Ali Jadbabaie. Proceedings of the 37th International Conference on Machine Learning.
  19. Lai Tian, Kaiwen Zhou, and Anthony Man-Cho So. Proceedings of the 39th International Conference on Machine Learning.
  20. Guy Kornowski and Ohad Shamir. Advances in Neural Information Processing Systems.
  21. Michael Jordan, Guy Kornowski, Tianyi Lin, Ohad Shamir, and Manolis Zampetakis. Proceedings of the 36th Conference on Learning Theory.
  22. Lai Tian and Anthony Man-Cho So. Mathematical Programming.
  23. Lai Tian and Anthony Man-Cho So. 2023.
  24. Lai Tian and Anthony Man-Cho So. Proceedings of the 2025 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA).
  25. Marek Cygan, Fedor V. Fomin, et al. Parameterized Algorithms.
  26. Raman Arora, Amitabh Basu, Poorya Mianjy, and Anirbit Mukherjee. International Conference on Learning Representations.
  27. Guido F. Montúfar et al. On the Number of Linear Regions of Deep Neural Networks.
  28. Vincent Froese, Christoph Hertrich, and Rolf Niedermeier. Journal of Artificial Intelligence Research.
  29. Herbert Edelsbrunner, Joseph O'Rourke, and Raimund Seidel. SIAM Journal on Computing.
  30. Vincent Froese and Christoph Hertrich. Advances in Neural Information Processing Systems.
  31. Vincent Froese, Moritz Grillo, and Martin Skutella. Proceedings of the 38th Conference on Learning Theory.
  32. Vincent Froese, Moritz Grillo, Christoph Hertrich, and Martin Skutella. Proceedings of the 38th Conference on Learning Theory.
  33. Vincent Froese, Moritz Grillo, Christoph Hertrich, and Moritz Stargalla. International Conference on Learning Representations.
  34. Adrian Kulmburg and Matthias Althoff. European Journal of Control.
  35. Hans L. Bodlaender, Peter Gritzmann, Victor Klee, and Jan van Leeuwen. Combinatorica.
  36. Jean-Antoine Ferrez, Komei Fukuda, and Thomas M. Liebling. European Journal of Operational Research.
  37. Vladimir Shenmaier. Theoretical Computer Science.
  38. Sadra Sadraddini and Russ Tedrake. 2019.
  39. Volker Kaibel and Marc E. Pfetsch. Algebra, Geometry, and Software Systems.
  40. G. M. Ziegler. Lectures on Polytopes.
  41. Thomas Zaslavsky. Memoirs of the American Mathematical Society.
  42. Komei Fukuda. 2020.
  43. Guy Katz, Clark Barrett, David L. Dill, Kyle Julian, and Mykel J. Kochenderfer. Computer Aided Verification.
  44. Guy Katz, Derek A. Huang, Duligur Ibeling, Kyle Julian, Christopher Lazarus, Rachel Lim, Parth Shah, Shantanu Thakoor, Haoze Wu, Aleksandar Zeljic, David L. Dill, Mykel J. Kochenderfer, and Clark Barrett. Computer Aided Verification.
  45. Aladin Virmaux and Kevin Scaman. Advances in Neural Information Processing Systems.
  46. Matt Jordan and Alexandros G. Dimakis. Advances in Neural Information Processing Systems.
  47. Christoph Hertrich, Amitabh Basu, Marco Di Summa, and Martin Skutella. Advances in Neural Information Processing Systems.
  48. Christoph Hertrich and Georg Loho. arXiv preprint arXiv:2411.03006.
  49. Chulhee Yun, Suvrit Sra, and Ali Jadbabaie. International Conference on Learning Representations.
  50. Ying Cui and Jong-Shi Pang.
  51. Hanyang Li and Ying Cui. arXiv preprint arXiv:2505.07143.
  52. Martin Grötschel et al. Geometric Algorithms and Combinatorial Optimization.
  53. Yurii Nesterov and Arkadii Nemirovskii.