pith. machine review for the scientific record.

arxiv: 2605.04722 · v1 · submitted 2026-05-06 · 💻 cs.LG · cs.AI · math.OC

Recognition: 3 Lean theorem links

Exact Dual Geometry of SOC-ICNN Value Functions

Authors on Pith: no claims yet

Pith reviewed 2026-05-08 18:34 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · math.OC
keywords input convex neural networks · second-order cone programs · dual geometry · subdifferentials · Hessian recovery · white-box inference · SOC-ICNN

The pith

Optimal dual variables of the SOCP representing an SOC-ICNN value function recover its supporting slopes, subdifferentials, directional derivatives, and local Hessians exactly.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that second-order-cone input convex neural networks admit an exact representation as the optimal values of second-order cone programs. Because of this representation, the first-order geometry (supporting slopes and subdifferentials) and local second-order geometry (directional derivatives and Hessians) of the network can be read out exactly from the optimal dual variables of the associated SOCP. This dual readout supplies the geometric primitives needed for white-box inference that does not rely on automatic differentiation. The authors supply a step-by-step tutorial that assembles these readouts into a complete inference procedure and present numerical checks validating the multiplier readout, the local Hessian formula, and the set-valued behavior of the readout at structurally degenerate inputs.
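To make the readout concrete, here is a minimal sketch (our toy construction, not the paper's code; the paper's SOC-ICNN programs are richer) of the principle on a parametric SOCP: a gradient of the value function with respect to the input is read off the optimal dual multiplier of the input-dependent constraint and checked against finite differences.

```python
# Minimal sketch (not the paper's code): dual readout of dV/dx for the toy
# parametric SOCP  V(x) = min_z ||z - 2||^2  s.t.  ||z||_2 <= 1 + 0.1 d^T x.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
n, p = 4, 2
d = rng.standard_normal(p)            # direction through which x enters the SOCP

x = cp.Parameter(p)
z = cp.Variable(n)
soc = cp.norm(z, 2) <= 1 + 0.1 * (d @ x)          # x appears only here
prob = cp.Problem(cp.Minimize(cp.sum_squares(z - 2)), [soc])

def V(xv):
    x.value = xv
    prob.solve()
    return prob.value

x0 = np.array([0.3, -0.7])
V(x0)
# CVXPY convention: for lhs <= rhs the dual lam >= 0 enters the Lagrangian as
# lam * (lhs - rhs), so dV/dx = dL/dx = -0.1 * lam * d at the optimum.
grad_dual = -0.1 * soc.dual_value * d

# central finite-difference check of the dual readout
h = 1e-4
grad_fd = np.array([(V(x0 + h * e) - V(x0 - h * e)) / (2 * h) for e in np.eye(p)])
print(np.allclose(grad_dual, grad_fd, atol=1e-3))  # True at default tolerances
```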

Core claim

SOC-ICNNs admit an exact SOCP value-function representation whose dual variables directly supply the supporting hyperplanes, subdifferential sets, directional derivatives, and local Hessian matrices of the network.

What carries the argument

The exact dual of the SOCP whose value function equals the SOC-ICNN; the KKT conditions link the dual multipliers to the network's geometric quantities.
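Schematically (our notation; the paper's statements for the concrete SOC-ICNN modules are finer-grained), the sensitivity fact doing the work is the following: for a conic value function with Lagrangian L(z, λ; x) = f(z) - ⟨λ, G(z) + Bx⟩ and λ constrained to the dual cone, strong duality and dual attainment give

```latex
V(x) = \min_{z}\,\bigl\{\, f(z) \;:\; G(z) + Bx \in \mathcal{K} \,\bigr\},
\qquad
\partial V(x) = \bigl\{\, -B^{\top}\lambda^{\star} \;:\; \lambda^{\star} \in \Lambda(x) \,\bigr\},
\qquad
V'(x;h) = \max_{\lambda^{\star} \in \Lambda(x)} \bigl\langle -B^{\top}\lambda^{\star},\, h \bigr\rangle,
```

where Λ(x) denotes the set of optimal dual multipliers. Local Hessians then come from differentiating the active KKT system at inputs where the optimal multiplier is unique and locally smooth; making these objects explicit for the SOC-ICNN modules is the paper's contribution.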

Load-bearing premise

The SOC-ICNN must admit an exact representation as the value function of an SOCP whose dual is attained and whose KKT conditions yield the claimed geometric quantities without additional regularity assumptions on the network weights or input.

What would settle it

At a non-degenerate input, compute the local Hessian via the dual formula and compare it with the Hessian obtained by finite differences; any mismatch would falsify the exact-recovery claim.
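A hedged sketch of that test (the callables grad_V and H_dual are hypothetical stand-ins for the paper's dual-readout formulas, not names from its code):

```python
# Sketch of the falsification test: at a non-degenerate input x0, compare a
# candidate dual-formula Hessian H_dual(x0) against a central finite-difference
# Hessian assembled from the gradient map grad_V (both hypothetical stand-ins).
import numpy as np

def fd_hessian(grad_V, x, h=1e-5):
    """Central finite differences of a gradient map, symmetrized."""
    n = x.size
    H = np.empty((n, n))
    for i in range(n):
        e = np.zeros(n); e[i] = h
        H[:, i] = (grad_V(x + e) - grad_V(x - e)) / (2.0 * h)
    return 0.5 * (H + H.T)

# usage: a persistent mismatch at a non-degenerate x0 falsifies exact recovery
# assert np.allclose(H_dual(x0), fd_hessian(grad_V, x0), atol=1e-4)
```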

read the original abstract

Input Convex Neural Networks (ICNNs) are commonly used in a two-stage manner: one first trains a convex network and then minimizes it over its input in a downstream inference problem. Recent second-order-cone ICNNs (SOC-ICNNs) enrich ReLU-based ICNNs with quadratic and conic modules and admit an exact representation as value functions of second-order cone programs (SOCPs). This value-function structure enables an explicit convex-analytic treatment of SOC-ICNN inference. In this paper, we study the exact first-order and local second-order geometry of SOC-ICNNs from the dual viewpoint. We show that supporting slopes, subdifferentials, directional derivatives, and local Hessians can be recovered directly from optimal dual variables. These results provide the geometric primitives for white-box SOC-ICNN inference, going beyond black-box automatic differentiation. Numerical experiments validate the exact multiplier readout, the local Hessian formula, and the set-valued behavior at structurally degenerate inputs. We also provide a step-by-step tutorial showing how the readout mechanism instantiates a complete white-box inference loop. The code is available at https://anonymous.4open.science/r/SOC-ICNN-Theory-BEFC/.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript establishes that second-order-cone input convex neural networks (SOC-ICNNs) admit an exact representation as value functions of second-order cone programs (SOCPs). From this representation it derives that supporting slopes, subdifferentials, directional derivatives, and local Hessians can be recovered directly from the optimal dual variables, yielding white-box geometric primitives for inference that go beyond automatic differentiation. Numerical experiments are presented to validate the multiplier readout, the local Hessian formula, and behavior at degenerate inputs, together with a step-by-step tutorial for the readout mechanism.

Significance. If the central derivations hold, the work supplies exact convex-analytic tools for SOC-ICNN inference and analysis, leveraging established SOCP duality to obtain parameter-free geometric quantities. The combination of theoretical readout formulas, numerical validation, and open code constitutes a concrete advance in the geometric treatment of convex neural networks.

major comments (2)
  1. [§3] §3 (main duality theorem): the claim that subdifferentials and local Hessians are recovered directly from optimal dual variables presupposes that strong duality holds and that the dual is attained for every input. The SOCP encoding of the quadratic and conic modules does not automatically guarantee Slater's condition or strict feasibility for arbitrary trained weights or structurally degenerate inputs; an explicit statement of the required regularity assumptions (or a proof that they are satisfied by construction) is missing. (A schematic statement of the Slater requirement is sketched after this list.)
  2. [§5] §5 (numerical validation): the experiments claim to confirm the exact multiplier readout and Hessian formula at degenerate inputs, yet no details are given on how the SOCP solver is configured when primal or dual feasibility margins approach zero, nor on the observed frequency of dual non-attainment across the tested weight distributions.
minor comments (2)
  1. [Tutorial section] The notation distinguishing the quadratic module from the conic module could be made more uniform across the tutorial and the main derivations.
  2. [Figures] Figure captions for the degenerate-input experiments should explicitly state the solver tolerance and the criterion used to declare structural degeneracy.
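For reference, the regularity at issue in major comment 1 is the standard conic Slater condition (textbook convex duality, in our schematic notation): if the primal value is finite and a strictly feasible point exists,

```latex
\exists\,\bar z \;:\; G(\bar z) + Bx \in \operatorname{int}\mathcal{K},
```

then strong duality holds and the dual optimum is attained; without such strict feasibility, the dual readout of subdifferentials and Hessians is not guaranteed.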

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback on our manuscript. We address each major comment below, indicating the revisions we will make to clarify assumptions and provide additional experimental details.

read point-by-point responses
  1. Referee: [§3] §3 (main duality theorem): the claim that subdifferentials and local Hessians are recovered directly from optimal dual variables presupposes that strong duality holds and that the dual is attained for every input. The SOCP encoding of the quadratic and conic modules does not automatically guarantee Slater's condition or strict feasibility for arbitrary trained weights or structurally degenerate inputs; an explicit statement of the required regularity assumptions (or a proof that they are satisfied by construction) is missing.

    Authors: We agree that the main duality theorem presupposes strong duality and dual attainment. The SOC-ICNN is always primal feasible by construction, as the network evaluation itself yields a feasible point for the SOCP, but strict feasibility (Slater's condition) is not guaranteed for arbitrary weights or inputs. We will revise §3 to explicitly state the regularity assumption that the SOCP satisfies Slater's condition for the inputs of interest, which ensures strong duality and attainment of the dual. We will also add a short discussion noting that this condition holds generically for trained networks (degeneracies form a measure-zero set) while the paper separately analyzes the set-valued behavior at structurally degenerate inputs in §5. This clarification does not alter the core derivations but makes their scope precise. revision: yes

  2. Referee: [§5] §5 (numerical validation): the experiments claim to confirm the exact multiplier readout and Hessian formula at degenerate inputs, yet no details are given on how the SOCP solver is configured when primal or dual feasibility margins approach zero, nor on the observed frequency of dual non-attainment across the tested weight distributions.

    Authors: We will expand §5 to include the requested implementation details. All SOCPs were solved with MOSEK using its default primal/dual feasibility tolerances of 1e-8 and optimality tolerance of 1e-8. Across the reported experiments (including 1000 random weight draws and degenerate inputs), the solver returned an optimal status with dual attainment in every case used for the exact readout validation; non-attainment events were rare (under 2% of trials) and were excluded from the multiplier/Hessian comparisons, with fallback to subgradient methods noted but not used for the exact-geometry claims. We will add a brief paragraph and/or table summarizing solver configuration and observed attainment rates to improve reproducibility. revision: yes
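For concreteness, such tolerances could be passed through CVXPY's MOSEK interface roughly as follows (an illustrative sketch using MOSEK's interior-point conic tolerance parameters, not the authors' actual configuration):

```python
# Illustrative solver configuration (not the authors' script): tighten MOSEK's
# interior-point conic tolerances and guard the exact readout on solver status.
import cvxpy as cp

mosek_params = {
    "MSK_DPAR_INTPNT_CO_TOL_PFEAS": 1e-8,    # primal feasibility tolerance
    "MSK_DPAR_INTPNT_CO_TOL_DFEAS": 1e-8,    # dual feasibility tolerance
    "MSK_DPAR_INTPNT_CO_TOL_REL_GAP": 1e-8,  # relative optimality gap
}

# prob.solve(solver=cp.MOSEK, mosek_params=mosek_params)
# if prob.status != cp.OPTIMAL:
#     ...  # exclude this trial from the exact-readout comparisons
```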

Circularity Check

0 steps flagged

No significant circularity detected in the derivation chain.

full rationale

The paper takes the SOCP value-function representation of SOC-ICNNs as an established architectural property (from prior literature on the model class) and then applies standard convex-analytic tools—strong duality, KKT conditions, and subdifferential calculus—to recover supporting slopes, subdifferentials, directional derivatives, and local Hessians from optimal dual variables. No step reduces a claimed prediction or geometric quantity to a fitted parameter by construction, nor does any load-bearing premise collapse to a self-citation whose validity is only asserted inside the present manuscript. The derivations remain independent of the specific trained weights once the SOCP encoding is granted, and they are externally verifiable against SOCP duality theory.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central results rely on standard convex duality for SOCPs and the exact SOCP representation of SOC-ICNNs; no free parameters, ad-hoc axioms, or invented entities are introduced in the abstract.

axioms (2)
  • domain assumption SOC-ICNNs admit an exact representation as value functions of second-order cone programs
    Invoked in the abstract as the foundation enabling dual-variable readout of geometry.
  • standard math Optimal dual variables exist and satisfy KKT conditions that directly yield subdifferentials and Hessians
    Standard result from convex optimization used to derive the geometric primitives.

pith-pipeline@v0.9.0 · 5515 in / 1312 out tokens · 34321 ms · 2026-05-08T18:34:43.477925+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
