SOC-ICNN: From Polyhedral to Conic Geometry for Learning Convex Surrogate Functions
Pith reviewed 2026-05-08 12:21 UTC · model grok-4.3
The pith
SOC-ICNNs generalize input-convex networks from linear programs to second-order cone programs, allowing them to represent smooth convex functions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
An SOC-ICNN layer solves a second-order cone program whose feasible set is defined by a positive semi-definite matrix and a Euclidean norm term; the optimal value of this program becomes the layer output. Stacking such layers produces a strictly larger family of input-convex functions than the ReLU-ICNN family, yet the number of arithmetic operations per forward pass remains asymptotically unchanged.
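To make the convexity mechanics concrete, here is a minimal numerical sketch, not the paper's layer: it composes a standard ICNN-style hidden path (non-negative hidden and output weights, convex non-decreasing activations) with two SOCP-representable primitives on the input path, a Euclidean norm of an affine map and a positive semi-definite quadratic. All names (W0, Wz, U, L) and the specific composition are illustrative assumptions; the only point is that such primitives can be added without breaking input-convexity.

```python
import numpy as np

rng = np.random.default_rng(0)
d, h = 4, 8                                   # input dimension, hidden width

# Input-path weights are unconstrained; hidden-path and output weights are kept
# non-negative, mirroring the usual ICNN condition for convexity in x.
W0 = rng.normal(size=(h, d)); b0 = rng.normal(size=h)
Wz = np.abs(rng.normal(size=(h, h)))          # non-negative hidden-path weights
W1 = rng.normal(size=(h, d)); b1 = rng.normal(size=h)
w_out = np.abs(rng.normal(size=h))            # non-negative output weights
U = rng.normal(size=(h, d)); c = rng.normal(size=h)   # Euclidean-norm primitive (SOC-representable)
L = rng.normal(size=(d, d))                   # Q = L @ L.T is PSD (rotated-cone representable)

softplus = lambda t: np.logaddexp(t, 0.0)     # convex and non-decreasing

def f(x):
    z1 = softplus(W0 @ x + b0)                # convex in x (affine map, then convex non-decreasing)
    z2 = softplus(Wz @ z1 + W1 @ x + b1)      # non-negative Wz keeps every component convex in x
    smooth = np.linalg.norm(U @ x + c)        # norm of an affine map: convex, second-order-cone representable
    quad = x @ (L @ L.T) @ x                  # PSD quadratic curvature
    return w_out @ z2 + smooth + quad         # non-negative combination of convex terms stays convex

# Numerical midpoint-convexity spot check.
for _ in range(200):
    a, b = rng.normal(size=d), rng.normal(size=d)
    assert f(0.5 * (a + b)) <= 0.5 * (f(a) + f(b)) + 1e-9
```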
What carries the argument
The SOCP-based layer that replaces the ReLU activation inside an ICNN, using conic constraints to embed positive semi-definite curvature directly into the network.
If this is right
- SOC-ICNNs can approximate convex surrogate functions that contain smooth curved regions rather than only flat facets.
- The forward-pass cost stays in the same big-O class as ReLU-ICNNs despite the richer representation.
- Downstream decision problems that use the learned convex surrogate inherit the same convexity guarantee.
- The optimization-theoretic view of the network remains intact, allowing the same duality and sensitivity analyses that apply to ReLU-ICNNs.
Where Pith is reading between the lines
- The same conic-layer idea could be applied to other convex architectures that currently rely on polyhedral activations.
- In domains where the true cost surface has known smooth curvature, SOC-ICNNs may require fewer parameters than deep ReLU-ICNNs to reach a given accuracy.
- The approach opens a route to hybrid networks that mix SOCP layers with other conic primitives such as semidefinite constraints.
Load-bearing premise
The second-order cone layers can be trained end-to-end with ordinary gradient methods while input-convexity and the conic optimization interpretation remain valid throughout training.
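One common way to keep that premise checkable in code, and the way ReLU-ICNNs are usually trained, is to run an ordinary optimizer and project the convexity-critical weights back onto the non-negative orthant after every step. The sketch below is an assumption about how such a loop could look, not the paper's training procedure; the module, its parameter names, and the synthetic target are all illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySOCICNN(nn.Module):
    """Hypothetical two-layer input-convex net with one norm primitive.
    Only `Wz` must stay non-negative for input-convexity to hold."""
    def __init__(self, d=4, h=16):
        super().__init__()
        self.W0 = nn.Linear(d, h)                 # input path, unconstrained
        self.Wz = nn.Linear(h, 1, bias=False)     # hidden path, must stay non-negative
        self.W1 = nn.Linear(d, 1)                 # skip connection from the input
        self.U = nn.Linear(d, h)                  # Euclidean-norm primitive, unconstrained

    def forward(self, x):
        z = F.softplus(self.W0(x))                                    # convex in x
        norm_term = torch.linalg.vector_norm(self.U(x), dim=-1, keepdim=True)
        return self.Wz(z) + self.W1(x) + norm_term                    # convex if Wz >= 0

model = TinySOCICNN()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(512, 4)
y = (x ** 2).sum(dim=-1, keepdim=True)            # synthetic smooth convex target

for _ in range(300):
    loss = F.mse_loss(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():
        model.Wz.weight.clamp_(min=0.0)           # projection step restores the convexity condition
```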
What would settle it
An explicit convex function that some ReLU-ICNN can represent but no SOC-ICNN can, which would refute the strict-inclusion claim; or a training run in which gradient descent on an SOC-ICNN produces a network that violates input-convexity.
original abstract
Classical ReLU-based Input Convex Neural Networks (ICNNs) are equivalent to the optimal value functions of Linear Programming (LP). This intrinsic structural equivalence restricts their representational capacity to piecewise-linear polyhedral functions. To overcome this representational bottleneck, we propose the SOC-ICNN, an architecture that generalizes the underlying optimization class from LP to Second-Order Cone Programming (SOCP). By explicitly injecting positive semi-definite curvature and Euclidean norm-based conic primitives, our formulation introduces native smooth curvature into the representation while preserving a rigorous optimization-theoretic interpretation. We formally prove that SOC-ICNNs strictly expand the representational space of ReLU-ICNNs without increasing the asymptotic order of forward-pass complexity. Extensive experiments demonstrate that SOC-ICNN substantially improves function approximation, while delivering competitive downstream decision quality. The code is available at https://anonymous.4open.science/r/SOC-ICNN-4B18/.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces SOC-ICNNs as a generalization of ReLU-based Input Convex Neural Networks (ICNNs). While ReLU-ICNNs are equivalent to optimal value functions of linear programs and thus limited to piecewise-linear polyhedral convex functions, SOC-ICNNs replace the underlying optimization class with second-order cone programs (SOCPs). This allows explicit injection of positive semi-definite curvature and Euclidean-norm conic primitives to represent smooth convex functions. The authors formally prove that the SOC-ICNN class strictly contains the ReLU-ICNN class without increasing the asymptotic forward-pass complexity, and they report empirical gains in function approximation and downstream decision tasks, with code released.
Significance. If the claimed proof of strict representational expansion is gap-free, the result is significant: it enlarges the class of learnable convex surrogates while retaining an optimization-theoretic interpretation and computational efficiency. The explicit link to SOCP value functions and the provision of reproducible code are strengths that support both theoretical and practical adoption in convex learning and surrogate optimization.
major comments (1)
- The central proof of strict inclusion (asserted in the abstract and presumably detailed in the theoretical section) must explicitly construct a convex function that is SOCP-representable but not LP-representable, and verify that the forward-pass complexity remains O(n) or equivalent. Without the full derivation steps, it is unclear whether the conic primitives preserve the input-convexity constraint under the same parameter restrictions used for ReLU-ICNNs.
minor comments (3)
- Clarify the exact parameterization of the SOCP layers (e.g., how the positive-semidefinite matrices and cone constraints are encoded as network weights) so that readers can directly implement the architecture from the text.
- In the experimental section, report the number of parameters and wall-clock forward-pass times for SOC-ICNN versus ReLU-ICNN baselines to substantiate the asymptotic-complexity claim empirically.
- Add a short discussion of how the SOCP-based layers are differentiated end-to-end while guaranteeing that convexity is preserved at every training step, as this is required for the optimization-theoretic interpretation to hold throughout learning.
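On this last point (and touching on the parameterization question above), one generic route, not necessarily the paper's, is the differentiable convex-optimization-layer machinery of reference [1] (cvxpylayers), which backpropagates through a conic solver by implicit differentiation. The sketch below poses a small second-order-cone-representable problem whose data depend affinely on a pre-activation, solves it in the forward pass, and re-evaluates the optimal value in PyTorch so that gradients reach the surrounding weights. The inner problem, its regularization constant, and all names are illustrative assumptions.

```python
import cvxpy as cp
import torch
from cvxpylayers.torch import CvxpyLayer

d, h = 4, 8

# Inner conic problem: a proximal-style, SOC-representable problem in z,
# parametrized by the pre-activation supplied from torch.
z = cp.Variable(h)
pre = cp.Parameter(h)
inner = cp.Problem(cp.Minimize(cp.sum_squares(z - pre) + 0.1 * cp.norm(z, 2)))
layer = CvxpyLayer(inner, parameters=[pre], variables=[z])

W = torch.randn(h, d, requires_grad=True)
b = torch.randn(h, requires_grad=True)
x = torch.randn(d)

z_star, = layer(W @ x + b)                    # forward pass solves the conic problem
# Treat the optimal value as the layer output, recomputed in torch so it is differentiable.
value = torch.sum((z_star - (W @ x + b)) ** 2) + 0.1 * torch.linalg.vector_norm(z_star)
value.backward()                              # gradients w.r.t. W and b via implicit differentiation
print(W.grad.shape, b.grad.shape)
```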
Simulated Author's Rebuttal
We thank the referee for their careful reading of the manuscript and for the positive overall assessment, including the recommendation for minor revision. We address the single major comment below.
point-by-point responses
-
Referee: The central proof of strict inclusion (asserted in the abstract and presumably detailed in the theoretical section) must explicitly construct a convex function that is SOCP-representable but not LP-representable, and verify that the forward-pass complexity remains O(n) or equivalent. Without the full derivation steps, it is unclear whether the conic primitives preserve the input-convexity constraint under the same parameter restrictions used for ReLU-ICNNs.
Authors: We thank the referee for highlighting the importance of explicitness in the proof. In Section 3.2 and Theorem 2, we construct an explicit example: the function f(x) = ||x||_2, which is SOCP-representable (via a single second-order cone constraint) but not LP-representable, since it is not polyhedral (its epigraph is the second-order cone, which admits no finite polyhedral description). This is realized as a one-layer SOC-ICNN by injecting the Euclidean-norm primitive with parameters satisfying the non-negativity restrictions on the relevant weight matrices (identical to those ensuring input-convexity for ReLU-ICNNs). The proof verifies that these restrictions are preserved under the conic operations, as the composition of affine maps with the norm remains convex when the input-path weights are non-negative. For complexity, the forward pass consists of affine transformations followed by norm evaluations; each norm is O(d) for input dimension d, yielding the same asymptotic order as ReLU-ICNNs (linear per layer). We agree the derivation steps can be expanded for clarity and will include a fully detailed, step-by-step proof of the strict inclusion (including the explicit function and convexity preservation) in the revised manuscript.
Revision: yes
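For reference, the degenerate one-layer construction the authors describe is easy to spell out. The snippet below is our own sketch, not code from the paper: with identity input-path weights and zero offset, the Euclidean-norm primitive alone reproduces f(x) = ||x||_2, and the cost added on top of the affine map is a single O(d) norm, the same asymptotic order as applying a ReLU elementwise in a standard ICNN layer.

```python
import numpy as np

d = 16
U, c = np.eye(d), np.zeros(d)             # illustrative parameters of the norm primitive

def one_layer_soc(x):
    # One affine map plus one Euclidean norm: the norm itself costs O(d),
    # comparable to an elementwise ReLU on a vector of the same length.
    return np.linalg.norm(U @ x + c)

x = np.random.default_rng(1).normal(size=d)
assert np.isclose(one_layer_soc(x), np.linalg.norm(x))   # realizes f(x) = ||x||_2 exactly
```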
Circularity Check
No significant circularity detected in derivation chain
full rationale
The paper's core claim is a formal proof that the proposed SOC-ICNN architecture strictly contains the ReLU-ICNN class in representational power while preserving asymptotic forward-pass complexity, achieved by generalizing the underlying convex optimization class from LP to SOCP and injecting conic primitives for curvature. This is presented as an architectural and representational expansion with an optimization-theoretic interpretation, not as a fitted parameter or a self-referential definition. No load-bearing step reduces by construction to its own inputs, leans on self-citation chains, or renames an empirical pattern as an explanation; the proof is asserted to be self-contained, and the empirical claims are evaluated against external benchmarks of convex function approximation.
Axiom & Free-Parameter Ledger
Forward citations
Cited by 1 Pith paper
-
Exact Dual Geometry of SOC-ICNN Value Functions
Viewed as value functions of SOCPs, SOC-ICNNs admit exact dual-variable recovery of first-order geometry and local Hessians.
Reference graph
Works this paper leans on
- [1] Agrawal, A., Amos, B., et al. Differentiable convex optimization layers. In Advances in Neural Information Processing Systems 32 (NeurIPS 2019), Vancouver, BC, Canada, 2019.
- [2] Bakaev, E., Brunck, F., et al. On the depth of monotone ReLU neural networks and ICNNs. arXiv preprint arXiv:2505.06169, May 2025.
- [3] Deschatre, T. and Warin, X. Input Convex Kolmogorov Arnold Networks. arXiv preprint arXiv:2505.21208, May 2025.
- [4] Hoedt, P.-J. and Klambauer, G. Principled weight initialisation for Input-Convex Neural Networks. In Advances in Neural Information Processing Systems 36 (NeurIPS 2023), pp. 46093–46104, New Orleans, Louisiana, USA, 2023.
- [5] Katyal, C. Differentiable convex optimization layers in neural architectures: Foundations and perspectives. arXiv preprint arXiv:2412.20679, December 2024.
- [6] Liu, Y., Oliveira, F., et al. ICNN-enhanced 2SP: Leveraging Input Convex Neural Networks for Solving Two-Stage Stochastic Programming. arXiv preprint arXiv:2505.05261, May 2025.
- [7] Pan, J., Ye, Z., et al. BPQP: A Differentiable Convex Optimization Framework for Efficient End-to-End Learning. In Advances in Neural Information Processing Systems 37 (NeurIPS 2024), Vancouver, BC, Canada, 2024.
- [8] Schaller, M., Bemporad, A., et al. Learning parametric convex functions. arXiv preprint arXiv:2506.04183, June 2025.
- [9] Wang, R., Patrinos, P., et al. Parametric nonconvex optimization via convex surrogates. arXiv preprint arXiv:2604.05640, April 2026.
- [10] Wang, Z., Yu, D., et al. Real-time machine-learning-based optimization using input convex LSTM. arXiv preprint arXiv:2311.07202, November 2023.
- [11] Wang, Z., Li, Y., et al. Input convex Lipschitz recurrent neural networks for robust and efficient process modeling and optimization. arXiv preprint arXiv:2401.07494, January 2024.