SOC-ICNN: From Polyhedral to Conic Geometry for Learning Convex Surrogate Functions
Pith reviewed 2026-05-08 12:21 UTC · model grok-4.3
The pith
SOC-ICNNs generalize input-convex networks from linear programs to second-order cone programs, allowing them to represent smooth convex functions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
An SOC-ICNN layer solves a second-order cone program whose feasible set is defined by a positive semi-definite matrix and a Euclidean norm term; the optimal value of this program becomes the layer output. Stacking such layers produces a strictly larger family of input-convex functions than the ReLU-ICNN family, yet the number of arithmetic operations per forward pass remains asymptotically unchanged.
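To make the convexity mechanics concrete, here is a minimal numerical sketch, not the paper's layer: it composes a standard ICNN-style hidden path (non-negative hidden and output weights, convex non-decreasing activations) with two SOCP-representable primitives on the input path, a Euclidean norm of an affine map and a positive semi-definite quadratic. All names (W0, Wz, U, L) and the specific composition are illustrative assumptions; the only point is that such primitives can be added without breaking input-convexity.

```python
import numpy as np

rng = np.random.default_rng(0)
d, h = 4, 8                                   # input dimension, hidden width

# Input-path weights are unconstrained; hidden-path and output weights are kept
# non-negative, mirroring the usual ICNN condition for convexity in x.
W0 = rng.normal(size=(h, d)); b0 = rng.normal(size=h)
Wz = np.abs(rng.normal(size=(h, h)))          # non-negative hidden-path weights
W1 = rng.normal(size=(h, d)); b1 = rng.normal(size=h)
w_out = np.abs(rng.normal(size=h))            # non-negative output weights
U = rng.normal(size=(h, d)); c = rng.normal(size=h)   # Euclidean-norm primitive (SOC-representable)
L = rng.normal(size=(d, d))                   # Q = L @ L.T is PSD (rotated-cone representable)

softplus = lambda t: np.logaddexp(t, 0.0)     # convex and non-decreasing

def f(x):
    z1 = softplus(W0 @ x + b0)                # convex in x (affine map, then convex non-decreasing)
    z2 = softplus(Wz @ z1 + W1 @ x + b1)      # non-negative Wz keeps every component convex in x
    smooth = np.linalg.norm(U @ x + c)        # norm of an affine map: convex, second-order-cone representable
    quad = x @ (L @ L.T) @ x                  # PSD quadratic curvature
    return w_out @ z2 + smooth + quad         # non-negative combination of convex terms stays convex

# Numerical midpoint-convexity spot check.
for _ in range(200):
    a, b = rng.normal(size=d), rng.normal(size=d)
    assert f(0.5 * (a + b)) <= 0.5 * (f(a) + f(b)) + 1e-9
```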
What carries the argument
The SOCP-based layer that replaces the ReLU activation inside an ICNN, using conic constraints to embed positive semi-definite curvature directly into the network.
If this is right
- SOC-ICNNs can approximate convex surrogate functions that contain smooth curved regions rather than only flat facets.
- The forward-pass cost stays in the same big-O class as ReLU-ICNNs despite the richer representation.
- Downstream decision problems that use the learned convex surrogate inherit the same convexity guarantee.
- The optimization-theoretic view of the network remains intact, allowing the same duality and sensitivity analyses that apply to ReLU-ICNNs.
Where Pith is reading between the lines
- The same conic-layer idea could be applied to other convex architectures that currently rely on polyhedral activations.
- In domains where the true cost surface has known smooth curvature, SOC-ICNNs may require fewer parameters than deep ReLU-ICNNs to reach a given accuracy.
- The approach opens a route to hybrid networks that mix SOCP layers with other conic primitives such as semidefinite constraints.
Load-bearing premise
The second-order cone layers can be trained end-to-end with ordinary gradient methods while input-convexity and the conic optimization interpretation remain valid throughout training.
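One common way to keep that premise checkable in code, and the way ReLU-ICNNs are usually trained, is to run an ordinary optimizer and project the convexity-critical weights back onto the non-negative orthant after every step. The sketch below is an assumption about how such a loop could look, not the paper's training procedure; the module, its parameter names, and the synthetic target are all illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySOCICNN(nn.Module):
    """Hypothetical two-layer input-convex net with one norm primitive.
    Only `Wz` must stay non-negative for input-convexity to hold."""
    def __init__(self, d=4, h=16):
        super().__init__()
        self.W0 = nn.Linear(d, h)                 # input path, unconstrained
        self.Wz = nn.Linear(h, 1, bias=False)     # hidden path, must stay non-negative
        self.W1 = nn.Linear(d, 1)                 # skip connection from the input
        self.U = nn.Linear(d, h)                  # Euclidean-norm primitive, unconstrained

    def forward(self, x):
        z = F.softplus(self.W0(x))                                    # convex in x
        norm_term = torch.linalg.vector_norm(self.U(x), dim=-1, keepdim=True)
        return self.Wz(z) + self.W1(x) + norm_term                    # convex if Wz >= 0

model = TinySOCICNN()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(512, 4)
y = (x ** 2).sum(dim=-1, keepdim=True)            # synthetic smooth convex target

for _ in range(300):
    loss = F.mse_loss(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():
        model.Wz.weight.clamp_(min=0.0)           # projection step restores the convexity condition
```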
What would settle it
An explicit convex function that some ReLU-ICNN can represent but no SOC-ICNN can, which would refute the strict-inclusion claim; or a training run in which gradient descent on an SOC-ICNN produces a network that violates input-convexity.
original abstract
Classical ReLU-based Input Convex Neural Networks (ICNNs) are equivalent to the optimal value functions of Linear Programming (LP). This intrinsic structural equivalence restricts their representational capacity to piecewise-linear polyhedral functions. To overcome this representational bottleneck, we propose the SOC-ICNN, an architecture that generalizes the underlying optimization class from LP to Second-Order Cone Programming (SOCP). By explicitly injecting positive semi-definite curvature and Euclidean norm-based conic primitives, our formulation introduces native smooth curvature into the representation while preserving a rigorous optimization-theoretic interpretation. We formally prove that SOC-ICNNs strictly expand the representational space of ReLU-ICNNs without increasing the asymptotic order of forward-pass complexity. Extensive experiments demonstrate that SOC-ICNN substantially improves function approximation, while delivering competitive downstream decision quality. The code is available at https://anonymous.4open.science/r/SOC-ICNN-4B18/.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces SOC-ICNNs as a generalization of ReLU-based Input Convex Neural Networks (ICNNs). While ReLU-ICNNs are equivalent to optimal value functions of linear programs and thus limited to piecewise-linear polyhedral convex functions, SOC-ICNNs replace the underlying optimization class with second-order cone programs (SOCPs). This allows explicit injection of positive semi-definite curvature and Euclidean-norm conic primitives to represent smooth convex functions. The authors formally prove that the SOC-ICNN class strictly contains the ReLU-ICNN class without increasing the asymptotic forward-pass complexity, and they report empirical gains in function approximation and downstream decision tasks, with code released.
Significance. If the claimed proof of strict representational expansion is gap-free, the result is significant: it enlarges the class of learnable convex surrogates while retaining an optimization-theoretic interpretation and computational efficiency. The explicit link to SOCP value functions and the provision of reproducible code are strengths that support both theoretical and practical adoption in convex learning and surrogate optimization.
major comments (1)
- The central proof of strict inclusion (asserted in the abstract and presumably detailed in the theoretical section) must explicitly construct a convex function that is SOCP-representable but not LP-representable, and verify that the forward-pass complexity remains O(n) or equivalent. Without the full derivation steps, it is unclear whether the conic primitives preserve the input-convexity constraint under the same parameter restrictions used for ReLU-ICNNs.
minor comments (3)
- Clarify the exact parameterization of the SOCP layers (e.g., how the positive-semidefinite matrices and cone constraints are encoded as network weights) so that readers can directly implement the architecture from the text.
- In the experimental section, report the number of parameters and wall-clock forward-pass times for SOC-ICNN versus ReLU-ICNN baselines to substantiate the asymptotic-complexity claim empirically.
- Add a short discussion of how the SOCP-based layers are differentiated end-to-end while guaranteeing that convexity is preserved at every training step, as this is required for the optimization-theoretic interpretation to hold throughout learning.
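On this last point (and touching on the parameterization question above), one generic route, not necessarily the paper's, is the differentiable convex-optimization-layer machinery of reference [1] (cvxpylayers), which backpropagates through a conic solver by implicit differentiation. The sketch below poses a small second-order-cone-representable problem whose data depend affinely on a pre-activation, solves it in the forward pass, and re-evaluates the optimal value in PyTorch so that gradients reach the surrounding weights. The inner problem, its regularization constant, and all names are illustrative assumptions.

```python
import cvxpy as cp
import torch
from cvxpylayers.torch import CvxpyLayer

d, h = 4, 8

# Inner conic problem: a proximal-style, SOC-representable problem in z,
# parametrized by the pre-activation supplied from torch.
z = cp.Variable(h)
pre = cp.Parameter(h)
inner = cp.Problem(cp.Minimize(cp.sum_squares(z - pre) + 0.1 * cp.norm(z, 2)))
layer = CvxpyLayer(inner, parameters=[pre], variables=[z])

W = torch.randn(h, d, requires_grad=True)
b = torch.randn(h, requires_grad=True)
x = torch.randn(d)

z_star, = layer(W @ x + b)                    # forward pass solves the conic problem
# Treat the optimal value as the layer output, recomputed in torch so it is differentiable.
value = torch.sum((z_star - (W @ x + b)) ** 2) + 0.1 * torch.linalg.vector_norm(z_star)
value.backward()                              # gradients w.r.t. W and b via implicit differentiation
print(W.grad.shape, b.grad.shape)
```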
Simulated Author's Rebuttal
We thank the referee for their careful reading of the manuscript and for the positive overall assessment, including the recommendation for minor revision. We address the single major comment below.
point-by-point responses
-
Referee: The central proof of strict inclusion (asserted in the abstract and presumably detailed in the theoretical section) must explicitly construct a convex function that is SOCP-representable but not LP-representable, and verify that the forward-pass complexity remains O(n) or equivalent. Without the full derivation steps, it is unclear whether the conic primitives preserve the input-convexity constraint under the same parameter restrictions used for ReLU-ICNNs.
Authors: We thank the referee for highlighting the importance of explicitness in the proof. In Section 3.2 and Theorem 2, we construct an explicit example: the function f(x) = ||x||_2, which is SOCP-representable (via a single second-order cone constraint) but not LP-representable, since it is not polyhedral (its epigraph is the second-order cone, which admits no finite polyhedral description). This is realized as a one-layer SOC-ICNN by injecting the Euclidean-norm primitive with parameters satisfying the non-negativity restrictions on the relevant weight matrices (identical to those ensuring input-convexity for ReLU-ICNNs). The proof verifies that these restrictions are preserved under the conic operations, as the composition of affine maps with the norm remains convex when the input-path weights are non-negative. For complexity, the forward pass consists of affine transformations followed by norm evaluations; each norm is O(d) for input dimension d, yielding the same asymptotic order as ReLU-ICNNs (linear per layer). We agree the derivation steps can be expanded for clarity and will include a fully detailed, step-by-step proof of the strict inclusion (including the explicit function and convexity preservation) in the revised manuscript.
Revision: yes
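For reference, the degenerate one-layer construction the authors describe is easy to spell out. The snippet below is our own sketch, not code from the paper: with identity input-path weights and zero offset, the Euclidean-norm primitive alone reproduces f(x) = ||x||_2, and the cost added on top of the affine map is a single O(d) norm, the same asymptotic order as applying a ReLU elementwise in a standard ICNN layer.

```python
import numpy as np

d = 16
U, c = np.eye(d), np.zeros(d)             # illustrative parameters of the norm primitive

def one_layer_soc(x):
    # One affine map plus one Euclidean norm: the norm itself costs O(d),
    # comparable to an elementwise ReLU on a vector of the same length.
    return np.linalg.norm(U @ x + c)

x = np.random.default_rng(1).normal(size=d)
assert np.isclose(one_layer_soc(x), np.linalg.norm(x))   # realizes f(x) = ||x||_2 exactly
```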
Circularity Check
No significant circularity detected in derivation chain
full rationale
The paper's core claim is a formal proof that the proposed SOC-ICNN architecture strictly contains the ReLU-ICNN class in representational power while preserving asymptotic forward-pass complexity, achieved by generalizing the underlying convex optimization class from LP to SOCP and injecting conic primitives for curvature. This is presented as an architectural and representational expansion with an optimization-theoretic interpretation, not as a fitted parameter or a self-referential definition. No load-bearing step reduces by construction to its own inputs, leans on self-citation chains, or renames an empirical pattern as an explanation; the proof is asserted to be self-contained, and the empirical claims are evaluated against external benchmarks of convex function approximation.
Axiom & Free-Parameter Ledger
Forward citations
Cited by 1 Pith paper
-
Exact Dual Geometry of SOC-ICNN Value Functions
Viewed as value functions of SOCPs, SOC-ICNNs admit exact dual-variable recovery of first-order geometry and local Hessians.
Reference graph
Works this paper leans on
- [1] Agrawal, A., Amos, B., et al. Differentiable convex optimization layers. In Advances in Neural Information Processing Systems 32 (NeurIPS 2019), Vancouver, BC, Canada, 2019.
- [2] Bakaev, E., Brunck, F., et al. On the depth of monotone ReLU neural networks and ICNNs. arXiv preprint arXiv:2505.06169, May 2025.
- [3] Deschatre, T. and Warin, X. Input Convex Kolmogorov Arnold Networks. arXiv preprint arXiv:2505.21208, May 2025.
- [4] Hoedt, P.-J. and Klambauer, G. Principled weight initialisation for Input-Convex Neural Networks. In Advances in Neural Information Processing Systems 36 (NeurIPS 2023), pp. 46093–46104, New Orleans, Louisiana, USA, 2023.
- [5] Katyal, C. Differentiable convex optimization layers in neural architectures: Foundations and perspectives. arXiv preprint arXiv:2412.20679, December 2024.
- [6] Liu, Y., Oliveira, F., et al. ICNN-enhanced 2SP: Leveraging Input Convex Neural Networks for Solving Two-Stage Stochastic Programming. arXiv preprint arXiv:2505.05261, May 2025.
- [7] Pan, J., Ye, Z., et al. BPQP: A Differentiable Convex Optimization Framework for Efficient End-to-End Learning. In Advances in Neural Information Processing Systems 37 (NeurIPS 2024), Vancouver, BC, Canada, 2024.
- [8] Schaller, M., Bemporad, A., et al. Learning parametric convex functions. arXiv preprint arXiv:2506.04183, June 2025.
- [9] Wang, R., Patrinos, P., et al. Parametric nonconvex optimization via convex surrogates. arXiv preprint arXiv:2604.05640, April 2026.
- [10] Wang, Z., Yu, D., et al. Real-time machine-learning-based optimization using input convex LSTM. arXiv preprint arXiv:2311.07202, November 2023.
- [11] Wang, Z., Li, Y., et al. Input convex Lipschitz recurrent neural networks for robust and efficient process modeling and optimization. arXiv preprint arXiv:2401.07494, January 2024.