Shortcomings and capacities of real-constrained neural networks in complex spaces
Pith reviewed 2026-06-28 06:50 UTC · model grok-4.3
The pith
The paper derives the asymptotic ratio between the storage capacity obtained when real pre-activations are enforced in a complex hypothesis class and the capacity obtained with fully complex pre-activations in the same class.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We find the asymptotic ratio between the storage capacities when enforcing real pre-activations in a complex hypothesis class as opposed to complex ones in the same class. Our methods depend on Gardner volume comparisons at critical capacity. Our proof relies on an application of the Harish-Chandra-Itzykson-Zuber formula that is nonstandard in the literature and yields a more robust approximation for the final asymptotic ratio.
What carries the argument
Gardner volume comparisons at critical capacity, using the Harish-Chandra-Itzykson-Zuber formula integrated over unitary and orthogonal compact manifolds with the Weyl integration formula and Haar measure.
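For orientation, the unitary-group form of the HCIZ identity is usually stated as below. This is the textbook statement, not the paper's specific variant: the symbols A, B, a_i, b_j, t, and Δ are notation chosen for this sketch, and the paper's normalization and its treatment of the orthogonal case may differ.

```latex
% A, B: n x n Hermitian matrices with distinct eigenvalues a_1..a_n and b_1..b_n
% dU: Haar probability measure on U(n); t: nonzero complex parameter
\int_{U(n)} \exp\!\bigl(t\,\operatorname{tr}(A U B U^{*})\bigr)\,dU
  \;=\; \Bigl(\prod_{p=1}^{n-1} p!\Bigr)\,
  \frac{\det\bigl[e^{t a_i b_j}\bigr]_{i,j=1}^{n}}
       {t^{\,n(n-1)/2}\,\Delta(a)\,\Delta(b)},
\qquad
\Delta(a) \;=\; \prod_{1 \le i < j \le n} (a_j - a_i).
```

The right-hand side is a ratio of a determinant to two Vandermonde factors, which is what makes the integral tractable in large-n asymptotics.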
If this is right
- The ratio supplies a precise quantitative limit on capacity loss from the real pre-activation constraint.
- The same Gardner-volume plus HCIZ strategy applies to other activation constraints within complex hypothesis classes.
- Fully complex pre-activations achieve strictly higher asymptotic storage capacity than real-constrained ones in the same class.
- The approximation for the ratio is more robust than prior methods that did not invoke the HCIZ formula.
Where Pith is reading between the lines
- Designers of complex-valued networks could use the ratio to decide when the added cost of complex activations is justified by the capacity gain.
- The result suggests testing whether other real-valued restrictions, such as real weights, produce similar capacity reductions.
- Finite-size corrections to the asymptotic ratio might be derivable by extending the same manifold integration techniques.
- The capacity gap implies that training algorithms for complex networks should prioritize preserving complex pre-activation statistics.
Load-bearing premise
The Harish-Chandra-Itzykson-Zuber formula applies validly when the integration is performed over unitary and orthogonal compact manifolds via the Weyl integration formula and Haar measure.
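A small sanity check of the unitary HCIZ identity can be written for n = 2, where the Haar integral collapses to one dimension. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, and it relies on the standard fact that for Haar-distributed U in U(2) the quantity |U_11|^2 is uniform on [0, 1]. It says nothing about the orthogonal case or about large n.

```python
import math


def hciz_rhs_u2(a, b, t):
    """Closed-form HCIZ right-hand side for n = 2 (prefactor 1! = 1)."""
    (a1, a2), (b1, b2) = a, b
    # det[exp(t * a_i * b_j)] for the 2 x 2 case
    det = math.exp(t * (a1 * b1 + a2 * b2)) - math.exp(t * (a1 * b2 + a2 * b1))
    # Vandermonde factors Delta(a) * Delta(b)
    vandermonde = (a2 - a1) * (b2 - b1)
    return det / (t * vandermonde)


def hciz_lhs_u2(a, b, t, steps=2000):
    """Haar average of exp(t tr(A U B U*)) over U(2) by Simpson's rule.

    With x = |U_11|^2 uniform on [0, 1] we have |U_22|^2 = x and
    |U_12|^2 = |U_21|^2 = 1 - x, so the trace depends on U only through x.
    `steps` must be even.
    """
    (a1, a2), (b1, b2) = a, b

    def integrand(x):
        trace = (a1 * b1 + a2 * b2) * x + (a1 * b2 + a2 * b1) * (1.0 - x)
        return math.exp(t * trace)

    h = 1.0 / steps
    total = integrand(0.0) + integrand(1.0)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3.0


if __name__ == "__main__":
    a, b, t = (0.3, 1.1), (-0.5, 0.8), 1.7  # arbitrary test values
    print(hciz_lhs_u2(a, b, t), hciz_rhs_u2(a, b, t))
```

Agreement of the two printed values for several choices of a, b, and t is a quick guard against sign or normalization slips before attempting the general-n calculation.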
What would settle it
Numerical estimation of storage capacities for large finite networks with real versus complex pre-activations, checking whether their empirical ratio converges to the derived asymptotic value.
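As a baseline for that kind of finite-size check, the classical real perceptron admits an exact count, Cover's function-counting theorem, which shows how the fraction of storable random labelings sharpens around the critical load of 2 as the dimension grows. The sketch below covers only this real-valued baseline. It does not implement the paper's complex hypothesis class, and the function name and parameters are hypothetical.

```python
from math import comb


def separable_fraction(num_patterns, dim):
    """Fraction of the 2^P labelings of P points in general position in R^N
    that a homogeneous linear threshold unit can realize (Cover, 1965).

    Uses C(P, N) = 2 * sum_{k=0}^{N-1} binom(P - 1, k); requires P >= 1.
    """
    count = 2 * sum(comb(num_patterns - 1, k) for k in range(dim))
    return count / 2 ** num_patterns


if __name__ == "__main__":
    dim = 50
    for load in (1.0, 1.5, 2.0, 2.5, 3.0):  # load = P / N
        num_patterns = int(load * dim)
        print(load, separable_fraction(num_patterns, dim))
```

At load exactly 2 the fraction equals one half for every dimension, and the transition around that point steepens as `dim` increases. An empirical test of the paper's ratio would need the analogous curves for the real-constrained and fully complex models, obtained by simulation rather than by this closed form.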
Original abstract
We find the asymptotic ratio between the storage capacities when enforcing real pre-activations in a complex hypothesis class as opposed to complex ones in the same class. Our methods depend on Gardner volume comparisons at critical capacity. Our proof relies on an application of the Harish-Chandra-Itzykson-Zuber (HCIZ) formula, nonstandard in literature. With the HCIZ formula, we may obtain a more robust approximation for the final asymptotic ratio. This strategy is applicable to our work specifically since we integrate over the unitary and orthogonal compact manifolds, facilitated via the Weyl integration formula and the Haar measure.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims to derive the asymptotic ratio between storage capacities of neural networks in a complex hypothesis class when enforcing real pre-activations versus allowing complex pre-activations. The derivation proceeds by comparing Gardner volumes at critical capacity and applies the Harish-Chandra-Itzykson-Zuber (HCIZ) formula, reduced via the Weyl integration formula and Haar measure over unitary and orthogonal compact manifolds, to obtain a more robust approximation for the ratio.
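For readers unfamiliar with the term, the classical real-valued Gardner volume and its critical capacity take the form below. This is the standard spherical-perceptron setup from Gardner's 1988 analysis, shown only as background. The manuscript's complex-valued version with constrained pre-activations is not reproduced here, and the symbols w, ξ^μ, σ^μ, κ, and α_c are notation assumed for this sketch.

```latex
% N: input dimension, P: number of patterns, load alpha = P / N
% xi^mu: random input patterns, sigma^mu = +/-1: target labels, kappa: stability margin
V \;=\;
\frac{\displaystyle\int d\mathbf{w}\;\delta\!\bigl(\|\mathbf{w}\|^{2}-N\bigr)\,
      \prod_{\mu=1}^{P}\theta\!\Bigl(\sigma^{\mu}\,
      \frac{\mathbf{w}\cdot\boldsymbol{\xi}^{\mu}}{\sqrt{N}}-\kappa\Bigr)}
     {\displaystyle\int d\mathbf{w}\;\delta\!\bigl(\|\mathbf{w}\|^{2}-N\bigr)},
\qquad
\alpha_c(\kappa) \;=\;
\Bigl[\int_{-\kappa}^{\infty}\frac{dz}{\sqrt{2\pi}}\,e^{-z^{2}/2}\,(z+\kappa)^{2}\Bigr]^{-1},
\qquad
\alpha_c(0) = 2 .
```

The critical capacity is the largest load at which the typical volume V stays nonzero as N grows, so a ratio of capacities amounts to comparing where two such volumes vanish.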
Significance. If the central derivation holds, the result would quantify a specific shortcoming of real constraints within complex-valued networks and supply an explicit asymptotic ratio. The nonstandard use of HCIZ for these volume comparisons could represent a methodological contribution to capacity calculations, but only if the analytic conditions for the integral representation are verified.
major comments (1)
- [HCIZ application in the main proof] The derivation of the headline asymptotic ratio (via Gardner-volume comparison at criticality) rests on an application of the HCIZ formula after Weyl reduction to the maximal torus. The manuscript provides no explicit check that the resulting effective potential satisfies the analyticity requirements or that the saddle-point contour remains valid once the real pre-activation constraint is imposed; this step is load-bearing for the claimed ratio.
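The Weyl reduction referred to in this comment is commonly stated for the unitary group as below. This is the standard formula for class functions on U(n), given as a reference point. Whether the manuscript applies it on the group or on the corresponding space of Hermitian matrices is not stated in the material above, and the symbols f and θ_j are notation assumed here.

```latex
% f: a class function on U(n), i.e. f(V U V^{*}) = f(U) for all V in U(n)
% dU: Haar probability measure; the torus is parametrized by angles theta_1..theta_n
\int_{U(n)} f(U)\,dU
 \;=\; \frac{1}{n!\,(2\pi)^{n}}
 \int_{[0,2\pi]^{n}}
 f\bigl(\operatorname{diag}(e^{i\theta_1},\dots,e^{i\theta_n})\bigr)
 \prod_{1\le j<k\le n}\bigl|e^{i\theta_j}-e^{i\theta_k}\bigr|^{2}\,
 d\theta_1\cdots d\theta_n .
```

The squared Vandermonde factor in the angles is the term whose interaction with the real pre-activation constraint would need to be checked for analyticity.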
minor comments (1)
- [Abstract] The abstract states that the HCIZ approach is 'nonstandard in literature'; a short paragraph contrasting the present usage with existing applications in statistical mechanics or random-matrix theory would help readers assess novelty.
Simulated Author's Rebuttal
We thank the referee for their detailed review and for highlighting the need for explicit verification of analytic conditions in our application of the HCIZ formula. We address the single major comment below.
Referee: [HCIZ application in the main proof] The derivation of the headline asymptotic ratio (via Gardner-volume comparison at criticality) rests on an application of the HCIZ formula after Weyl reduction to the maximal torus. The manuscript provides no explicit check that the resulting effective potential satisfies the analyticity requirements or that the saddle-point contour remains valid once the real pre-activation constraint is imposed; this step is load-bearing for the claimed ratio.
Authors: We agree that the manuscript does not currently contain an explicit verification that the effective potential remains analytic or that the saddle-point contour is valid after imposing the real pre-activation constraint. This verification is indeed necessary to rigorously justify the HCIZ application in the presence of the constraint. In the revised version we will add a dedicated appendix that establishes these conditions: we will show analyticity of the reduced potential on the maximal torus by direct differentiation under the integral sign (permitted by compactness of the unitary and orthogonal groups) and confirm contour validity by appealing to the absence of singularities inside the relevant contour for the Gardner volume at criticality, using the same Weyl reduction already employed in the main text.
Revision: yes
Circularity Check
No circularity: derivation applies external HCIZ formula to Gardner volumes
The paper derives the asymptotic storage capacity ratio by comparing Gardner volumes at criticality, using the HCIZ integral representation after Weyl reduction to the maximal torus with Haar measure over U(n) and O(n). The abstract explicitly frames this as an application of a known formula (Harish-Chandra-Itzykson-Zuber) that is external to the present work, with no equations or steps shown that reduce the claimed ratio to a fitted parameter, self-citation chain, or definitional tautology. No self-citations are referenced as load-bearing, and the method is described as nonstandard in application but not as importing uniqueness or ansatzes from prior author work. The derivation is therefore self-contained against the external integral identity.