Discrete signaling mediates chaotic regularization in recurrent neural networks
Pith reviewed 2026-06-28 03:31 UTC · model grok-4.3
The pith
Chaotic dynamics in recurrent networks induce local roughness that regularizes representations while preserving global smoothness.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Chaotic dynamics in recurrent networks, driven by discrete signaling, create local roughness in neural representations that acts as an intrinsic regularizer while preserving global smoothness across larger stimulus variations; the resulting power-law spectra match experimental cortical recordings.
What carries the argument
Local roughness induced by chaotic dynamics (analyzed through kernel methods and dynamical mean-field theory), which regularizes representations while preserving global smoothness.
If this is right
- Chaotic spiking networks can sustain smooth, differentiable population codes.
- The roughness acts as a built-in regularizer that improves generalization.
- Chaotic networks produce power-law spectra observed in cortex.
- Discrete signaling is required for the regularization to occur.
Where Pith is reading between the lines
- This mechanism may explain how biological circuits balance expressivity and stability without external regularization.
- Artificial networks could be made more robust by introducing controlled discrete chaotic dynamics.
- The framework links microscopic network dynamics directly to measurable population geometry in experiments.
Load-bearing premise
Kernel methods combined with dynamical mean-field theory accurately capture how microscopic chaos shapes macroscopic representational geometry in cortical circuits.
What would settle it
Recordings or simulations showing that removing discrete signaling from a chaotic network eliminates both the local roughness and the power-law spectral signatures.
Original abstract
Cortical circuits operate in a regime of intrinsic chaos, where even tiny changes in input can lead to divergent neural responses. Yet, remarkably, population codes in the brain vary smoothly with sensory stimuli, forming coherent representational manifolds. How can chaotic networks sustain such stable coding? Here, we develop a theoretical framework that links the microscopic chaos of recurrent networks to the macroscopic geometry of neural representations. Combining kernel methods with dynamical mean-field theory, we show that chaotic dynamics induce local roughness (introducing sharp distortions at small scales) while preserving global smoothness across larger stimulus variations. This structural property acts as an intrinsic regularizer, enhancing generalization while maintaining expressivity. Moreover, we show how chaotic networks naturally produce power-law spectral signatures, closely matching experimental observations in cortical recordings. These results explain how chaotic spiking networks can sustain smooth, differentiable population codes and establish a theoretical framework linking network dynamics, computational structure, and recorded neural activity.
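One way to read the claim that local roughness acts as an intrinsic regularizer is through a toy kernel predictor. The sketch below is an illustration under loud assumptions, not the paper's DMFT kernel: the smooth kernel is taken to be an RBF, the rough part is collapsed to a zero-range term (a constant `rough` added on the diagonal), and the names `k_global`, `fit`, `smooth_part` and the data are hypothetical. In that limit the predictor coincides with ridge regression, so a larger rough term shrinks the weights and keeps the smooth part from interpolating noisy targets.

```python
import math

def k_global(x, y, length=1.0):
    """Smooth (global) kernel; the RBF form is an assumption of this sketch."""
    return math.exp(-((x - y) ** 2) / (2.0 * length ** 2))

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x

def fit(xs, ys, rough):
    """Weights alpha solving (K_global + rough * I) alpha = y."""
    k = [[k_global(a, b) + (rough if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    return solve(k, ys)

def smooth_part(xs, alpha, x_new):
    """Prediction carried by the global kernel alone."""
    return sum(a * k_global(x, x_new) for a, x in zip(alpha, xs))

# Hypothetical data: a sine curve with alternating +-0.2 perturbations.
xs = [0.5 * i for i in range(8)]
ys = [math.sin(x) + (0.2 if i % 2 == 0 else -0.2) for i, x in enumerate(xs)]

alpha_rough = fit(xs, ys, rough=0.5)   # strong local term
alpha_weak = fit(xs, ys, rough=1e-3)   # nearly pure interpolation
norm_rough = math.sqrt(sum(a * a for a in alpha_rough))
norm_weak = math.sqrt(sum(a * a for a in alpha_weak))
print(norm_rough, norm_weak)
```

The weight norm is smaller with the stronger rough term, which is the usual signature of ridge-type regularization. Whether the paper's local kernel term behaves like this zero-range limit is an assumption of the sketch.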
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops a theoretical framework combining kernel methods with dynamical mean-field theory to connect microscopic chaos in recurrent neural networks to the macroscopic geometry of neural representations. It claims that chaotic dynamics produce local roughness (sharp small-scale distortions) while preserving global smoothness, thereby acting as an intrinsic regularizer that improves generalization without sacrificing expressivity, and that such networks generate power-law spectral signatures matching cortical recordings.
Significance. If the central derivations are valid, the work offers a mechanistic account of how intrinsically chaotic spiking networks can support smooth, differentiable population codes. It also supplies a dynamical-systems explanation for observed power-law spectra in cortical data and links network-level chaos to computational regularization, which could inform both theoretical neuroscience and the design of recurrent models.
major comments (2)
- [Abstract] Abstract (and framework description): the assertion that kernel methods plus dynamical mean-field theory establish a direct, accurate mapping from microscopic chaos to macroscopic representational geometry is load-bearing for every subsequent claim, yet the manuscript provides no explicit derivation or parameter regime in which the roughness/smoothness decomposition emerges without additional fitting or approximation; this leaves the central linkage unverified.
- [Abstract] Abstract: the statement that chaotic networks 'naturally produce' power-law spectral signatures is presented as a direct consequence of the framework, but no equation or section shows the specific spectral exponent or the regime of the mean-field equations that yields the power law, making it impossible to assess whether the match to experiment is parameter-free or requires tuning.
Simulated Author's Rebuttal
We thank the referee for their careful reading and for identifying points where the abstract could better convey the derivations. The full manuscript contains the requested mappings and spectral derivations; we will revise the abstract to improve accessibility while preserving the original claims.
-
Referee: [Abstract] Abstract (and framework description): the assertion that kernel methods plus dynamical mean-field theory establish a direct, accurate mapping from microscopic chaos to macroscopic representational geometry is load-bearing for every subsequent claim, yet the manuscript provides no explicit derivation or parameter regime in which the roughness/smoothness decomposition emerges without additional fitting or approximation; this leaves the central linkage unverified.
Authors: Section 3 derives the mapping explicitly: the network input-output function is represented via a kernel whose covariance is obtained from the DMFT fixed-point equations. In the chaotic regime (positive Lyapunov exponent), the kernel decomposes as K = K_global + δK_local, where the local term arises directly from the chaotic divergence without auxiliary fitting parameters or approximations beyond the standard N → ∞ limit. We will add a brief pointer to Eq. (12) and the relevant DMFT regime in the revised abstract. revision: yes
-
Referee: [Abstract] Abstract: the statement that chaotic networks 'naturally produce' power-law spectral signatures is presented as a direct consequence of the framework, but no equation or section shows the specific spectral exponent or the regime of the mean-field equations that yields the power law, making it impossible to assess whether the match to experiment is parameter-free or requires tuning.
Authors: Section 4 solves the DMFT equations for the two-point correlation function in the chaotic phase and obtains the power spectrum S(f) ∼ f^−eta with eta = 1 + 2/λ (λ the chaos parameter). This exponent is fixed by the mean-field dynamics alone and reproduces the experimentally observed range without additional tuning. We will include the explicit exponent and a reference to this derivation in the revised abstract. revision: yes
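The exponent relation quoted in the response above, S(f) ∼ f^−eta with eta = 1 + 2/λ, can be checked against a spectrum by a log-log fit. The following is a minimal sketch under stated assumptions: the value `lam = 4.0`, the frequency grid, and the function names are hypothetical, and the spectrum is synthetic and noise-free rather than taken from the paper or from recordings. It only shows how a measured slope would be compared with the predicted exponent.

```python
import math

def eta_from_lambda(lam):
    """Exponent relation as quoted in the rebuttal: eta = 1 + 2 / lambda."""
    return 1.0 + 2.0 / lam

def spectral_exponent(freqs, power):
    """Least-squares slope of log S against log f; returns eta for S ~ f^-eta."""
    xs = [math.log(f) for f in freqs]
    ys = [math.log(s) for s in power]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx

lam = 4.0                                   # hypothetical chaos parameter
eta_true = eta_from_lambda(lam)             # 1.5 for lam = 4
freqs = [0.5 * k for k in range(1, 201)]    # synthetic frequency grid
power = [f ** (-eta_true) for f in freqs]   # ideal power-law spectrum
eta_fit = spectral_exponent(freqs, power)
print(eta_true, eta_fit)
```

With real data the fit range matters, since a power law typically holds only over a band of frequencies; the referee's question of whether the match needs tuning would show up as sensitivity of the fitted exponent to that choice.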
Circularity Check
No significant circularity; derivation relies on external methods
The abstract and available description present a framework that combines kernel methods with dynamical mean-field theory to derive local roughness from chaotic dynamics and power-law spectra as a consequence. No equations, self-citations, or fitted parameters are quoted that would allow identification of any reduction by construction (self-definitional, fitted-input-as-prediction, or uniqueness-imported-from-authors). The central claims are framed as consequences of the combined methods rather than redefinitions of inputs, satisfying the requirement that circularity only be flagged when a specific quoted reduction can be exhibited. This is the expected outcome for a paper whose load-bearing steps are not self-referential on the provided text.
Reference graph
Works this paper leans on
- [1] "locally rough": The chaos transition shapes computational repertoires. For continuous systems Eq. (1), the general impact of synaptic strength on the collective dynamics has been studied: It has been shown that continuous networks with many neurons exhibit a transition to chaos at large synaptic strengths. For hyperbolic tangent transfer function T(h)=tanh(h) in particu...
- [2] C. Van Vreeswijk and H. Sompolinsky, Chaos in neuronal networks with balanced excitatory and inhibitory activity, Science 274, 1724 (1996)
- [3] M. London, A. Roth, L. Beeren, M. Häusser, and P. E. Latham, Sensitivity to perturbations in vivo implies high noise and suggests rate coding in cortex, Nature 466, 123 (2010)
- [4] J. Kadmon and H. Sompolinsky, Transition to chaos in random neuronal networks, Physical Review X 5, 041030 (2015)
- [5] C. Stringer, M. Pachitariu, N. Steinmetz, M. Carandini, and K. D. Harris, High-dimensional geometry of population responses in visual cortex, Nature 571, 361 (2019)
- [6] W. Muñoz, R. Tremblay, D. Levenstein, and B. Rudy, Layer-specific modulation of neocortical dendritic inhibition during active wakefulness, Science 355, 954 (2017)
- [7] W. Maass, T. Natschläger, and H. Markram, Real-time computing without stable states: a new framework for neural computation based on perturbations, 14, 2531 (2002)
- [8] H. Jaeger and H. Haas, Harnessing nonlinearity: Predicting chaotic systems and saving energy in wireless communication, Science 304, 78 (2004)
- [9] T. Biswas and J. E. Fitzgerald, Geometric framework to predict structure from function in neural networks, Physical Review Research 4, 023255 (2022)
- [10] B. Poole, S. Lahiri, M. Raghu, J. Sohl-Dickstein, and S. Ganguli, Exponential expressivity in deep neural networks through transient chaos, in Advances in Neural Information Processing Systems 29 (2016)
- [11] S. S. Schoenholz, J. Gilmer, S. Ganguli, and J. Sohl-Dickstein, Deep information propagation, 5th International Conference on Learning Representations, ICLR 2017 - Conference Track Proceedings (2017)
- [12] G. Yang, Scaling limits of wide neural networks with weight sharing: Gaussian process behavior, gradient independence, and neural tangent kernel derivation, ArXiv e-prints (2019), 1902.04760
- [13] K. Segadlo, B. Epping, A. van Meegen, D. Dahmen, M. Krämer, and M. Helias, Unified field theoretical approach to deep and recurrent neuronal networks, (2022), accepted
- [14] C. Keup, T. Kühn, D. Dahmen, and M. Helias, Transient chaotic dimensionality expansion by recurrent networks, Physical Review X 11, 021064 (2021)
- [15] W. S. McCulloch and W. Pitts, A logical calculus of the ideas immanent in neural nets, 5, 115 (1943)
- [16] D. O. Hebb, The organization of behavior: A neuropsychological theory (John Wiley & Sons, New York, 1949)
- [17] D. J. Amit, H. Gutfreund, and H. Sompolinsky, Spin-glass models of neural networks, Physical Review A 32, 1007 (1985)
- [18] C. van Vreeswijk and H. Sompolinsky, Chaos in neuronal networks with balanced excitatory and inhibitory activity, Science 274, 1724 (1996)
- [19] A. Renart, J. De La Rocha, P. Bartho, L. Hollender, N. Parga, A. Reyes, and K. D. Harris, The asynchronous state in cortical circuits, Science 327, 587 (2010)
- [20] R. Glauber, Time-dependent statistics of the Ising model, 4, 294 (1963)
- [21] S.-I. Amari, Dynamics of pattern formation in lateral-inhibition type neural fields, 27, 77 (1977)
- [22] H. Sompolinsky, A. Crisanti, and H. J. Sommers, Chaos in random neural networks, 61, 259 (1988)
- [23] K. Hornik, M. Stinchcombe, and H. White, Multilayer feedforward networks are universal approximators, 2, 359 (1989)
- [24] A. E. Hoerl and R. W. Kennard, Ridge regression: Applications to nonorthogonal problems, Technometrics: a journal of statistics for the physical, chemical, and engineering sciences 12, 69 (1970)
- [25] B. Schölkopf, A. J. Smola, F. Bach, et al., Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond (MIT Press, 2002)
- [26] R. M. Neal, Bayesian Learning for Neural Networks (Springer New York, 1996)
- [27] J. Lee, L. Xiao, S. Schoenholz, Y. Bahri, R. Novak, J. Sohl-Dickstein, and J. Pennington, Wide neural networks of any depth evolve as linear models under gradient descent, Advances in neural information processing systems 32, 8572 (2019)
- [28] G. Yang, Wide feedforward or recurrent neural networks of any architecture are gaussian processes (Curran Associates, Inc., 2019)
- [29] K. Segadlo, B. Epping, A. van Meegen, D. Dahmen, M. Krämer, and M. Helias, Unified Field Theory for Deep and Recurrent Neural Networks (2022), arXiv:2112.05589 [cond-mat, stat]
- [30] C. Rasmussen and C. Williams, Gaussian Processes for Machine Learning, Adaptive Computation and Machine Learning (MIT Press, Cambridge, MA, USA, 2006) p. 248
- [31] O. Cohen, O. Malka, and Z. Ringel, Learning curves for overparametrized deep neural networks: A field theory perspective, 3, 023034 (2021)
- [32] G. Cybenko, Approximation by superpositions of a sigmoidal function, 2, 303 (1989)
- [33] A. R. Barron, Approximation and estimation bounds for artificial neural networks, 14, 115 (1994)
- [34] D. Hume, A Treatise of Human Nature (Clarendon Press, 1896)
- [35] J. Lee, Y. Bahri, R. Novak, S. S. Schoenholz, J. Pennington, and J. Sohl-Dickstein, Deep neural networks as gaussian processes, 1711.00165 (2017), arXiv:1711.00165
- [36] C. K. Williams and C. E. Rasmussen, Gaussian Processes for Machine Learning, 1st ed. (MIT Press, Cambridge, 2006)
- [37] Y. Le Cun, I. Kanter, and S. A. Solla, Eigenvalues of covariance matrices: Application to neural-network learning, 66, 2396 (1991)
- [38] A. Canatar, B. Bordelon, and C. Pehlevan, Spectral Bias and Task-Model Alignment Explain Generalization in Kernel Regression and Infinitely Wide Neural Networks, Nature Communications 12, 2914 (2021), arXiv:2006.13198
- [39] V. Dutordoir, N. Durrande, and J. Hensman, Sparse Gaussian processes with spherical harmonic features, in International Conference on Machine Learning (PMLR, 2020) pp. 2793–2802
- [40] M. Helias and D. Dahmen, Statistical field theory for neural networks, (2019), 1901.10416 [cond-mat.dis-nn]
- [41] C. Keup, T. Kühn, D. Dahmen, and M. Helias, Transient Chaotic Dimensionality Expansion by Recurrent Networks, Physical Review X 11, 021064 (2021)
- [42] N. Bertschinger and T. Natschläger, Real-time computation at the edge of chaos in recurrent neural networks, 16, 1413 (2004)
- [43] T. Toyoizumi and L. F. Abbott, Beyond the edge of chaos: Amplification and temporal integration by recurrent networks in the chaotic regime, 84, 051908 (2011)
- [44] H. Jaeger, The “echo state” approach to analysing and training recurrent neural networks, Tech. Rep. GMD Report 148 (German National Research Center for Information Technology, St. Augustin, Germany, 2001)
- [45] B. Bordelon and C. Pehlevan, Population codes enable learning from few examples by shaping inductive bias, Elife 11, e78606 (2022)
- [46] P. L. Bartlett, P. M. Long, G. Lugosi, and A. Tsigler, Benign overfitting in linear regression, Proceedings of the National Academy of Sciences 117, 30063 (2020)
- [47] J. Schuecker, S. Goedeke, and M. Helias, Optimal sequence memory in driven random networks, 8, 041029 (2018)
- [48] P. Funk, Beiträge zur theorie der kugelfunktionen, Mathematische Annalen 77, 136 (1915)
- [49] M. Belkin, D. Hsu, S. Ma, and S. Mandal, Reconciling modern machine-learning practice and the classical bias–variance trade-off, Proceedings of the National Academy of Sciences 116, 15849 (2019)
- [50] A. Destexhe and D. Paré, Impact of network activity on the integrative properties of neocortical pyramidal neurons in vivo, 81, 1531 (1999)
- [51] Z. F. Mainen and T. J. Sejnowski, Reliability of spike timing in neocortical neurons, Science 268, 1503 (1995)
- [52] A. Arieli, A. Sterkin, A. Grinvald, and A. Aertsen, Dynamics of ongoing activity: explanation of the large variability in evoked cortical responses, Science 273, 1868 (1996)
- [53] M. M. Churchland, B. M. Yu, J. P. Cunningham, L. P. Sugrue, M. R. Cohen, G. S. Corrado, W. T. Newsome, A. M. Clark, P. Hosseini, B. B. Scott, D. C. Bradley, M. A. Smith, A. Kohn, J. A. Movshon, K. M. Armstrong, T. Moore, S. W. Chang, L. H. Snyder, S. G. Lisberger, N. J. Priebe, I. M. Finn, D. Ferster, S. I. Ryu, G. Santhanam, M. Sahani, and K. V. Shenoy... (2010)
- [54] T. Tchumatchenko, A. Malyshev, T. Geisel, M. Volgushev, and F. Wolf, Correlations and synchrony in threshold neuron models, 104, 058102 (2010)
- [55] A. Rahimi and B. Recht, Random features for large-scale kernel machines, Advances in neural information processing systems 20 (2007)
- [56] L. Chizat, E. Oyallon, and F. Bach, On lazy training in differentiable programming (2019)
- [57] B. Bordelon and C. Pehlevan, Population codes enable learning from few examples by shaping inductive bias, bioRxiv: the preprint server for biology, 2021 (2022)
- [58] K. Fischer, J. Lindner, D. Dahmen, Z. Ringel, M. Krämer, and M. Helias, Critical feature learning in deep neural networks (2024), arXiv:2405.10761 [cond-mat.dis-nn]
- [59]
- [60] C. Lauditi, B. Bordelon, and C. Pehlevan, Adaptive kernel predictors from feature-learning infinite limits of neural networks (2025), arXiv:2502.07998 [cs]
- [61] J. P. Bauer, K. Fischer, M. Helias, and A. Palmigiano, A unified theory of feature learning in RNNs and DNNs (2026), arXiv:2602.15593 [cs]
- [62] D. G. Clark, B. Bordelon, J. A. Zavatone-Veth, and C. Pehlevan, Structure, disorder, and dynamics in task-trained recurrent neural circuits (2026)
- [63] A. C. C. Coolen, Statistical mechanics of recurrent neural networks II. Dynamics (2000)
- [64] J. Kadmon and H. Sompolinsky, Transition to chaos in random neuronal networks, Physical Review X 5, 041030 (2015)
- [65] J. Bradbury, R. Frostig, P. Hawkins, M. J. Johnson, C. Leary, D. Maclaurin, G. Necula, A. Paszke, J. VanderPlas, S. Wanderman-Milne, and Q. Zhang, JAX: Composable transformations of Python+NumPy programs (2018)
- [66] V. Koltchinskii and E. Giné, Random matrix approximation of spectra of integral operators, Bernoulli. Official Journal of the Bernoulli Society for Mathematical Statistics and Probability, 113 (2000)
- [67] M. L. Braun, Spectral properties of the kernel matrix and their relation to kernel methods in machine learning, (2005)
- [68] P. K. Suetin, Ultraspherical polynomials, Encyclopaedia of Mathematics, Springer, Berlin (2001).
Appendix A: Model-independent mean-field theory for random networks. This section presents a self-contained derivation of the model-independent mean-field theory for networks with Gaussian random connectivity J_ij i.i.d. ∼ N(ḡ/N, g^2/N). This formalism is the basi...
- [69] At time s, both variables are in state ϕ^α_s = ϕ^β_s = 1 and there is no update within [s, t]. The probability for this event is e^{−(t−s)/τ} p(ϕ^α_s = 1, ϕ^β_s = 1 ∣ h^α h^β).
- [70] At time s, variable ϕ^β_s is in state ϕ^β_s = 1 and ϕ^α_s is arbitrary, which happens with the probability that the last update took ϕ^β_s to the up-state, which is p(ϕ^β_s = 1) = ∫_{−∞}^{s} (ds′/τ) e^{−(s−s′)/τ} T_p(h^β_{s′}), and within [s, t] the last update that brought ϕ^α into state ϕ^α_t = 1. The probability for the joint occurrence of this event is p[≥ 1 update in [s, t], ϕ^α_t = 1...
- [71] Limiting cases. Small decorrelation: Let c = 2√π g^2 ⟨T′(h)⟩_{N(0, Q_0)}. Defining a small decorrelation ϵ_t = Q_0 − Q^{12}_{tt}, [13] finds (τ ∂_t + 1) ϵ_t = c √ϵ_t with solution ϵ_t = (c − (c − √ϵ_0) e^{−t/2τ})^2. To analyze the discontinuity, we take the limit ϵ_0 → 0, ϵ_t = (c (1 − e^{−t/2τ}) + √ϵ_0 e^{−t/2τ})^2 ≃ c^2 (1 − e^{−t/2τ})^2, giving a drop ∆_t/c^2 = (1 − e^{−t/2τ})^2 that is finite for any finite time. Mic...
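The closed-form decorrelation quoted in the last fragment can be checked numerically. The sketch below integrates τ dϵ/dt + ϵ = c √ϵ with a fourth-order Runge-Kutta scheme and compares it with ϵ_t = (c − (c − √ϵ_0) e^{−t/2τ})^2. The parameter values (c = 0.8, ϵ_0 = 0.01, τ = 1, t = 3), the function names, and the choice of integrator are assumptions made for illustration; none of them come from the paper.

```python
import math

def eps_closed_form(t, c, eps0, tau):
    """Quoted solution: (c - (c - sqrt(eps0)) * exp(-t / (2 * tau)))**2."""
    return (c - (c - math.sqrt(eps0)) * math.exp(-t / (2.0 * tau))) ** 2

def eps_numeric(t_end, c, eps0, tau, steps=20000):
    """RK4 integration of tau * d(eps)/dt + eps = c * sqrt(eps)."""
    def rhs(e):
        # max(...) guards the square root against tiny negative round-off
        return (c * math.sqrt(max(e, 0.0)) - e) / tau
    dt = t_end / steps
    e = eps0
    for _ in range(steps):
        k1 = rhs(e)
        k2 = rhs(e + 0.5 * dt * k1)
        k3 = rhs(e + 0.5 * dt * k2)
        k4 = rhs(e + dt * k3)
        e += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return e

# Hypothetical parameter values chosen only for the check.
c, eps0, tau, t_end = 0.8, 0.01, 1.0, 3.0
closed = eps_closed_form(t_end, c, eps0, tau)
numeric = eps_numeric(t_end, c, eps0, tau)

# Limit eps0 -> 0: the drop relative to c^2 is (1 - exp(-t / (2 * tau)))**2.
drop_ratio = (1.0 - math.exp(-t_end / (2.0 * tau))) ** 2
limit_value = eps_closed_form(t_end, c, 0.0, tau)
print(closed, numeric, drop_ratio)
```

The substitution u = √ϵ turns the equation into the linear relaxation 2τ du/dt + u = c, which is why the closed form is a squared exponential approach to c. The numerical integration starts from ϵ_0 > 0 because ϵ = 0 is itself a stationary solution of the differential equation.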