On the Spatiotemporal Dynamics of Generalization in Neural Networks

Zichao Wei

classification cs.LG cs.AI

keywords computation, neural, addition, automaton, cellular, digit, dynamics, generalization

Why do neural networks fail to generalize addition from 16-digit to 32-digit numbers, while a child who learns the rule can apply it to arbitrarily long sequences? We argue that this failure is not an engineering problem but a violation of physical postulates. Drawing inspiration from physics, we identify three constraints that any generalizing system must satisfy: (1) Locality -- information propagates at finite speed; (2) Symmetry -- the laws of computation are invariant across space and time; (3) Stability -- the system converges to discrete attractors that resist noise accumulation. From these postulates, we derive -- rather than design -- the Spatiotemporal Evolution with Attractor Dynamics (SEAD) architecture: a neural cellular automaton where local convolutional rules are iterated until convergence. Experiments on three tasks validate our theory: (1) Parity -- demonstrating perfect length generalization via light-cone propagation; (2) Addition -- achieving scale-invariant inference from L=16 to L=1 million with 100% accuracy, exhibiting input-adaptive computation; (3) Rule 110 -- learning a Turing-complete cellular automaton without trajectory divergence. Our results suggest that the gap between statistical learning and logical reasoning can be bridged -- not by scaling parameters, but by respecting the physics of computation.
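The three postulates in the abstract can be illustrated with a small hand-written cellular automaton for binary addition. This is only a sketch under stated assumptions, not the SEAD model: the paper learns its local rule with convolutions, whereas the rule below is fixed by hand, and the names `step` and `add_by_local_iteration` are hypothetical. Each cell reads only its less-significant neighbor (locality), every cell applies the same rule at every step (symmetry), and iteration stops at a fixed point where no carries remain (stability).

```python
def step(sums, carries):
    """One synchronous update. Bits are little-endian: index 0 is least significant.

    Every cell applies the same radius-1 rule: it reads its own sum bit and the
    carry held by its less-significant neighbor.
    """
    new_sums, new_carries = [], []
    for i in range(len(sums)):
        carry_in = carries[i - 1] if i > 0 else 0
        new_sums.append(sums[i] ^ carry_in)     # absorb the incoming carry
        new_carries.append(sums[i] & carry_in)  # emit a carry for the next cell
    return new_sums, new_carries


def add_by_local_iteration(a, b, width):
    """Add two non-negative integers by iterating the local rule to a fixed point.

    Returns (result, steps), where steps is the number of updates needed.
    `width` must leave one spare cell for the final carry-out.
    """
    if a < 0 or b < 0 or max(a, b).bit_length() >= width:
        raise ValueError("width must exceed the bit length of both operands")
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    sums = [x ^ y for x, y in zip(a_bits, b_bits)]
    carries = [x & y for x, y in zip(a_bits, b_bits)]
    steps = 0
    while any(carries):  # fixed point: no carry is left anywhere
        sums, carries = step(sums, carries)
        steps += 1
    result = sum(bit << i for i, bit in enumerate(sums))
    return result, steps


print(add_by_local_iteration(5, 3, 8))  # (8, 3)
print(add_by_local_iteration(8, 4, 8))  # (12, 0)
```

The step count tracks the longest carry chain in the input rather than the width of the grid, which is one simple reading of the "input-adaptive computation" the abstract describes. Because the rule never refers to an absolute position or to the total length, the same code works unchanged for any `width`.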

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

On the Mirage of Long-Range Dependency, with an Application to Integer Multiplication
cs.LG 2026-03 unverdicted novelty 8.0

Long-range dependency in integer multiplication is a mirage from 1D representation; a 2D grid reduces it to local 3x3 operations, letting a 321-parameter neural cellular automaton generalize perfectly to inputs 683 ti...

Structural Generalization on SLOG without Hand-Written Rules
cs.CL 2026-04 unverdicted novelty 7.0

A neural cellular automaton model learns all compositional rules from data via local iteration and achieves 100% type-exact match on 11 of 17 structural generalization categories on the SLOG benchmark.

On the Emergence of Syntax by Means of Local Interaction
cs.CL 2026-04 unverdicted novelty 7.0

A 2D neural cellular automaton spontaneously self-organizes into a Proto-CKY representation that exhibits syntactic processing capabilities for context-free grammars when trained on membership problems.

Structural Generalization on SLOG without Hand-Written Rules
cs.CL 2026-04 unverdicted novelty 6.0

A neural cellular automaton learns compositional rules from data alone to achieve structural generalization on the SLOG semantic parsing benchmark, reaching 67.3% accuracy and fully succeeding on 11 of 17 categories.