pith. machine review for the scientific record.

arxiv: 2602.01651 · v2 · submitted 2026-02-02 · 💻 cs.LG · cs.AI

Recognition: unknown

On the Spatiotemporal Dynamics of Generalization in Neural Networks

Zichao Wei

classification 💻 cs.LG cs.AI
keywords: computation, neural, addition, automaton, cellular, digit, dynamics, generalization

Why do neural networks fail to generalize addition from 16-digit to 32-digit numbers, while a child who learns the rule can apply it to arbitrarily long sequences? We argue that this failure is not an engineering problem but a violation of physical postulates. Drawing inspiration from physics, we identify three constraints that any generalizing system must satisfy: (1) Locality -- information propagates at finite speed; (2) Symmetry -- the laws of computation are invariant across space and time; (3) Stability -- the system converges to discrete attractors that resist noise accumulation. From these postulates, we derive -- rather than design -- the Spatiotemporal Evolution with Attractor Dynamics (SEAD) architecture: a neural cellular automaton where local convolutional rules are iterated until convergence. Experiments on three tasks validate our theory: (1) Parity -- demonstrating perfect length generalization via light-cone propagation; (2) Addition -- achieving scale-invariant inference from L=16 to L=1 million with 100% accuracy, exhibiting input-adaptive computation; (3) Rule 110 -- learning a Turing-complete cellular automaton without trajectory divergence. Our results suggest that the gap between statistical learning and logical reasoning can be bridged -- not by scaling parameters, but by respecting the physics of computation.
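The idea of iterating a local rule until convergence can be illustrated with a small hand-written sketch of decimal addition. This is not the SEAD model: the rule below is written by hand rather than learned by a convolution, and the names `local_step` and `add_by_local_iteration` are hypothetical. It only shows, under those assumptions, how the three postulates might look in code: each cell reads only itself and one neighbor (locality), the same rule is applied at every position and every step (symmetry), and iteration stops at a fixed point where every cell holds a valid digit (stability).

```python
def local_step(cells):
    """One synchronous update of all cells.

    cells[i] is the little-endian digit at position i, possibly >= 10
    while a carry is still pending. Each cell keeps its own digit
    (mod 10) and absorbs the carry from its lower-order neighbour only.
    """
    return [
        cells[i] % 10 + (cells[i - 1] // 10 if i > 0 else 0)
        for i in range(len(cells))
    ]


def add_by_local_iteration(a, b):
    """Add two non-negative integers by iterating local_step to a fixed point.

    Returns (sum, steps), where steps counts the updates that changed the state.
    """
    digits_a = [int(ch) for ch in reversed(str(a))]
    digits_b = [int(ch) for ch in reversed(str(b))]
    # One extra cell so the final carry has somewhere to land.
    n = max(len(digits_a), len(digits_b)) + 1
    digits_a += [0] * (n - len(digits_a))
    digits_b += [0] * (n - len(digits_b))

    # Initial state: per-position sums, not yet valid digits.
    cells = [x + y for x, y in zip(digits_a, digits_b)]

    steps = 0
    while True:
        nxt = local_step(cells)
        if nxt == cells:  # fixed point: no pending carries remain
            break
        cells = nxt
        steps += 1

    value = int("".join(str(d) for d in reversed(cells)))
    return value, steps


if __name__ == "__main__":
    print(add_by_local_iteration(123, 456))  # no carries: converges immediately
    print(add_by_local_iteration(999, 1))    # one carry travels across three cells
```

In this sketch the number of updates depends on the longest carry chain in the input rather than on the input length, which is one way to read the abstract's "input-adaptive computation". Because the rule never refers to an absolute position or a maximum length, the same function applies unchanged to operands of any size.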

This paper has not been read by Pith yet.


Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. On the Mirage of Long-Range Dependency, with an Application to Integer Multiplication

    cs.LG 2026-03 unverdicted novelty 8.0

    Long-range dependency in integer multiplication is a mirage from 1D representation; a 2D grid reduces it to local 3x3 operations, letting a 321-parameter neural cellular automaton generalize perfectly to inputs 683 ti...

  2. Structural Generalization on SLOG without Hand-Written Rules

    cs.CL 2026-04 unverdicted novelty 7.0

    A neural cellular automaton model learns all compositional rules from data via local iteration and achieves 100% type-exact match on 11 of 17 structural generalization categories on the SLOG benchmark.

  3. On the Emergence of Syntax by Means of Local Interaction

    cs.CL 2026-04 unverdicted novelty 7.0

    A 2D neural cellular automaton spontaneously self-organizes into a Proto-CKY representation that exhibits syntactic processing capabilities for context-free grammars when trained on membership problems.

  4. Structural Generalization on SLOG without Hand-Written Rules

    cs.CL 2026-04 unverdicted novelty 6.0

    A neural cellular automaton learns compositional rules from data alone to achieve structural generalization on the SLOG semantic parsing benchmark, reaching 67.3% accuracy and fully succeeding on 11 of 17 categories.