Recognition: unknown
Generalization without systematicity: On the compositional skills of sequence-to-sequence recurrent networks
Humans can understand and produce new utterances effortlessly, thanks to their compositional skills. Once a person learns the meaning of a new verb "dax," he or she can immediately understand the meaning of "dax twice" or "sing and dax." In this paper, we introduce the SCAN domain, consisting of a set of simple compositional navigation commands paired with the corresponding action sequences. We then test the zero-shot generalization capabilities of a variety of recurrent neural networks (RNNs) trained on SCAN with sequence-to-sequence methods. We find that RNNs can make successful zero-shot generalizations when the differences between training and test commands are small, so that they can apply "mix-and-match" strategies to solve the task. However, when generalization requires systematic compositional skills (as in the "dax" example above), RNNs fail spectacularly. We conclude with a proof-of-concept experiment in neural machine translation, suggesting that lack of systematicity might be partially responsible for neural networks' notorious training data thirst.
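To make the compositional setup concrete, here is a minimal sketch of a SCAN-style command interpreter, assuming a heavily simplified toy grammar of my own; the actual SCAN domain uses a richer command language (directions, "opposite", "around", "after") and different action tokens, so this only illustrates how modifiers like "twice" and conjunctions like "and" compose primitives, and why a newly learned primitive such as "dax" should, in principle, license all of its compositions.

```python
# A toy, hypothetical SCAN-style interpreter (illustration only, not the
# paper's actual grammar or data-generation code). Primitive commands map to
# primitive actions; "twice" and "and" compose them recursively.

PRIMITIVES = {"jump": ["JUMP"], "walk": ["WALK"], "sing": ["SING"]}

def interpret(command):
    """Translate a toy command into its action sequence."""
    if " and " in command:                        # "x and y" -> actions(x) + actions(y)
        left, right = command.split(" and ", 1)
        return interpret(left) + interpret(right)
    if command.endswith(" twice"):                # "x twice" -> actions(x) repeated
        return interpret(command[:-len(" twice")]) * 2
    return PRIMITIVES[command]                    # bare primitive

# The systematicity test from the abstract: add one new primitive ("dax") and
# every composition involving it should follow with no further examples.
PRIMITIVES["dax"] = ["DAX"]
print(interpret("dax twice"))     # ['DAX', 'DAX']
print(interpret("sing and dax"))  # ['SING', 'DAX']
```

A seq2seq RNN has no such rule built in; the paper's question is whether it can induce this kind of composition from command-action pairs alone.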
This paper has not been read by Pith yet.
Forward citations
Cited by 7 Pith papers
- Structural Generalization on SLOG without Hand-Written Rules
  A neural cellular automaton model learns all compositional rules from data via local iteration and achieves 100% type-exact match on 11 of 17 structural generalization categories on the SLOG benchmark.
- Training Transformers as a Universal Computer
  A transformer trained on random, meaningless MicroPy programs generalizes to execute diverse human-written programs, providing empirical evidence that it can act as a universal computer.
- On the Emergence of Syntax by Means of Local Interaction
  A 2D neural cellular automaton spontaneously self-organizes into a Proto-CKY representation that exhibits syntactic processing capabilities for context-free grammars when trained on membership problems.
- Structural Generalization on SLOG without Hand-Written Rules
  A neural cellular automaton learns compositional rules from data alone to achieve structural generalization on the SLOG semantic parsing benchmark, reaching 67.3% accuracy and fully succeeding on 11 of 17 categories.
- HypEHR: Hyperbolic Modeling of Electronic Health Records for Efficient Question Answering
  HypEHR is a hyperbolic embedding model for EHR data that uses Lorentzian geometry and hierarchy-aware pretraining to answer clinical questions nearly as well as large language models while being much smaller.
- LLMs for Text-Based Exploration and Navigation Under Partial Observability
  Reasoning-tuned LLMs reliably complete navigation in partially observable gridworlds but take longer paths than the oracle optimum; few-shot prompting reduces invalid moves, while action priors such as UP/RIGHT cause loops.
- How Psychological Learning Paradigms Shaped and Constrained Artificial Intelligence
  AI's compositional reasoning failures originate in the psychological learning paradigms that shaped its architectures, and the ReSynth trimodular framework is proposed to embed systematicity structurally.