pith. sign in

arxiv: 1511.06279 · v4 · pith:XD2QXGL4new · submitted 2015-11-19 · 💻 cs.LG · cs.NE

Neural Programmer-Interpreters

classification 💻 cs.LG cs.NE
keywords programslearnsmemoryneuralprogramrecurrentcompositionalexecute
0
0 comments X
read the original abstract

We propose the neural programmer-interpreter (NPI): a recurrent and compositional neural network that learns to represent and execute programs. NPI has three learnable components: a task-agnostic recurrent core, a persistent key-value program memory, and domain-specific encoders that enable a single NPI to operate in multiple perceptually diverse environments with distinct affordances. By learning to compose lower-level programs to express higher-level programs, NPI reduces sample complexity and increases generalization ability compared to sequence-to-sequence LSTMs. The program memory allows efficient learning of additional tasks by building on existing programs. NPI can also harness the environment (e.g. a scratch pad with read-write pointers) to cache intermediate results of computation, lessening the long-term memory burden on recurrent hidden units. In this work we train the NPI with fully-supervised execution traces; each program has example sequences of calls to the immediate subprograms conditioned on the input. Rather than training on a huge number of relatively weak labels, NPI learns from a small number of rich examples. We demonstrate the capability of our model to learn several types of compositional programs: addition, sorting, and canonicalizing 3D models. Furthermore, a single NPI learns to execute these programs and all 21 associated subprograms.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 10 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Gradient-Based Program Synthesis with Neurally Interpreted Languages

    cs.LG 2026-04 unverdicted novelty 8.0

    NLI autonomously discovers a vocabulary of primitive operations and interprets variable-length programs via a neural executor, allowing end-to-end training and gradient-based test-time adaptation that outperforms prio...

  2. Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets

    cs.LG 2022-01 unverdicted novelty 8.0

    Neural networks exhibit grokking on small algorithmic datasets, achieving perfect generalization well after overfitting.

  3. Show Your Work: Scratchpads for Intermediate Computation with Language Models

    cs.LG 2021-11 unverdicted novelty 8.0

    Training language models to generate intermediate computation steps on a scratchpad enables them to perform multi-step tasks such as long addition and arbitrary program execution that they otherwise fail at.

  4. Adaptive Computation Time for Recurrent Neural Networks

    cs.NE 2016-03 accept novelty 8.0

    ACT lets RNNs dynamically adapt computation depth per input via a differentiable halting unit, yielding large gains on synthetic tasks and structural insights on language data.

  5. Training Transformers as a Universal Computer

    cs.AI 2026-04 unverdicted novelty 7.0

    A transformer trained on random meaningless MicroPy programs generalizes to execute diverse human-written programs, providing empirical evidence it can act as a universal computer.

  6. Solving math word problems with process- and outcome-based feedback

    cs.LG 2022-11 unverdicted novelty 6.0

    On GSM8K, outcome-based supervision achieves similar final-answer error rates to process-based with less labeling, but process-based or learned reward models are needed to reach 3.4% reasoning error among correct solutions.

  7. Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges

    cs.LG 2021-04 accept novelty 6.0

    Geometric deep learning provides a unified mathematical framework based on grids, groups, graphs, geodesics, and gauges to explain and extend neural network architectures by incorporating physical regularities.

  8. CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation

    cs.SE 2021-02 unverdicted novelty 6.0

    CodeXGLUE supplies a standardized collection of 10 code-related tasks, 14 datasets, an evaluation platform, and BERT-, GPT-, and encoder-decoder-style baselines.

  9. Neural Computers

    cs.LG 2026-04 unverdicted novelty 5.0

    Neural Computers are introduced as a new machine form where computation, memory, and I/O are unified in a learned runtime state, with initial video-model experiments showing acquisition of basic interface primitives f...

  10. Why Build an Assistant in Minecraft?

    cs.AI 2019-07 unverdicted novelty 4.0

    A rationale is presented for developing an assistant in Minecraft to advance natural language understanding and dialogue learning.