pith. machine review for the scientific record.

arXiv: 1606.04460 · v1 · submitted 2016-06-14 · 📊 stat.ML · cs.LG · q-bio.NC

Recognition: unknown

Model-Free Episodic Control

classification 📊 stat.ML cs.LG q-bio.NC
keywords episodic, learning, algorithms, control, deep, highly, reinforcement, rewarding

State of the art deep reinforcement learning algorithms take many millions of interactions to attain human-level performance. Humans, on the other hand, can very quickly exploit highly rewarding nuances of an environment upon first discovery. In the brain, such rapid learning is thought to depend on the hippocampus and its capacity for episodic memory. Here we investigate whether a simple model of hippocampal episodic control can learn to solve difficult sequential decision-making tasks. We demonstrate that it not only attains a highly rewarding strategy significantly faster than state-of-the-art deep reinforcement learning algorithms, but also achieves a higher overall reward on some of the more challenging domains.

This paper has not been read by Pith yet.

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. FAAST: Forward-Only Associative Learning via Closed-Form Fast Weights for Test-Time Supervised Adaptation

    cs.LG 2026-05 unverdicted novelty 6.0

    FAAST analytically compiles labeled examples into fast weights via a single forward pass, matching backprop adaptation performance with over 90% less time and up to 95% less memory than memory-based methods.

  2. Information as Structural Alignment: A Dynamical Theory of Continual Learning

    cs.LG 2026-04 unverdicted novelty 6.0

    IBF achieves near-zero forgetting and positive backward transfer in continual learning by driving configurations toward coherence through motion and modification dynamics without storing raw data.

  3. BrainMem: Brain-Inspired Evolving Memory for Embodied Agent Task Planning

    cs.RO 2026-03 unverdicted novelty 6.0

    BrainMem equips LLM-based embodied planners with working, episodic, and semantic memory that evolves interaction histories into retrievable knowledge graphs and guidelines, raising success rates on long-horizon 3D benchmarks.

  4. FAAST: Forward-Only Associative Learning via Closed-Form Fast Weights for Test-Time Supervised Adaptation

    cs.LG 2026-05 unverdicted novelty 5.0

    FAAST performs test-time supervised adaptation by analytically deriving fast weights from examples in one forward pass, matching backprop performance with over 90% less adaptation time and up to 95% memory savings ver...

  5. Artifacts as Memory Beyond the Agent Boundary

    cs.AI 2026-04 unverdicted novelty 5.0

    Artifacts in the environment can reduce the memory an RL agent needs to represent its history, as shown by a mathematical proof and experiments with spatial paths.