A Short Survey On Memory Based Reinforcement Learning

Dhruv Ramani

arxiv: 1904.06736 · v1 · pith:FNIT7S4Bnew · submitted 2019-04-14 · 💻 cs.AI


keywords learning, decision, making, memory, recent, reinforcement, high, methods


Reinforcement learning (RL) is a branch of machine learning used to solve sequential decision-making problems without explicit supervision. Owing to recent advances in deep learning, newly proposed Deep-RL algorithms perform extremely well in sophisticated high-dimensional environments. However, even after successes in many domains, one of the major challenges of these approaches is the large number of interactions with the environment required for efficient decision making. Taking inspiration from the brain, this problem can be addressed by incorporating instance-based learning, biasing decision making toward memories of high-reward experiences. This paper surveys recent reinforcement learning methods that incorporate external memory for decision making. We provide an overview of the different methods, along with their advantages and disadvantages, their applications, and the standard experimental settings used for memory-based models. This review aims to be a helpful resource that offers key insights into recent advances in the field and supports its further development.


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

TACTFUL: Tactile-Driven Exploration For Object Localization and Identification in Confined Environments
cs.RO 2026-06 unverdicted novelty 6.0

TACTFUL introduces a vision-free tactile policy for robotic exploration and object identification in confined workspaces, trained on real hardware and achieving 77% success with 0.015 m reconstruction error.