pith. machine review for the scientific record.

arxiv: 1609.04436 · v1 · submitted 2016-09-14 · 💻 cs.AI · cs.LG · stat.ML

Recognition: unknown

Bayesian Reinforcement Learning: A Survey

Authors on Pith: no claims yet
classification: 💻 cs.AI · cs.LG · stat.ML
keywords: bayesian methods, learning algorithms, prior, survey, expressed, function
abstract

Bayesian methods for machine learning have been widely investigated, yielding principled methods for incorporating prior information into inference algorithms. In this survey, we provide an in-depth review of the role of Bayesian methods for the reinforcement learning (RL) paradigm. The major incentives for incorporating Bayesian reasoning in RL are: 1) it provides an elegant approach to action-selection (exploration/exploitation) as a function of the uncertainty in learning; and 2) it provides a machinery to incorporate prior knowledge into the algorithms. We first discuss models and methods for Bayesian inference in the simple single-step Bandit model. We then review the extensive recent literature on Bayesian methods for model-based RL, where prior information can be expressed on the parameters of the Markov model. We also present Bayesian methods for model-free RL, where priors are expressed over the value function or policy class. The objective of the paper is to provide a comprehensive survey on Bayesian RL algorithms and their theoretical and empirical properties.
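The abstract's first incentive, action-selection as a function of uncertainty, is commonly illustrated with Thompson sampling in the single-step Bernoulli bandit the survey opens with. The sketch below is not taken from the paper itself; the arm probabilities and the Beta(1, 1) prior are illustrative assumptions.

```python
import random

def thompson_sampling(true_probs, horizon=2000, seed=0):
    """Thompson sampling for a Bernoulli bandit with Beta(1, 1) priors.

    The posterior over each arm's success probability is Beta(a, b);
    acting greedily on a posterior *sample* explores arms in proportion
    to the remaining uncertainty about them.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    a = [1] * n_arms       # prior alpha, incremented on each success
    b = [1] * n_arms       # prior beta, incremented on each failure
    pulls = [0] * n_arms
    for _ in range(horizon):
        # Draw one plausible success probability per arm from its posterior
        samples = [rng.betavariate(a[i], b[i]) for i in range(n_arms)]
        arm = max(range(n_arms), key=samples.__getitem__)
        reward = 1 if rng.random() < true_probs[arm] else 0
        a[arm] += reward
        b[arm] += 1 - reward
        pulls[arm] += 1
    return pulls

pulls = thompson_sampling([0.3, 0.5, 0.7])
```

As the posteriors sharpen, samples from clearly inferior arms rarely win the argmax, so play concentrates on the best arm without any explicit exploration schedule.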

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Meta-Learning and Meta-Reinforcement Learning -- Tracing the Path towards DeepMind's Adaptive Agent

    cs.AI · 2026-02 · unverdicted · novelty 2.0

    A survey providing a task-based formalization of meta-learning and meta-RL while chronicling the algorithms leading to DeepMind's Adaptive Agent.