arxiv: 1505.00369 · v3 · submitted 2015-05-02 · 🧮 math.ST · stat.TH

Batched bandit problems

keywords bandits, batches, number, optimal, policy, regret, small, stochastic

Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic bandits under the constraint that the employed policy must split trials into a small number of batches. We propose a simple policy, and show that a very small number of batches gives close to minimax optimal regret bounds. As a byproduct, we derive optimal policies with low switching cost for stochastic bandits.
