Batched bandit problems

Vianney Perchet, Philippe Rigollet, Sylvain Chassang, Erik Snowberg

classification math.ST stat.TH

keywords bandits, batches, number, optimal, policy, regret, small, stochastic

Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic bandits under the constraint that the employed policy must split trials into a small number of batches. We propose a simple policy, and show that a very small number of batches gives close to minimax optimal regret bounds. As a byproduct, we derive optimal policies with low switching cost for stochastic bandits.
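To make the batching constraint concrete, here is a minimal sketch of a batched elimination policy for Bernoulli arms. It is an illustration only and is not claimed to be the policy proposed in the paper: the function name `batched_elimination`, the equal-size batch grid, the round-robin sampling within a batch, and the Hoeffding-style confidence radius are all assumptions made for this example. The one feature it is meant to show is that the set of arms being played changes only at batch boundaries, never within a batch.

```python
import math
import random


def batched_elimination(means, horizon, num_batches, seed=0):
    """Simulate a Bernoulli bandit whose policy may adapt only between batches.

    Hypothetical illustration: equal-size batches and a Hoeffding-style
    radius are assumptions, not details taken from the paper.
    """
    k = len(means)
    if horizon < num_batches * k:
        raise ValueError("each batch must be long enough to pull every arm once")

    rng = random.Random(seed)
    active = list(range(k))          # arms still in play
    pulls = [0] * k                  # number of pulls per arm
    wins = [0] * k                   # number of successes per arm
    batch_size = horizon // num_batches
    t = 0

    for b in range(num_batches):
        # The last batch absorbs any remainder so that exactly `horizon` pulls occur.
        end = horizon if b == num_batches - 1 else t + batch_size

        # Within a batch: round-robin over the active arms, with no adaptation.
        i = 0
        while t < end:
            arm = active[i % len(active)]
            wins[arm] += 1 if rng.random() < means[arm] else 0
            pulls[arm] += 1
            t += 1
            i += 1

        # Between batches: drop arms whose upper bound falls below the best lower bound.
        radius = {a: math.sqrt(2.0 * math.log(horizon) / pulls[a]) for a in active}
        mean_hat = {a: wins[a] / pulls[a] for a in active}
        best_lower = max(mean_hat[a] - radius[a] for a in active)
        active = [a for a in active if mean_hat[a] + radius[a] >= best_lower]

    # Pseudo-regret: expected loss relative to always playing the best arm.
    best = max(means)
    regret = sum(pulls[a] * (best - means[a]) for a in range(k))
    return {"pulls": pulls, "active": active, "regret": regret}


if __name__ == "__main__":
    print(batched_elimination([0.9, 0.1], horizon=2000, num_batches=4))
```

In this sketch the number of policy changes is at most the number of batches, which is the sense in which a batched policy also has low switching cost.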
