A Survey of Online Experiment Design with the Stochastic Multi-Armed Bandit

2015 · stat.ML · arXiv 1510.00757

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

Adaptive and sequential experiment design is a well-studied area in numerous domains. We survey and synthesize the work of the online statistical learning paradigm referred to as multi-armed bandits, integrating the existing research as a resource for a certain class of online experiments. We first explore the traditional stochastic model of a multi-armed bandit, then explore a taxonomic scheme of complications to that model, relating each complication to a specific requirement or consideration of the experiment design context. Finally, at the end of the paper, we present a table of known upper bounds of regret for all studied algorithms, providing both perspectives for future theoretical work and a decision-making tool for practitioners looking for theoretical guarantees.

representative citing papers

Productization Challenges of Contextual Multi-Armed Bandits

cs.IR · 2019-07-10 · accept · novelty 3.0

The authors enumerate and address six productization challenges encountered while running contextual multi-armed bandits for two large-scale web use cases.

citing papers explorer

Productization Challenges of Contextual Multi-Armed Bandits cs.IR · 2019-07-10 · accept · none · ref 2 · internal anchor
