
arxiv: math/0510384 · v1 · pith:ZJSPQ635new · submitted 2005-10-18 · 🧮 math.PR

A penalized bandit algorithm

keywords: algorithm, convergence, distribution, limit, armed-bandit, bandit, central, characterized

We study a two-armed bandit algorithm with penalty. We show that the algorithm converges and establish its rate of convergence. For some choices of the parameters, we obtain a central limit theorem in which the limit distribution is characterized as the unique stationary distribution of a discontinuous Markov process.
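
The abstract does not state the recursion itself, so the following is only a minimal sketch of what a two-armed bandit update with a penalty term might look like, not the paper's algorithm. Here `x` is the probability of playing arm A; a success on the played arm moves `x` toward that arm by a reward step `gamma`, and a failure moves it away by the smaller step `gamma * rho`. The function name `penalized_bandit`, the success probabilities `p_a` and `p_b`, and the step sequences `gamma` and `rho` are all assumptions chosen for illustration.

```python
import random


def penalized_bandit(p_a, p_b, n_steps, x0=0.5, seed=0):
    """Simulate a reward-penalty scheme for a two-armed bandit.

    x is the probability of playing arm A. All parameter choices below
    are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    x = x0
    for n in range(1, n_steps + 1):
        gamma = 1.0 / (n + 1)        # reward step size (assumed form)
        rho = 1.0 / (n + 1) ** 0.5   # penalty weight (assumed form)
        play_a = rng.random() < x
        success = rng.random() < (p_a if play_a else p_b)
        if play_a:
            if success:
                x += gamma * (1.0 - x)        # reward: move toward arm A
            else:
                x -= gamma * rho * x          # penalty: move away from arm A
        else:
            if success:
                x -= gamma * x                # reward: move toward arm B
            else:
                x += gamma * rho * (1.0 - x)  # penalty: move away from arm B
    return x


if __name__ == "__main__":
    print(penalized_bandit(p_a=0.8, p_b=0.2, n_steps=10_000))
```

Each update is a convex combination of `x` with 0 or 1, so `x` stays in [0, 1]. Whether and how fast `x` converges depends on the choice of `gamma` and `rho`, which is the question the paper addresses; the sequences used here are placeholders and carry no such guarantee.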
