For the number of rounds and statistics, we used the same settings as in the main synthetic bandit experiments.

We swept the variance inflation scale over c2 ∈ {2, 3, 4} and compared it with standard ReMax, TS, and KL-UCB.

representative citing papers

Finite-Time Regret Analysis of Retry-Aware Bandits

cs.LG · 2026-05-20 · unverdicted · novelty 7.0

ReMax achieves the first sublinear finite-time regret bound for Gaussian bandits with M=2 by deriving an expected-improvement balance condition for its optimal sampling distribution and separating saturation from underestimation effects.
