Thompson sampling for contextual bandits with linear payoffs
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Verdicts: unverdicted (2)
Citing papers
- Budget-Constrained Causal Bandits: Bridging Uplift Modeling and Sequential Decision-Making
  BCCB unifies learning of heterogeneous ad responses, exploration of uncertain users, and budget pacing into a single online process that works effectively from the first user on the Criteo Uplift dataset.
- RIE-Greedy: Regularization-Induced Exploration for Contextual Bandits
  RIE-Greedy uses stochasticity from cross-validation regularization to induce Thompson Sampling-like exploration, claimed equivalent in the two-armed case and empirically competitive in large-scale settings.