Bandit Algorithms

Lattimore, T · 2020 · DOI 10.1017/9781108571401

7 Pith papers cite this work. Polarity classification is still being indexed.

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Mean-based algorithms: A lower bound and regret

cs.LG · 2026-06-03 · unverdicted · novelty 7.0

Derives the first lower bound on γ_t for mean-based algorithms in unknown-horizon bandit settings, proposes two new algorithms, and shows that some of them are also no-regret.

Prior-Free Sample Size Design for Test-and-Roll Experiments

econ.EM · 2026-05-04 · unverdicted · novelty 7.0

The paper introduces the Worst-case Marginal Benefit (WMB) criterion for sample-size design in test-and-roll experiments and shows it yields an optimal m approximately equal to N/3 for Bernoulli and Gaussian outcomes.

Target-Aware Bandit Allocation for Scalable Surrogate Optimization in Chemical Space

cs.LG · 2026-06-25 · unverdicted · novelty 6.0

Introduces BOBa, a multi-armed bandit method for scalable surrogate optimization that adaptively allocates inference and evaluations to promising partitions of ultra-large chemical libraries.

A Demon that remembers: An agential approach towards quantum thermodynamics of temporal correlations

quant-ph · 2026-04-06 · unverdicted · novelty 6.0

A classical agent extracts more work from quantum temporal correlations via adaptive strategies bounded by the new Time-Ordered Free Energy, while reinforcement learning achieves polylogarithmic dissipation when learning unknown states.

Contextual Scalarisation Thompson Sampling for multi-objective decisions in public media

cs.IR · 2026-05-29 · unverdicted · novelty 4.0

CSTS learns context-dependent weights for multiple objectives in a multi-objective contextual bandit and outperforms fixed-weight and standard contextual bandit baselines on Swiss public broadcaster programming data.

Efficient and Principled Scientific Discovery through Bayesian Optimization: A Tutorial

cs.LG · 2026-04-01 · accept · novelty 2.0

Bayesian optimization automates the scientific discovery cycle by modeling observations with surrogate models and using acquisition functions to select experiments that balance known information with new exploration.

SAGA: Workflow-Atomic Scheduling for AI Agent Inference on GPU Clusters

cs.DC · 2026-05-01

citing papers explorer

Showing 1 of 1 citing paper after filters.

A Demon that remembers: An agential approach towards quantum thermodynamics of temporal correlations

quant-ph · 2026-04-06 · unverdicted · none · ref 91

A classical agent extracts more work from quantum temporal correlations via adaptive strategies bounded by the new Time-Ordered Free Energy, while reinforcement learning achieves polylogarithmic dissipation when learning unknown states.
