Timely communications for remote inference
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Online Learning of Whittle Indices for Restless Bandits with Non-Stationary Transition Kernels
Proposes the SW-Whittle policy, which achieves sublinear dynamic regret for restless bandits with unknown, non-stationary transition kernels by using sliding windows and bandit-over-bandit window tuning.