Improved algorithms for linear stochastic bandits
2 Pith papers cite this work.
Fields: cs.LG (2)
Years: 2025 (2)
Representative citing papers
- Nearly-Optimal Bandit Learning in Stackelberg Games with Side Information
Algorithms achieve O(T^{1/2}) regret in contextual Stackelberg games via reduction to linear contextual bandits, improving on prior O(T^{2/3}) rates.
- Neural Variance-aware Dueling Bandits with Deep Representation and Shallow Exploration