Algorithms achieve O(T^{1/2}) regret in contextual Stackelberg games via reduction to linear contextual bandits, improving on prior O(T^{2/3}) rates.
Improved re- gret bounds for oracle-based adversarial contextual bandits.Advances in Neural Information Processing Systems, 29
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.LG 1years
2025 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Nearly-Optimal Bandit Learning in Stackelberg Games with Side Information
Algorithms achieve O(T^{1/2}) regret in contextual Stackelberg games via reduction to linear contextual bandits, improving on prior O(T^{2/3}) rates.