Algorithms achieve O(T^{1/2}) regret in contextual Stackelberg games via reduction to linear contextual bandits, improving on prior O(T^{2/3}) rates.
Smoothed analysis of algorithms: Why the simplex algorithm usually takes polynomial time. Journal of the ACM (JACM), 51(3):385–463
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Nearly-Optimal Bandit Learning in Stackelberg Games with Side Information