Gains and Losses are Fundamentally Different in Regret Minimization: The Sparse Case

arxiv: 1511.08405 · v1 · pith:GHOIQRXRnew · submitted 2015-11-26 · 💻 cs.LG · stat.ML

Gains and Losses are Fundamentally Different in Regret Minimization: The Sparse Case

Joon Kwon , Vianney Perchet This is my paper

classification 💻 cs.LG stat.ML

keywords lossesregretsqrtdifferentgainsoptimalorderbound

0 comments p. Extension

pith:GHOIQRXR Add to your LaTeX paper

What is a Pith Number?

\usepackage{pith}
\pithnumber{GHOIQRXR}

Prints a linked pith:GHOIQRXR badge after your title and writes the identifier into PDF metadata. Compiles on arXiv with no extra files. Learn more

read the original abstract

We demonstrate that, in the classical non-stochastic regret minimization problem with $d$ decisions, gains and losses to be respectively maximized or minimized are fundamentally different. Indeed, by considering the additional sparsity assumption (at each stage, at most $s$ decisions incur a nonzero outcome), we derive optimal regret bounds of different orders. Specifically, with gains, we obtain an optimal regret guarantee after $T$ stages of order $\sqrt{T\log s}$, so the classical dependency in the dimension is replaced by the sparsity size. With losses, we provide matching upper and lower bounds of order $\sqrt{Ts\log(d)/d}$, which is decreasing in $d$. Eventually, we also study the bandit setting, and obtain an upper bound of order $\sqrt{Ts\log (d/s)}$ when outcomes are losses. This bound is proven to be optimal up to the logarithmic factor $\sqrt{\log(d/s)}$.

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Covariance-adapting algorithm for semi-bandits with application to sparse rewards
stat.ML 2026-04 unverdicted novelty 7.0

A covariance-adapting algorithm for semi-bandits achieves asymptotically tight regret bounds under a new sub-exponential distribution family, with direct application to sparse rewards.