Recognition: unknown
Online Convex Optimization with Time-Varying Constraints
read the original abstract
This paper considers online convex optimization with time-varying constraint functions. Specifically, we have a sequence of convex objective functions $\{f_t(x)\}_{t=0}^{\infty}$ and convex constraint functions $\{g_{t,i}(x)\}_{t=0}^{\infty}$ for $i \in \{1, ..., k\}$. The functions are gradually revealed over time. For a given $\epsilon>0$, the goal is to choose points $x_t$ every step $t$, without knowing the $f_t$ and $g_{t,i}$ functions on that step, to achieve a time average at most $\epsilon$ worse than the best fixed-decision that could be chosen with hindsight, subject to the time average of the constraint functions being nonpositive. It is known that this goal is generally impossible. This paper develops an online algorithm that solves the problem with $O(1/\epsilon^2)$ convergence time in the special case when all constraint functions are nonpositive over a common subset of $\mathbb{R}^n$. Similar performance is shown in an expected sense when the common subset assumption is removed but the constraint functions are assumed to vary according to a random process that is independent and identically distributed (i.i.d.) over time slots $t \in \{0, 1, 2, \ldots\}$. Finally, in the special case when both the constraint and objective functions are i.i.d. over time slots $t$, the algorithm is shown to come within $\epsilon$ of optimality with respect to the best (possibly time-varying) causal policy that knows the full probability distribution.
This paper has not been read by Pith yet.
Forward citations
Cited by 2 Pith papers
-
Convex Optimization with Nested Evolving Feasible Sets
For convex losses in nested evolving feasible sets, a lazy algorithm balances O(T^{1-β}) regret with O(T^β) movement for any β; for strongly convex or sharp losses, Frugal achieves zero regret with O(log T) movement, ...
-
Constrained Contextual Bandits with Adversarial Contexts
A modular reduction from budget-constrained contextual bandits with adversarial contexts to unconstrained bandits via surrogate rewards, yielding improved guarantees and an efficient algorithm based on SquareCB.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.