arxiv: 1404.3862 · v4 · submitted 2014-04-15 · 📊 stat.ML · cs.AI · cs.LG

Recognition: unknown

Optimizing the CVaR via Sampling

Aviv Tamar, Shie Mannor, Yonatan Glassner

classification 📊 stat.ML · cs.AI · cs.LG
keywords cvar, gradient, conditional, consider, domains, estimator, formula, method

Conditional Value at Risk (CVaR) is a prominent risk measure that is used extensively in various domains. We develop a new formula for the gradient of the CVaR in the form of a conditional expectation. Based on this formula, we propose a novel sampling-based estimator for the CVaR gradient, in the spirit of the likelihood-ratio method. We analyze the bias of the estimator, and prove the convergence of a corresponding stochastic gradient descent algorithm to a local CVaR optimum. Our method makes it possible to consider CVaR optimization in new domains. As an example, we consider a reinforcement learning application, and learn a risk-sensitive controller for the game of Tetris.
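
To make the idea of a sampling-based, likelihood-ratio-style CVaR gradient estimate concrete, here is a minimal sketch on a toy problem. It is an illustration under stated assumptions, not the paper's implementation: the outcome is assumed to be Z ~ N(theta, 1), the CVaR is taken over the lower alpha-tail, the VaR is replaced by an empirical quantile, and the function name `cvar_gradient_estimate` is hypothetical. The weighting of each tail sample by its score function times its distance from the VaR follows the general shape the abstract describes (a conditional expectation estimated from samples); the exact formula and its bias analysis should be taken from the paper itself.

```python
import math
import random


def cvar_gradient_estimate(theta, alpha, n_samples, seed=0):
    """Toy likelihood-ratio-style estimate of d/dtheta CVaR_alpha(Z).

    Assumption for illustration only: Z ~ N(theta, 1), lower-tail CVaR.
    Returns (empirical VaR, estimated CVaR gradient).
    """
    rng = random.Random(seed)
    samples = [rng.gauss(theta, 1.0) for _ in range(n_samples)]

    # Empirical alpha-quantile of the samples stands in for the true VaR.
    ordered = sorted(samples)
    k = max(1, math.ceil(alpha * n_samples))
    var_hat = ordered[k - 1]

    # Score function of N(theta, 1): d/dtheta log f(z; theta) = z - theta.
    # Only samples in the lower tail (z <= var_hat) contribute.
    total = 0.0
    for z in samples:
        if z <= var_hat:
            total += (z - theta) * (z - var_hat)

    return var_hat, total / (alpha * n_samples)


if __name__ == "__main__":
    var_hat, grad = cvar_gradient_estimate(theta=0.5, alpha=0.05, n_samples=200_000)
    print(f"empirical VaR: {var_hat:.3f}, estimated CVaR gradient: {grad:.3f}")
```

In this toy setting, shifting the mean shifts the whole distribution, so the true derivative of the CVaR with respect to theta is exactly 1, which gives a simple sanity check for the estimate. Using an empirical quantile in place of the true VaR makes the estimate biased for finite sample sizes.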

This paper has not been read by Pith yet.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Towards Affordable Energy: A Gymnasium Environment for Electric Utility Demand-Response Programs

    cs.AI 2026-05 unverdicted novelty 7.0

    DR-Gym is a new Gymnasium-compatible simulator for training utility demand-response policies with regime-switching wholesale prices and physics-based building demand.

  2. Concrete Problems in AI Safety

    cs.AI 2016-06 accept novelty 7.0

    The paper categorizes five concrete AI safety problems arising from flawed objectives, costly evaluation, and learning dynamics.