Efficient Risk-sensitive Planning via Entropic Risk Measures

Alexandre Marthe et al · 2025 · arXiv 2502.20423

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

On the Sample Complexity of Discounted Reinforcement Learning with Optimized Certainty Equivalents

cs.LG · 2026-05-20 · unverdicted · novelty 7.0

Characterizes utility functions making recursive OCE objectives PAC-learnable and derives matching upper and lower PAC sample complexity bounds for value and policy learning, with improved tau dependence for CVaR.

Tight Sample Complexity Bounds for Entropic Best Policy Identification

cs.LG · 2026-05-13 · unverdicted · novelty 7.0

New concentration bounds and stopping rule close the exponential gap to match the lower bound for entropic best policy identification.

Recursive Entropic Risk Optimization in Discounted MDPs: Sample Complexity Bounds with a Generative Model

cs.LG · 2025-05-30 · unverdicted · novelty 7.0

Derives PAC-type upper bounds and matching lower bounds on sample complexity for value and policy learning under recursive entropic risk measures, with exponential dependence on |β|/(1-γ).

citing papers explorer

On the Sample Complexity of Discounted Reinforcement Learning with Optimized Certainty Equivalents · cs.LG · 2026-05-20 · unverdicted · none · ref 49
Tight Sample Complexity Bounds for Entropic Best Policy Identification · cs.LG · 2026-05-13 · unverdicted · none · ref 26
Recursive Entropic Risk Optimization in Discounted MDPs: Sample Complexity Bounds with a Generative Model · cs.LG · 2025-05-30 · unverdicted · none · ref 37
