and Ryu, E
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.LG (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
roles: background (1)
polarities: background (1)
representative citing papers
Natural policy gradient is a special case of doubly smoothed policy iteration that achieves distribution-free global geometric convergence to an epsilon-optimal policy in O((1-gamma)^{-1} log((1-gamma)^{-1} epsilon^{-1})) iterations.
citing papers explorer
- Policy Gradient for Continuous-Time Robust Markov Decision Processes
  Extends robust MDPs to continuous time with policy gradient derivations using differential equation methods and proposes optimizers achieving linear convergence and specific sample complexities.
- Natural Policy Gradient as Doubly Smoothed Policy Iteration: A Bellman-Operator Framework
  Natural policy gradient is a special case of doubly smoothed policy iteration that achieves distribution-free global geometric convergence to an epsilon-optimal policy in O((1-gamma)^{-1} log((1-gamma)^{-1} epsilon^{-1})) iterations.