Continuous-Time Robust Dynamic Programming

arxiv: 1809.05867 · v1 · pith:GPMTNEBOnew · submitted 2018-09-16 · 🧮 math.OC

Tao Bian, Zhong-Ping Jiang

classification 🧮 math.OC

keywords dynamic, adaptive, continuous-time, control, framework, methods, optimal, programming

Abstract

This paper presents a new theory, known as robust dynamic programming, for a class of continuous-time dynamical systems. Different from traditional dynamic programming (DP) methods, this new theory serves as a fundamental tool to analyze the robustness of DP algorithms, and in particular, to develop novel adaptive optimal control and reinforcement learning methods. In order to demonstrate the potential of this new framework, four illustrative applications in the fields of stochastic optimal control and adaptive DP are presented. Three numerical examples arising from both finance and engineering industries are also given, along with several possible extensions of the proposed framework.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

DVPO: Distributional Value Modeling-based Policy Optimization for LLM Post-Training
cs.LG 2025-12 unverdicted novelty 5.0

DVPO learns token-level value distributions and uses asymmetric risk regularization to contract lower tails while expanding upper tails, outperforming PPO and GRPO under noisy supervision in dialogue, math, and QA tasks.