
arxiv: 1809.05867 · v1 · pith:GPMTNEBO · submitted 2018-09-16 · 🧮 math.OC

Continuous-Time Robust Dynamic Programming

classification 🧮 math.OC
keywords dynamic, adaptive, continuous-time, control, framework, methods, optimal, programming


This paper presents a new theory, known as robust dynamic programming, for a class of continuous-time dynamical systems. Different from traditional dynamic programming (DP) methods, this new theory serves as a fundamental tool to analyze the robustness of DP algorithms, and in particular, to develop novel adaptive optimal control and reinforcement learning methods. In order to demonstrate the potential of this new framework, four illustrative applications in the fields of stochastic optimal control and adaptive DP are presented. Three numerical examples arising from both finance and engineering industries are also given, along with several possible extensions of the proposed framework.



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. DVPO: Distributional Value Modeling-based Policy Optimization for LLM Post-Training

    cs.LG 2025-12 unverdicted novelty 5.0

    DVPO learns token-level value distributions and uses asymmetric risk regularization to contract lower tails while expanding upper tails, outperforming PPO and GRPO under noisy supervision in dialogue, math, and QA tasks.