Robust dynamic programming

· 2005

2 Pith papers cite this work. Polarity classification is still indexing.



representative citing papers

Optimistic Policy Learning under Pessimistic Adversaries with Regret and Violation Guarantees

cs.LG · 2026-04-15 · unverdicted · novelty 8.0

RHC-UCRL is the first algorithm for safety-constrained RL under explicit adversarial dynamics, providing sub-linear regret and constraint violation guarantees by maintaining optimism over both agent and adversary policies.

Regret-Optimal Control for Finite-State Systems

math.OC · 2026-04-26 · unverdicted · novelty 6.0

A nested dynamic program using the Regret-Bellman operator computes regret-optimal policies that interpolate between MDP and robust controllers for finite-state systems.

citing papers explorer

Showing 2 of 2 citing papers.

Optimistic Policy Learning under Pessimistic Adversaries with Regret and Violation Guarantees
cs.LG · 2026-04-15 · unverdicted · none · ref 14
Regret-Optimal Control for Finite-State Systems
math.OC · 2026-04-26 · unverdicted · none · ref 20
