Exploratory optimal stopping: A singular control formulation

Jodi Dianetti, Giorgio Ferrari, Renyuan Xu · 2024 · arXiv 2408.09335

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Deterministic Policy Gradient for Learning Equilibrium in Time-Inconsistent Control Problems

q-fin.CP · 2026-06-10 · unverdicted · novelty 7.0

A two-stage actor-critic RL algorithm learns deterministic equilibrium policies for general time-inconsistent control problems by combining DPG on an auxiliary time-consistent problem with fixed-point iteration on auxiliary functions.

Equilibrium under Time-Inconsistency: A New Existence Theory by Vanishing Entropy Regularization

math.OC · 2026-03-11 · unverdicted · novelty 7.0

Solutions to the regularized exploratory equilibrium HJB equation converge in suitable norms to a strong solution of the original EHJB as the entropy parameter vanishes, yielding existence of equilibria without conventional stringent regularity assumptions.

Randomized Optimal Switching Problem and Related Mirror Descent Flow

math.OC · 2026-06-11 · unverdicted · novelty 6.0

Proves that the regularized value solves an elliptic HJB system with a Gibbs policy, approximates the classical optimum with O(λ log(1/λ)) error, and shows that the mirror descent flow converges at rate O(1/(e^{λs}-1) + λ log(1/λ)) or O(log s / sqrt(s)).

A Two-fold Randomization Framework for Impulse Control Problems

math.OC · 2025-09-15

citing papers explorer

Equilibrium under Time-Inconsistency: A New Existence Theory by Vanishing Entropy Regularization

math.OC · 2026-03-11 · unverdicted · none · ref 6

Solutions to the regularized exploratory equilibrium HJB equation converge in suitable norms to a strong solution of the original EHJB as the entropy parameter vanishes, yielding existence of equilibria without conventional stringent regularity assumptions.

Randomized Optimal Switching Problem and Related Mirror Descent Flow

math.OC · 2026-06-11 · unverdicted · none · ref 8

Proves that the regularized value solves an elliptic HJB system with a Gibbs policy, approximates the classical optimum with O(λ log(1/λ)) error, and shows that the mirror descent flow converges at rate O(1/(e^{λs}-1) + λ log(1/λ)) or O(log s / sqrt(s)).

A Two-fold Randomization Framework for Impulse Control Problems

math.OC · 2025-09-15 · unreviewed · ref 17
