
Minimax regret bounds for reinforcement learning

1 Pith paper cites this work. Polarity classification is still indexing.


fields

cs.LG 1

years

2026 1

verdicts

UNVERDICTED 1

representative citing papers

A Measure-Theoretic Finite-Sample Theory for Adaptive-Data Fitted Q-Iteration

cs.LG · 2026-05-07 · unverdicted · novelty 8.0

The authors derive a finite-sample adaptive-data performance bound for FQI by chaining measure-theoretic probability with Bellman contractions and prove the first cumulative pathwise online regret guarantee in continuous spaces using sequential Rademacher complexity.

citing papers explorer

Showing 1 of 1 citing paper.

  • A Measure-Theoretic Finite-Sample Theory for Adaptive-Data Fitted Q-Iteration · cs.LG · 2026-05-07 · unverdicted · none · ref 3
