pith. machine review for the scientific record.

Stabilizing off-policy Q-learning via bootstrapping error reduction. arXiv preprint arXiv:1906.00949

5 Pith papers cite this work. Polarity classification is still indexing.

fields

cs.LG: 4 · cs.AI: 1

representative citing papers

Offline Reinforcement Learning with Implicit Q-Learning

cs.LG · 2021-10-12 · unverdicted · novelty 8.0

IQL achieves policy improvement in offline RL by implicitly estimating optimal action values through state-conditional upper expectiles of value functions, without querying Q-functions on out-of-distribution actions.

Behavior Regularized Offline Reinforcement Learning

cs.LG · 2019-11-26 · unverdicted · novelty 6.0

Behavior-regularized actor-critic methods achieve strong offline RL results with simple regularization, rendering many recent technical additions unnecessary.
