Jiancheng Yang, Rui Shi, and Bingbing Ni
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
background 1
citation-polarity summary
fields
cs.LG 2
roles
background 1
polarities
background 1
representative citing papers
LiLAW learns to weight samples as easy, moderate or hard using three global scalars updated by one gradient step on a validation batch to improve noisy training performance.
citing papers explorer
- IDQL: Implicit Q-Learning as an Actor-Critic Method with Diffusion Policies
  IDQL generalizes IQL into an actor-critic framework and uses diffusion policies for robust policy extraction, outperforming prior offline RL methods.
- LiLAW: Lightweight Learnable Adaptive Weighting to Learn Sample Difficulty & Improve Noisy Training
  LiLAW learns to weight samples as easy, moderate or hard using three global scalars updated by one gradient step on a validation batch to improve noisy training performance.