Beyond the rainbow: High performance deep reinforcement learning on a desktop pc.arXiv preprint arXiv:2411.03820

Clark, T · 2024 · arXiv 2411.03820

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

representative citing papers

Scalable Reinforcement Learning via Adaptive Batch Scaling

stat.ML · 2026-05-20 · unverdicted · novelty 7.0 · 2 refs

ABS uses Behavioral Divergence to adaptively scale batch sizes in RL according to policy volatility, enabling effective large-batch large-network training on ALE benchmarks.

Distributional Value Estimation Without Target Networks for Robust Quality-Diversity

cs.LG · 2026-04-22 · unverdicted · novelty 5.0

QDHUAC is a distributional, target-free QD-RL method that enables stable high-UTD training and competitive performance on Brax locomotion tasks using far fewer environment steps than prior approaches.

citing papers explorer

Showing 2 of 2 citing papers after filters.

Scalable Reinforcement Learning via Adaptive Batch Scaling stat.ML · 2026-05-20 · unverdicted · none · ref 5 · 2 links
ABS uses Behavioral Divergence to adaptively scale batch sizes in RL according to policy volatility, enabling effective large-batch large-network training on ALE benchmarks.
Distributional Value Estimation Without Target Networks for Robust Quality-Diversity cs.LG · 2026-04-22 · unverdicted · none · ref 8
QDHUAC is a distributional, target-free QD-RL method that enables stable high-UTD training and competitive performance on Brax locomotion tasks using far fewer environment steps than prior approaches.

Beyond the rainbow: High performance deep reinforcement learning on a desktop pc.arXiv preprint arXiv:2411.03820

fields

years

verdicts

representative citing papers

citing papers explorer