Simplifying deep temporal difference learning

Gallici, M · 2024 · arXiv 2407.04811

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: background 1

citation-polarity summary: background 1

representative citing papers

Goal-Conditioned Agents that Learn Everything All at Once

cs.LG · 2026-05-22 · unverdicted · novelty 6.0

LEO enables efficient all-goals learning in goal-conditioned RL by jointly predicting for all goals in one network pass, yielding >250x speedup over relabelling and better performance on Craftax.

FlashSAC: Fast and Stable Off-Policy Reinforcement Learning for High-Dimensional Robot Control

cs.LG · 2026-04-06 · unverdicted · novelty 6.0 · 2 refs

FlashSAC improves training speed and final performance of off-policy RL on high-dimensional robot tasks by reducing update frequency, increasing model scale, and bounding norms to limit critic error accumulation.

Scalable On-Policy Reinforcement Learning via Adaptive Batch Scaling

stat.ML · 2026-05-20 · unverdicted · novelty 5.0

Adaptive Batch Scaling dynamically increases batch size in on-policy RL as policy volatility drops, measured by a new Behavioral Divergence metric, and shows larger networks plus larger batches outperform on ALE with PQN.

A High-Throughput Compute-Efficient POMDP Hide-And-Seek-Engine (HASE) for Multi-Agent Operations

cs.MA · 2026-04-29 · unverdicted · novelty 5.0

A C++ Dec-POMDP simulator using data-oriented design and zero-copy PyTorch integration achieves up to 33 million steps per second on a 16-core CPU, enabling multi-agent policy training in minutes with PPO, DQN, and SAC.

Plasticity Loss in Deep Reinforcement Learning: A Survey

cs.AI · 2024-11-07 · unverdicted · novelty 4.0

Survey unifies the definition of plasticity loss in DRL, taxonomizes over 50 mitigations, identifies evaluation gaps, and finds general regularization often outperforms domain-specific methods.

TABX: A High-Throughput Sandbox Battle Simulator for Multi-Agent Reinforcement Learning

cs.MA · 2026-02-02

citing papers explorer

Goal-Conditioned Agents that Learn Everything All at Once

cs.LG · 2026-05-22 · unverdicted · none · ref 68

FlashSAC: Fast and Stable Off-Policy Reinforcement Learning for High-Dimensional Robot Control

cs.LG · 2026-04-06 · unverdicted · none · ref 18 · 2 links

Scalable On-Policy Reinforcement Learning via Adaptive Batch Scaling

stat.ML · 2026-05-20 · unverdicted · none · ref 7

A High-Throughput Compute-Efficient POMDP Hide-And-Seek-Engine (HASE) for Multi-Agent Operations

cs.MA · 2026-04-29 · unverdicted · none · ref 11

Plasticity Loss in Deep Reinforcement Learning: A Survey

cs.AI · 2024-11-07 · unverdicted · none · ref 34

TABX: A High-Throughput Sandbox Battle Simulator for Multi-Agent Reinforcement Learning

cs.MA · 2026-02-02 · unreviewed · ref 7
