pith. machine review for the scientific record.

arXiv: 2411.04832 · v3 · submitted 2024-11-07 · cs.AI · cs.LG

Recognition: unknown

Plasticity Loss in Deep Reinforcement Learning: A Survey

Authors on Pith: no claims yet
classification: cs.AI · cs.LG
keywords: plasticity loss · deep learning · reinforcement · understanding · ability · adapt
Original abstract

Plasticity refers to a network's ability to adapt to changing data distributions, which is crucial for the successful training of deep reinforcement learning agents. Loss of plasticity causes performance plateaus and contributes to scaling failures, overestimation bias, and insufficient exploration. To deepen the understanding of plasticity loss, we propose a unified definition, examine its drivers and pathologies, and organize over 50 mitigation strategies into the first comprehensive taxonomy of the field. Our analysis shows gaps in current evaluation practices and reveals that general regularization techniques often outperform domain-specific interventions. Future research should prioritize understanding the mechanisms underlying plasticity loss.
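As an illustration of the "general regularization techniques" the abstract says often outperform domain-specific interventions, below is a minimal PyTorch sketch of L2 regularization toward the initial weights, one simple regularizer studied as a plasticity-preserving baseline. The function name `l2_init_penalty`, the coefficient, the toy network, and the training step are illustrative assumptions, not the survey's own method.

```python
import torch

def l2_init_penalty(model, init_params, coeff=1e-3):
    """Penalize squared distance from the initial weights; keeping
    parameters near their initialization is one simple, general
    regularizer studied as a plasticity-preserving baseline."""
    penalty = 0.0
    for p, p0 in zip(model.parameters(), init_params):
        penalty = penalty + ((p - p0) ** 2).sum()
    return coeff * penalty

model = torch.nn.Sequential(
    torch.nn.Linear(8, 64), torch.nn.ReLU(), torch.nn.Linear(64, 2)
)
# Frozen snapshot of the parameters at initialization.
init_params = [p.detach().clone() for p in model.parameters()]
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

x, y = torch.randn(32, 8), torch.randn(32, 2)  # stand-in batch
optimizer.zero_grad()
loss = torch.nn.functional.mse_loss(model(x), y) + l2_init_penalty(model, init_params)
loss.backward()
optimizer.step()
```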

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Beyond Single-Model Optimization: Preserving Plasticity in Continual Reinforcement Learning

    cs.LG 2026-04 unverdicted novelty 7.0

    TeLAPA maintains archives of behaviorally diverse yet competent policies aligned in a shared latent space to preserve plasticity and enable faster recovery after interference in continual reinforcement learning.

  2. SPHERE: Mitigating the Loss of Spectral Plasticity in Mixture-of-Experts for Deep Reinforcement Learning

    cs.LG 2026-05 unverdicted novelty 6.0

    SPHERE applies a Parseval penalty, derived from a Neural Tangent Kernel proxy for spectral plasticity, to Mixture-of-Experts policies in continual RL, raising average success rates by 133% on MetaWorld and 50% on HumanoidBench over unregularized MoE baselines (a generic sketch of a Parseval penalty follows this list).

  3. Safe Continual Reinforcement Learning in Non-stationary Environments

    cs.LG 2026-04 unverdicted novelty 6.0

    Safe continual RL methods face a fundamental tension between enforcing safety constraints and preventing catastrophic forgetting in non-stationary environments, with regularization providing only partial mitigation.
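The SPHERE entry above mentions a Parseval penalty; here is a minimal sketch of one common generic form, which penalizes each weight matrix's deviation from row-orthonormality so the layer roughly preserves spectral norm. The name `parseval_penalty`, the coefficient, and the restriction to `Linear` layers are assumptions for illustration; SPHERE's actual NTK-derived formulation is not reproduced here.

```python
import torch

def parseval_penalty(model, coeff=1e-4):
    """Generic Parseval-style penalty: for each Linear weight W,
    penalize ||W W^T - I||_F^2, nudging the rows of W toward an
    orthonormal set so the layer roughly preserves spectral norm."""
    penalty = 0.0
    for module in model.modules():
        if isinstance(module, torch.nn.Linear):
            w = module.weight        # shape (out_features, in_features)
            gram = w @ w.t()         # (out_features, out_features)
            eye = torch.eye(gram.shape[0], device=w.device)
            penalty = penalty + ((gram - eye) ** 2).sum()
    return coeff * penalty

# Added to the task loss during training, e.g.:
#   loss = task_loss + parseval_penalty(policy_network)
```

Such a penalty is architecture-agnostic, which is consistent with the survey's observation that general regularizers transfer well across settings.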