Overcoming catastrophic forgetting in neural networks.Proceedings of the national academy of sciences, 114(13):3521–3526

James Kirkpatrick, Razvan Pascanu, Neil Rabinowitz, Joel Veness, Guillaume Desjardins, Andrei A Rusu, Kieran Milan, John Quan, Tiago Ramalho, Agnieszka Grabska-Barwinska, et al · 2017

5 Pith papers cite this work. Polarity classification is still indexing.

5 Pith papers citing it

browse 5 citing papers

representative citing papers

Characterizing and Correcting Effective Target Shift in Online Learning

stat.ML · 2026-05-08 · unverdicted · novelty 7.0

Online kernel regression equals offline regression with shifted targets; correcting the targets lets online learning match offline performance and outperform true targets in continual image classification.

Geometry Conflict: Explaining and Controlling Forgetting in LLM Continual Post-Training

cs.LG · 2026-05-10 · unverdicted · novelty 6.0

Forgetting in LLM continual post-training is a geometry conflict between task-induced covariance structures and the evolving model state, controlled by gating Wasserstein barycenter merging on measured conflict.

HoReN: Normalized Hopfield Retrieval for Large-Scale Sequential Model Editing

cs.LG · 2026-05-02 · unverdicted · novelty 6.0

HoReN achieves stable sequential editing of 50K facts in LLMs by combining a normalized Hopfield codebook with angular retrieval and attractor dynamics.

FLAME: Adaptive Mixture-of-Experts for Continual Multimodal Multi-Task Learning

cs.LG · 2026-05-10 · unverdicted · novelty 5.0

FLAME is an MoE architecture using modality-specific routers and low-rank compression of expert knowledge to support efficient continual multimodal multi-task learning while reducing catastrophic forgetting.

Muon-OGD: Muon-based Spectral Orthogonal Gradient Projection for LLM Continual Learning

cs.LG · 2026-05-09 · unverdicted · novelty 5.0

Muon-OGD integrates Muon-style spectral-norm geometry with orthogonal gradient constraints to improve the stability-plasticity trade-off during sequential LLM adaptation.

citing papers explorer

Showing 5 of 5 citing papers.

Characterizing and Correcting Effective Target Shift in Online Learning stat.ML · 2026-05-08 · unverdicted · none · ref 10
Online kernel regression equals offline regression with shifted targets; correcting the targets lets online learning match offline performance and outperform true targets in continual image classification.
Geometry Conflict: Explaining and Controlling Forgetting in LLM Continual Post-Training cs.LG · 2026-05-10 · unverdicted · none · ref 54
Forgetting in LLM continual post-training is a geometry conflict between task-induced covariance structures and the evolving model state, controlled by gating Wasserstein barycenter merging on measured conflict.
HoReN: Normalized Hopfield Retrieval for Large-Scale Sequential Model Editing cs.LG · 2026-05-02 · unverdicted · none · ref 12
HoReN achieves stable sequential editing of 50K facts in LLMs by combining a normalized Hopfield codebook with angular retrieval and attractor dynamics.
FLAME: Adaptive Mixture-of-Experts for Continual Multimodal Multi-Task Learning cs.LG · 2026-05-10 · unverdicted · none · ref 32
FLAME is an MoE architecture using modality-specific routers and low-rank compression of expert knowledge to support efficient continual multimodal multi-task learning while reducing catastrophic forgetting.
Muon-OGD: Muon-based Spectral Orthogonal Gradient Projection for LLM Continual Learning cs.LG · 2026-05-09 · unverdicted · none · ref 1
Muon-OGD integrates Muon-style spectral-norm geometry with orthogonal gradient constraints to improve the stability-plasticity trade-off during sequential LLM adaptation.

Overcoming catastrophic forgetting in neural networks.Proceedings of the national academy of sciences, 114(13):3521–3526

fields

years

verdicts

representative citing papers

citing papers explorer