Continual Learning with Deep Generative Replay
Attempts to train a comprehensive artificial intelligence capable of solving multiple tasks have been impeded by a chronic problem called catastrophic forgetting. Although simply replaying all previous data alleviates the problem, it requires large memory and, even worse, is often infeasible in real-world applications where access to past data is limited. Inspired by the generative nature of the hippocampus as a short-term memory system in the primate brain, we propose Deep Generative Replay, a novel framework with a cooperative dual-model architecture consisting of a deep generative model ("generator") and a task-solving model ("solver"). With only these two models, training data for previous tasks can easily be sampled and interleaved with those for a new task. We test our methods in several sequential learning settings involving image classification tasks.
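The interleaving step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `mix_replay_batch`, the `replay_ratio` parameter, and the choice to label generated inputs with the previous solver are assumptions made here to show the idea of mixing generated samples from earlier tasks with real samples from the new task.

```python
import random


def mix_replay_batch(new_batch, old_generator, old_solver, replay_ratio=0.5, rng=None):
    """Interleave real new-task examples with generated replays of earlier tasks.

    new_batch     -- list of (input, label) pairs from the current task
    old_generator -- hypothetical callable returning one generated input
    old_solver    -- hypothetical callable mapping an input to a label
    replay_ratio  -- assumed fraction of the mixed batch made of replayed pairs
    """
    if not 0.0 <= replay_ratio < 1.0:
        raise ValueError("replay_ratio must be in [0, 1)")
    rng = rng or random.Random(0)

    # Choose the replay count so replays form roughly replay_ratio of the mix.
    n_replay = round(len(new_batch) * replay_ratio / (1.0 - replay_ratio))

    # Sample inputs from the previous generator and label them with the
    # previous solver (an assumption of this sketch).
    replayed = []
    for _ in range(n_replay):
        x = old_generator()
        replayed.append((x, old_solver(x)))

    # Shuffle so that old and new examples are interleaved within the batch.
    mixed = list(new_batch) + replayed
    rng.shuffle(mixed)
    return mixed


# Toy usage with stand-in models: a constant "generator" and "solver".
new_batch = [((1.0,), "new")] * 4
mixed = mix_replay_batch(new_batch, lambda: (0.0,), lambda x: "old")
```

In this sketch, a new solver would then be trained on `mixed`, and a new generator on the inputs of `mixed`, so that neither model needs access to the stored data of earlier tasks.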
Forward citations
Cited by 3 Pith papers
- World Action Models Enable Continual Imitation Learning with Recurrent Generative Replays
  REGEN uses recurrent generative replays from World Action Models to cut catastrophic forgetting by up to 50% in continual imitation learning compared to sequential fine-tuning.
- Forgetting in Language Models: Capacity, Optimization, and Self-Generated Replay
  Self-generated replay from language models nearly eliminates catastrophic forgetting during finetuning except when models are pretrained close to saturation.
- Attention to task structure for cognitive flexibility
  Task connectivity in graph-structured multi-task environments enhances generalization and stability, with stronger benefits for attention models than MLPs.