On Tiny Episodic Memories in Continual Learning
Chaudhry et al.
11 Pith papers cite this work.
abstract
In continual learning (CL), an agent learns from a stream of tasks, leveraging prior experience to transfer knowledge to future tasks. It is an ideal framework for decreasing the amount of supervision required by existing learning algorithms. But for successful knowledge transfer, the learner needs to remember how to perform previous tasks. One way to endow the learner with the ability to perform tasks seen in the past is to store a small memory, dubbed episodic memory, that holds a few examples from previous tasks, and then to replay these examples when training on future tasks. In this work, we empirically analyze the effectiveness of a very small episodic memory in a CL setup where each training example is seen only once. Surprisingly, across four rather different supervised learning benchmarks adapted to CL, a very simple baseline that jointly trains on examples from the current task as well as examples stored in the episodic memory significantly outperforms specifically designed CL approaches with and without episodic memory. Interestingly, we find that repetitive training on even tiny memories of past tasks does not harm generalization; on the contrary, it improves it, with gains between 7% and 17% when the memory is populated with a single example per class.
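The baseline the abstract describes is experience replay (ER): at every step, the learner takes a single joint gradient step on the incoming batch and a small batch drawn from the episodic memory. Below is a minimal PyTorch-style sketch of that loop. It is an illustration, not the paper's exact implementation: the reservoir write rule is one of several the paper studies (a per-class ring buffer is another), and the names EpisodicMemory, train_step, and replay_batch are ours.

```python
import random
import torch
import torch.nn.functional as F

class EpisodicMemory:
    """Tiny replay buffer; reservoir sampling is one common write rule."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.examples = []   # list of (x, y) tensors
        self.seen = 0        # number of stream examples observed so far

    def add(self, x, y):
        self.seen += 1
        if len(self.examples) < self.capacity:
            self.examples.append((x, y))
        else:
            # Reservoir sampling: every example seen so far is kept
            # with equal probability capacity / seen.
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.examples[j] = (x, y)

    def sample(self, k):
        batch = random.sample(self.examples, min(k, len(self.examples)))
        xs, ys = zip(*batch)
        return torch.stack(xs), torch.stack(ys)

def train_step(model, optimizer, x_cur, y_cur, memory, replay_batch=10):
    """One ER update: joint gradient step on current plus replayed data."""
    x, y = x_cur, y_cur
    if memory.examples:
        x_mem, y_mem = memory.sample(replay_batch)
        x = torch.cat([x_cur, x_mem])
        y = torch.cat([y_cur, y_mem])
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    optimizer.step()
    # Write the current examples to memory after the update.
    for xi, yi in zip(x_cur, y_cur):
        memory.add(xi, yi)
    return loss.item()
```

In the single-pass stream the abstract assumes, train_step is called once per incoming batch; setting the capacity to one example per class corresponds to the tiny-memory regime where the paper reports 7% to 17% gains.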
citing papers
- LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning
  LIBERO is a new benchmark for lifelong robot learning that evaluates transfer of declarative, procedural, and mixed knowledge across 130 manipulation tasks with provided demonstration data.
- KAN-CL: Per-Knot Importance Regularization for Continual Learning with Kolmogorov-Arnold Networks
  KAN-CL cuts catastrophic forgetting by 88-93% on Split-CIFAR-10/5T and Split-CIFAR-100/10T by anchoring KAN parameters at per-knot granularity while matching baseline accuracy.
- Online Continual Learning with Dynamic Label Hierarchies
  HALO improves online continual learning under evolving label hierarchies by adaptively combining classification heads regularized with organized learnable prototypes for better adaptation and reduced forgetting.
- MIST: Reliable Streaming Decision Trees for Online Class-Incremental Learning via McDiarmid Bound
  MIST fixes unreliable splits in streaming decision trees for class-incremental learning by using a K-independent McDiarmid bound on Gini impurity, Bayesian moment projection for knowledge transfer, and KLL quantile sketches for adaptive leaf predictions.
- Continual Learning for fMRI-Based Brain Disorder Diagnosis via Functional Connectivity Matrices Generative Replay
  A structure-aware VAE generates realistic FC matrices for replay, combined with multi-level knowledge distillation and hierarchical contextual bandit sampling, to enable continual fMRI-based brain disorder diagnosis across sequentially arriving multi-site data without catastrophic forgetting.
- Direct Discrepancy Replay: Distribution-Discrepancy Condensation and Manifold-Consistent Replay for Continual Face Forgery Detection
  A replay method for continual face forgery detection condenses real-fake distribution discrepancies into compact maps and synthesizes compatible samples from current real faces to reduce forgetting under tight memory budgets without storing historical images.
- Continual Fine-Tuning of Large Language Models via Program Memory
  ProCL organizes LoRA adapters into input-conditioned program memory slots that combine with a distributed adapter to improve retention and reduce forgetting in continual LLM fine-tuning.
- Critical Patch-Aware Sparse Prompting with Decoupled Training for Continual Learning on the Edge
  CPS-Prompt delivers 1.6x gains in peak memory, training time, and energy on edge hardware for continual learning while staying within 2% accuracy of top prompt-based baselines.
- HEDP: A Hybrid Energy-Distance Prompt-based Framework for Domain Incremental Learning
  HEDP uses energy regularization inspired by Helmholtz free energy plus hybrid energy-distance weighting in prompts to improve domain selection and achieve a 2.57% accuracy gain on benchmarks like CORe50 while mitigating catastrophic forgetting.
- CoMemNet: Contrastive Sampling with Memory Replay Network for Continual Traffic Prediction
  CoMemNet is a dual-branch continual learning model for dynamic traffic networks that combines contrastive sampling via Wasserstein features and memory replay to achieve SOTA performance while mitigating forgetting.
- Face-D²CL: Multi-Domain Synergistic Representation with Dual Continual Learning for Facial DeepFake Detection
  Face-D²CL fuses spatial and frequency features and uses dual continual learning to reduce forgetting while adapting to new DeepFakes, cutting average error rates by 60.7% and raising unseen-domain AUC by 7.9% over prior SOTA.