Progressive prompts: Continual learning for language models · 2023 · arXiv preprint arXiv:2301.12314

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 1 · dataset: 1

citation-polarity summary

background: 1 · use dataset: 1

representative citing papers

FOREVER: Forgetting Curve-Inspired Memory Replay for Language Model Continual Learning

cs.LG · 2026-01-07 · conditional · novelty 7.0

FOREVER aligns replay intervals in LLM continual learning with a model-centric time based on optimizer update magnitudes and an Ebbinghaus-inspired forgetting curve to reduce catastrophic forgetting.

Muon-OGD: Muon-based Spectral Orthogonal Gradient Projection for LLM Continual Learning

cs.LG · 2026-05-09 · unverdicted · novelty 6.0 · 2 refs

Muon-OGD introduces a spectral-norm constrained orthogonal projection method solved via dual iterations and Newton-Schulz approximations to improve stability-plasticity trade-off in sequential LLM adaptation.

Sparse Subspace-to-Expert Sharing for Task-Agnostic Continual Learning

cs.LG · 2026-06-05 · unverdicted · novelty 5.0

SETA decomposes parameters into task-specific and shared sparse experts with adaptive anchoring and routing regularization to improve retention and backward transfer in LLM continual learning.

A Comprehensive Overview of Large Language Models

cs.CL · 2023-07-12 · unverdicted · novelty 2.0

A survey paper providing an overview of Large Language Models, their background, and recent advances in the field.

Little by Little: Continual Learning via Incremental Mixture of Rank-1 Associative Memory Experts

cs.LG · 2025-06-26

citing papers explorer

Showing 5 of 5 citing papers.

FOREVER: Forgetting Curve-Inspired Memory Replay for Language Model Continual Learning · cs.LG · 2026-01-07 · conditional · none · ref 9
Muon-OGD: Muon-based Spectral Orthogonal Gradient Projection for LLM Continual Learning · cs.LG · 2026-05-09 · unverdicted · none · ref 24 · 2 links
Sparse Subspace-to-Expert Sharing for Task-Agnostic Continual Learning · cs.LG · 2026-06-05 · unverdicted · none · ref 22
A Comprehensive Overview of Large Language Models · cs.CL · 2023-07-12 · unverdicted · none · ref 252
Little by Little: Continual Learning via Incremental Mixture of Rank-1 Associative Memory Experts · cs.LG · 2025-06-26 · unreviewed · ref 62
