Smith, and Luke Zettlemoyer
3 Pith papers cite this work.
fields
cs.CL 3
citing papers explorer
- Mitigating Catastrophic Forgetting in Target Language Adaptation of LLMs via Source-Shielded Updates
SSU mitigates catastrophic forgetting in low-resource LLM target-language adaptation by scoring and column-wise freezing source-critical parameters, reducing source degradation to ~3% versus ~20% for full fine-tuning while matching target performance.
- Parameter Alignment Mitigates Catastrophic Forgetting in Multilingual Expert Language Models
Parameter alignment strategies substantially reduce forgetting in family-based continual pretraining of multilingual LLMs across 32 languages with minimal impact on language acquisition.
- Modular Monolingual Adaptation using Pretrained Language Models
Replacing tokens, freezing the corresponding embeddings, and tuning the rest of the model improves NLU performance on low-resource languages compared to full fine-tuning.