{"paper":{"title":"Low-Rank Adapters Initialization via Gradient Surgery for Continual Learning","license":"http://creativecommons.org/licenses/by/4.0/","headline":"SLICE initializes LoRA adapters by projecting current and replay gradients then applying truncated SVD to reduce catastrophic forgetting in continual learning.","cross_cats":[],"primary_cat":"cs.LG","authors_text":"Arthur S. Bianchessi, Christian Mattjie, Joana Pasquali, João Vitor Boer Abitante, Lucas S. Kupssinskü, Otávio Parraga, Rafaela Cappelari Ravazio, Ramiro N. Barros, Rodrigo C. Barros, Vinícius Conte Turani","submitted_at":"2026-05-12T21:06:03Z","abstract_excerpt":"LoRA is widely adopted for continual fine-tuning of Large Language Models due to its parameter efficiency, modularity across tasks, and compatibility with replay strategies. However, LoRA-based continual learning remains vulnerable to catastrophic forgetting, whose severity depends on how successive task gradients interact: when consecutive task gradients conflict, standard adapter initializations channel updates into subspaces that overwrite previously learned directions. We propose SLICE, a gradient-surgery-based initialization for LoRA adapters in continual learning. 
SLICE accumulates gradi"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Compared to vanilla LoRA, LoRA-GA, and LoRAM, SLICE consistently achieves a better stability-plasticity trade-off, improving Average Performance, Final Performance and Forgetting metrics while preserving General Performance and In Context Performance across both standard and adversarial continual learning sequences.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That the projection operator applied to accumulated current-task and replay gradients, followed by truncated SVD, reliably channels updates into subspaces that avoid overwriting previously learned directions without introducing new interference or instability.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"SLICE applies gradient surgery via projection and truncated SVD to initialize LoRA adapters, yielding better stability-plasticity trade-offs on continual learning benchmarks including adversarial task sequences.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"SLICE initializes LoRA adapters by projecting current and replay gradients then applying truncated SVD to reduce catastrophic forgetting in continual learning.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"25e479a3484ec408eccf467ab27a62eba76eb507f14b17d49dc1527c5b4a1d49"},"source":{"id":"2605.12752","kind":"arxiv","version":1},"verdict":{"id":"b6def680-bc89-49a2-a35f-b05c0c8067ae","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-14T21:16:54.222262Z","strongest_claim":"Compared to vanilla LoRA, LoRA-GA, and LoRAM, SLICE consistently achieves a better stability-plasticity trade-off, 
improving Average Performance, Final Performance and Forgetting metrics while preserving General Performance and In Context Performance across both standard and adversarial continual learning sequences.","one_line_summary":"SLICE applies gradient surgery via projection and truncated SVD to initialize LoRA adapters, yielding better stability-plasticity trade-offs on continual learning benchmarks including adversarial task sequences.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That the projection operator applied to accumulated current-task and replay gradients, followed by truncated SVD, reliably channels updates into subspaces that avoid overwriting previously learned directions without introducing new interference or instability.","pith_extraction_headline":"SLICE initializes LoRA adapters by projecting current and replay gradients then applying truncated SVD to reduce catastrophic forgetting in continual learning."},"references":{"count":41,"sample":[{"doi":"10.18653/v1/2022.emnlp-main.340","year":2022,"title":"and Hajishirzi, Hannaneh and Khashabi, Daniel , booktitle =","work_id":"2ff158b5-8900-4efa-8930-d26f842996ef","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2023,"title":"TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models , author=. 
2023 , eprint=","work_id":"1fe4d920-7a63-40b6-adfb-a129c805c079","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"The Thirty-ninth Annual Conference on Neural Information Processing Systems , year=","work_id":"2e0a48e6-689d-4d5b-a9b3-e7d7556c8493","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2025,"title":"Nature Machine Intelligence , volume=","work_id":"9a7cab73-2031-43c6-87c0-7448c501c3cb","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2025,"title":"Chenlong Zhang and Zhuoran Jin and Hongbang Yuan and Jiaheng Wei and Tong Zhou and Kang Liu and Jun Zhao and Yubo Chen , booktitle=. 2025 , url=","work_id":"df6a0467-1c7e-45da-8b1d-08d657a32227","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":41,"snapshot_sha256":"0a9a3bfe92b1e52d14db27c9dff5b60844a0f7dbe5f82546b9643d6b4e64c321","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}