A plug-and-play KL regularizer that masks the target token and renormalizes probabilities to improve the learning-forgetting trade-off in LoRA adaptation of LLMs.
Decoupled knowledge distillation
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
citation-role summary
background 1
citation-polarity summary
fields
cs.CL 2years
2026 2verdicts
UNVERDICTED 2roles
background 1polarities
background 1representative citing papers
ADAM uses personality-guided LLM augmentation and cross-lingual attention distillation to raise balanced accuracy on multilingual personality recognition to 0.6332 on Essays and 0.7448 on Kaggle, outperforming standard BCE loss.
citing papers explorer
-
Mask the Target: A Plug-and-Play Regularizer Against LoRA Forgetting
A plug-and-play KL regularizer that masks the target token and renormalizes probabilities to improve the learning-forgetting trade-off in LoRA adaptation of LLMs.
-
Cross-Lingual Attention Distillation with Personality-Informed Generative Augmentation for Multilingual Personality Recognition
ADAM uses personality-guided LLM augmentation and cross-lingual attention distillation to raise balanced accuracy on multilingual personality recognition to 0.6332 on Essays and 0.7448 on Kaggle, outperforming standard BCE loss.