CUBE: A black-box backdoor defense via clean unlearning
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CR (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Curvature-Guided Module Localization for Low-Rank Detoxification of Backdoored Large Language Models
Curvature-guided localization and low-rank repair detoxify backdoored LLMs by suppressing trigger responses while preserving normal behavior.