Pratiksha Thaker, Yash Maurya, Shengyuan Hu, Zhiwei Steven Wu, and Virginia Smith

13 Pith papers cite this work. Polarity classification is still indexing.

Citing papers by year: 2026 (11), 2025 (2)

representative citing papers

Improving LLM Unlearning Robustness via Random Perturbations

cs.CL · 2025-01-31 · unverdicted · novelty 7.0

LLM unlearning is reframed as inadvertently installing backdoor triggers on forget-tokens; Random Noise Augmentation is introduced as a defense that improves robustness with theoretical guarantees.

Fast Unlearning at Scale via Margin Self-Correction

cs.LG · 2026-06-01 · unverdicted · novelty 6.0

MASC achieves competitive forget-retain trade-offs in language model unlearning at lower computational cost via margin self-correction and an online stopping criterion on TOFU, MUSE News, and MUSE Books.

CAP: Controllable Alignment Prompting for Unlearning in LLMs

cs.LG · 2026-04-23 · unverdicted · novelty 6.0

CAP is a reinforcement-learning-driven prompt optimization framework that suppresses target knowledge in LLMs while preserving general capabilities, enabling reversible unlearning without any parameter updates.
