VLM-UnBench demonstrates that prompt-based training-free unlearning in VLMs leaves forget accuracy near the no-instruction baseline except under oracle conditions that reveal the target concept.
hub
Pratiksha Thaker, Yash Maurya, Shengyuan Hu, Zhiwei Steven Wu, and Virginia Smith
12 Pith papers cite this work. Polarity classification is still indexing.
hub tools
representative citing papers
LLM unlearning is reframed as inadvertently installing backdoor triggers on forget-tokens; Random Noise Augmentation is introduced as a defense that improves robustness with theoretical guarantees.
Distinguishable Deletion unifies knowledge erasure and refusal for LLM unlearning via an energy index that enforces boundaries during training and enables refusal at inference.
Targeting minor components in LLM representations during unlearning yields substantially better resistance to relearning attacks than prior methods.
CAP is a reinforcement-learning-driven prompt optimization framework that suppresses target knowledge in LLMs while preserving general capabilities, enabling reversible unlearning without any parameter updates.
REGLU guides LoRA-based unlearning via representation subspaces and orthogonal regularization to outperform prior methods on forget-retain trade-off in LLM benchmarks.
CURaTE performs continual unlearning in LLMs in real time by using sentence embeddings to detect and refuse forget requests without changing model parameters, achieving effective forgetting and perfect knowledge preservation.
Downgrading optimizers to lower-information variants during LLM unlearning yields more robust forgetting on MUSE and WMDP benchmarks by converging to harder-to-perturb loss basins.
Survey identifying technical and supply-chain barriers to GDPR data subject rights in ML, with new framing of 'models in the dark' for downstream opacity.
Runtime-structured task decomposition reduces retry costs in agentic coding systems by up to 51.7% versus monolithic prompts by rerunning only failed subtasks on two software engineering workloads.
Proposes AI-driven simulations for literary-historical experiments and reports preliminary text-generation results claiming the first limited in-distribution outputs matching human novels.
Counterfactual tuning for LLM unlearning induces knowledge conflict and hallucination spillover, diagnosed via the RWKU+ benchmark.
citing papers explorer
-
Runtime-Structured Task Decomposition for Agentic Coding Systems
Runtime-structured task decomposition reduces retry costs in agentic coding systems by up to 51.7% versus monolithic prompts by rerunning only failed subtasks on two software engineering workloads.