Introduces Latent Adversarial Robustification and Rank-Constrained Subspace Learning to enable robust generalization in multimodal knowledge editing through adversarial subspace alignment.
Evaluating the ripple effects of knowledge editing in language models
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 3roles
background 1polarities
background 1representative citing papers
A latent mediation framework with sparse autoencoders enables non-additive token-level influence attribution in LLMs by learning orthogonal features and back-propagating attributions.
Knowledge conflicts in hypernetwork LLM adaptation stem from constant adapter margins losing to frequency-dependent pretrained margins; selective layer boosting and conflict-aware triggering raise deep-conflict accuracy to 71-72.5% on Gemma-2B and Mistral-7B.
citing papers explorer
-
Beyond Binary Edits Robust Multimodal Knowledge Editing with Adversarial Subspace Alignment
Introduces Latent Adversarial Robustification and Rank-Constrained Subspace Learning to enable robust generalization in multimodal knowledge editing through adversarial subspace alignment.
-
Correcting Influence: Unboxing LLM Outputs with Orthogonal Latent Spaces
A latent mediation framework with sparse autoencoders enables non-additive token-level influence attribution in LLMs by learning orthogonal features and back-propagating attributions.
-
The Override Gap: A Magnitude Account of Knowledge Conflict Failure in Hypernetwork-Based Instant LLM Adaptation
Knowledge conflicts in hypernetwork LLM adaptation stem from constant adapter margins losing to frequency-dependent pretrained margins; selective layer boosting and conflict-aware triggering raise deep-conflict accuracy to 71-72.5% on Gemma-2B and Mistral-7B.