arXiv preprint arXiv:2505.22586

Precise In-Parameter Concept Erasure in Large Language Models · arXiv 2505.22586

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

GKnow: Measuring the Entanglement of Gender Bias and Factual Gender

cs.CL · 2026-05-12 · unverdicted · novelty 7.0

Gender bias and factual gender knowledge are severely entangled in language model circuits and neurons, making neuron ablation an unreliable method for debiasing.

A framework for analyzing concept representations in neural models

cs.CL · 2026-05-02 · unverdicted · novelty 7.0

A new framework shows concept subspaces are not unique, estimator choice affects containment and disentanglement, LEACE works well but generalizes poorly, and HuBERT encodes phone info as contained and disentangled from speaker info while speaker info resists compact containment.

citing papers explorer

Showing 2 of 2 citing papers.

GKnow: Measuring the Entanglement of Gender Bias and Factual Gender · cs.CL · 2026-05-12 · unverdicted · none · ref 51
A framework for analyzing concept representations in neural models · cs.CL · 2026-05-02 · unverdicted · none · ref 67
