Toy models demonstrate that polysemanticity arises when neural networks store more sparse features than neurons via superposition, producing a phase transition tied to polytope geometry and increased adversarial vulnerability.
Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
representative citing papers
PragLocker protects agent prompts as IP by building non-portable obfuscated versions that function only on the intended LLM through code-symbol semantic anchoring followed by target-model feedback noise injection.
citing papers explorer
-
Toy Models of Superposition
Toy models demonstrate that polysemanticity arises when neural networks store more sparse features than neurons via superposition, producing a phase transition tied to polytope geometry and increased adversarial vulnerability.