LLMs represent semantic relations geometrically via embedding distance and direction; a linear Polar Probe decodes these structures from middle-layer activations and generalizes to new entities.
Proceedings of the 40th International Conference on Machine Learning , articleno =
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
years
2026 2representative citing papers
An attack aligns differently shuffled intermediate activations from secure Transformer inference queries to recover model weights with low error using roughly one dollar of queries.
citing papers explorer
-
Polar probe linearly decodes semantic structures from LLMs
LLMs represent semantic relations geometrically via embedding distance and direction; a linear Polar Probe decodes these structures from middle-layer activations and generalizes to new entities.
-
On the (In-)Security of the Shuffling Defense in the Transformer Secure Inference
An attack aligns differently shuffled intermediate activations from secure Transformer inference queries to recover model weights with low error using roughly one dollar of queries.