A Riemannian geodesic framework for label-free manifold steering in language models via a schema-supervised encoder approximating output Hellinger distance on activations.
mlr.press/v119/kalatzis20a.html
6 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 6verdicts
UNVERDICTED 6roles
background 1polarities
background 1representative citing papers
Linear probes for Othello board states factor into tensor-product structure with square and color embeddings composed by a binding matrix, from which the linear probes can be directly recovered.
State-writing models causally use edited scratchpad states in a controlled task at 80-91% accuracy on held-out examples, unlike final-answer-only and pretrained controls.
Linear probes recover day-of-year from LM activations for temporal reasoning but are orthogonal to the model's causal 4D subspace identified by DAS, with the angle matching the Haar-uniform random null, replicated across scales and families.
Diverse language models converge on similar periodic number features with a two-tier hierarchy of Fourier sparsity and geometric separability, acquired via language co-occurrences or multi-token arithmetic.
H-probes locate low-dimensional subspaces encoding hierarchy in LLM activations for synthetic tree tasks, show causal importance and generalization, and detect weaker signals in mathematical reasoning traces.
citing papers explorer
-
Riemannian-Manifold Steering: Geometry-Aware Generative Autoencoders for Label-Free Steering
A Riemannian geodesic framework for label-free manifold steering in language models via a schema-supervised encoder approximating output Hellinger distance on activations.
-
Tensor Product Representation Probes Reveal Shared Structure Across Linear Directions
Linear probes for Othello board states factor into tensor-product structure with square and color embeddings composed by a binding matrix, from which the linear probes can be directly recovered.
-
Do Models Read What They Write? Causal Registers in Scratchpad Reasoning
State-writing models causally use edited scratchpad states in a controlled task at 80-91% accuracy on held-out examples, unlike final-answer-only and pretrained controls.
-
When and How Long? The Readout-Mediator Angle in Temporal Reasoning
Linear probes recover day-of-year from LM activations for temporal reasoning but are orthogonal to the model's causal 4D subspace identified by DAS, with the angle matching the Haar-uniform random null, replicated across scales and families.
-
Convergent Evolution: How Different Language Models Learn Similar Number Representations
Diverse language models converge on similar periodic number features with a two-tier hierarchy of Fourier sparsity and geometric separability, acquired via language co-occurrences or multi-token arithmetic.
-
H-Probes: Extracting Hierarchical Structures From Latent Representations of Language Models
H-probes locate low-dimensional subspaces encoding hierarchy in LLM activations for synthetic tree tasks, show causal importance and generalization, and detect weaker signals in mathematical reasoning traces.