Manifold steering along activation geometry induces behavioral trajectories matching the natural manifold of outputs, while linear steering produces off-manifold unnatural behaviors.
Dynamics of transient structure in in-context linear regression transformers
4 Pith papers cite this work. Polarity classification is still indexing.
years
2026 4verdicts
UNVERDICTED 4representative citing papers
In a controlled synthetic setting, transformers implement in-distribution task inference via convex combinations of task vectors and out-of-distribution inference via nearly orthogonal extrapolative representations.
Functional attribution with influence functions detects anomalous mechanisms in neural networks, achieving SOTA backdoor detection (average DER 0.93) on vision benchmarks and improvements on LLMs.
A survey synthesizing representative advances, common themes, and open problems in high-dimensional statistics while pointing to key entry-point works.
citing papers explorer
-
Manifold Steering Reveals the Shared Geometry of Neural Network Representation and Behavior
Manifold steering along activation geometry induces behavioral trajectories matching the natural manifold of outputs, while linear steering produces off-manifold unnatural behaviors.
-
Task Vector Geometry Underlies Dual Modes of Task Inference in Transformers
In a controlled synthetic setting, transformers implement in-distribution task inference via convex combinations of task vectors and out-of-distribution inference via nearly orthogonal extrapolative representations.
-
Mechanistic Anomaly Detection via Functional Attribution
Functional attribution with influence functions detects anomalous mechanisms in neural networks, achieving SOTA backdoor detection (average DER 0.93) on vision benchmarks and improvements on LLMs.
-
High-Dimensional Statistics: Reflections on Progress and Open Problems
A survey synthesizing representative advances, common themes, and open problems in high-dimensional statistics while pointing to key entry-point works.