Linear representations of high-level concepts in LLMs are formalized via counterfactuals in input and output spaces, unified under a causal inner product that enables consistent probing and steering.
hub
Language models represent space and time
15 Pith papers cite this work. Polarity classification is still indexing.
hub tools
citation-role summary
citation-polarity summary
roles
background 4representative citing papers
Bridge augments a graph neural network backbone with time-aware retrieval from a memory of region-time windows to improve cold-start and cross-city urban delivery demand forecasting.
Transformer activations show spectral anti-concentration for concepts in the tail while syntax prefers high-variance directions, forming a dual geometry.
Large language models encode relational bindings via a cell-based representation: a low-dimensional linear subspace in which each cell corresponds to an entity-relation index pair and attributes are retrieved from the matching cell.
Subword tokenization impairs phonological knowledge encoding in LMs, but an IPA-based fine-tuning method restores it with minimal impact on other capabilities.
Sparse autoencoders scaled to 34 million features on Claude 3 Sonnet yield interpretable, steerable representations of concrete and abstract concepts that generalize across languages and modalities.
PerturbedVAE disentangles perturbation-specific signals from invariant gene expression structure to recover causal representations and improve out-of-distribution prediction in single-cell perturbation modeling.
A latent mediation framework with sparse autoencoders enables non-additive token-level influence attribution in LLMs by learning orthogonal features and back-propagating attributions.
Fluent AI users adopt an active, iterative collaboration mode that produces more visible failures but better recovery and success on hard tasks, whereas novices experience more invisible failures from passive use.
The Master Key Hypothesis states that capabilities are low-dimensional directions transferable across models through linear subspace alignment, with UNLOCK demonstrating gains such as 12.1% accuracy improvement on MATH when transferring CoT from 14B to 7B models.
Reasoning in LLMs produces a transient geometric pulse in which concept manifolds untangle into linearly separable subspaces immediately before computation and compress afterward.
H-probes locate low-dimensional subspaces encoding hierarchy in LLM activations for synthetic tree tasks, show causal importance and generalization, and detect weaker signals in mathematical reasoning traces.
Analysis estimates 18.7% of Common Crawl documents contain geospatial information like coordinates and addresses, with little difference by language.
A TPSR-based framework with four LLM roles integrates language model reasoning into industrial automation via digital twins, achieving high task executability in case studies.
Perceptual geometry for color, pitch, emotion and taste emerges transiently in intermediate layers of transformer LLMs despite purely textual training.
citing papers explorer
-
The Linear Representation Hypothesis and the Geometry of Large Language Models
Linear representations of high-level concepts in LLMs are formalized via counterfactuals in input and output spaces, unified under a causal inner product that enables consistent probing and steering.
-
Bridge: Retrieval-Augmented Spatiotemporal Modeling for Urban Delivery Demand
Bridge augments a graph neural network backbone with time-aware retrieval from a memory of region-time windows to improve cold-start and cross-city urban delivery demand forecasting.
-
Concepts Whisper While Syntax Shouts: Spectral Anti-Concentration and the Dual Geometry of Transformer Representations
Transformer activations show spectral anti-concentration for concepts in the tail while syntax prefers high-variance directions, forming a dual geometry.
-
Cell-Based Representation of Relational Binding in Language Models
Large language models encode relational bindings via a cell-based representation: a low-dimensional linear subspace in which each cell corresponds to an entity-relation index pair and attributes are retrieved from the matching cell.
-
How Tokenization Limits Phonological Knowledge Representation in Language Models and How to Improve Them
Subword tokenization impairs phonological knowledge encoding in LMs, but an IPA-based fine-tuning method restores it with minimal impact on other capabilities.
-
Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet
Sparse autoencoders scaled to 34 million features on Claude 3 Sonnet yield interpretable, steerable representations of concrete and abstract concepts that generalize across languages and modalities.
-
What Makes a Representation Good for Single-Cell Perturbation Prediction?
PerturbedVAE disentangles perturbation-specific signals from invariant gene expression structure to recover causal representations and improve out-of-distribution prediction in single-cell perturbation modeling.
-
Correcting Influence: Unboxing LLM Outputs with Orthogonal Latent Spaces
A latent mediation framework with sparse autoencoders enables non-additive token-level influence attribution in LLMs by learning orthogonal features and back-propagating attributions.
-
A paradox of AI fluency
Fluent AI users adopt an active, iterative collaboration mode that produces more visible failures but better recovery and success on hard tasks, whereas novices experience more invisible failures from passive use.
-
The Master Key Hypothesis: Unlocking Cross-Model Capability Transfer via Linear Subspace Alignment
The Master Key Hypothesis states that capabilities are low-dimensional directions transferable across models through linear subspace alignment, with UNLOCK demonstrating gains such as 12.1% accuracy improvement on MATH when transferring CoT from 14B to 7B models.
-
Emergent Manifold Separability during Reasoning in Large Language Models
Reasoning in LLMs produces a transient geometric pulse in which concept manifolds untangle into linearly separable subspaces immediately before computation and compress afterward.
-
H-Probes: Extracting Hierarchical Structures From Latent Representations of Language Models
H-probes locate low-dimensional subspaces encoding hierarchy in LLM activations for synthetic tree tasks, show causal importance and generalization, and detect weaker signals in mathematical reasoning traces.
-
Quantifying Geospatial in the Common Crawl Corpus
Analysis estimates 18.7% of Common Crawl documents contain geospatial information like coordinates and addresses, with little difference by language.
-
Integrating Large Language Model Agents with Digital Twins for Industrial Autonomous Systems
A TPSR-based framework with four LLM roles integrates language model reasoning into industrial automation via digital twins, achieving high task executability in case studies.
-
Geometry of Human Perceptual Domains Emerges Transiently in LLM Representations
Perceptual geometry for color, pitch, emotion and taste emerges transiently in intermediate layers of transformer LLMs despite purely textual training.