Title resolution pending

Kantamneni, S · 2025 · arXiv 2502.00873

10 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

FLIPS: Instance-Fingerprinting for LLMs via Pseudo-random Sequences

cs.LG · 2026-06-02 · unverdicted · novelty 8.0

FLIPS identifies LLM instances with 96% closed-set and 90% open-set accuracy by exploiting biases in generated binary random sequences across 237 instances.

Structuring Sparsity: Block-Sparse Featurizers Capture Visual Concept Manifolds

cs.CV · 2026-06-23 · unverdicted · novelty 7.0

Block-sparse featurizers recover visual concepts as two- to four-dimensional manifolds and describe activations more compactly than direction-based methods via minimum-description-length comparison.

Riemannian-Manifold Steering: Geometry-Aware Generative Autoencoders for Label-Free Steering

cs.LG · 2026-05-24 · unverdicted · novelty 7.0

A Riemannian geodesic framework for label-free manifold steering in language models via a schema-supervised encoder approximating output Hellinger distance on activations.

Tensor Product Representation Probes Reveal Shared Structure Across Linear Directions

cs.LG · 2026-05-11 · unverdicted · novelty 7.0

Linear probes for Othello board states factor into tensor-product structure with square and color embeddings composed by a binding matrix, from which the linear probes can be directly recovered.

Do Models Read What They Write? Causal Registers in Scratchpad Reasoning

cs.LG · 2026-06-28 · unverdicted · novelty 6.0

State-writing models causally use edited scratchpad states in a controlled task at 80-91% accuracy on held-out examples, unlike final-answer-only and pretrained controls.

Closure-Validated Circuit Discovery in Attention Heads: Co-activation Proposes, Ablation Disposes

cs.LG · 2026-06-08 · unverdicted · novelty 6.0

Co-activation clustering of attention heads proposes candidate circuits that pass causal closure validation in dense 1B models but fail in a Mixture-of-Experts model, where ablation can improve loss.

When and How Long? The Readout-Mediator Angle in Temporal Reasoning

cs.LG · 2026-05-27 · unverdicted · novelty 6.0

Linear probes recover day-of-year from LM activations for temporal reasoning but are orthogonal to the model's causal 4D subspace identified by DAS, with the angle matching the Haar-uniform random null, replicated across scales and families.

Convergent Evolution: How Different Language Models Learn Similar Number Representations

cs.CL · 2026-04-22 · unverdicted · novelty 6.0

Diverse language models converge on similar periodic number features with a two-tier hierarchy of Fourier sparsity and geometric separability, acquired via language co-occurrences or multi-token arithmetic.

Temporal Preference Concepts and their Functions in a Large Language Model

cs.LG · 2026-05-11 · unverdicted · novelty 5.0

Causal localization via attribution and patching identifies a temporal preference subgraph in mid-to-upper layers of Qwen3-4B-Instruct-2507, with time-horizon geometry in the residual stream and initial evidence for steering-vector control.

H-Probes: Extracting Hierarchical Structures From Latent Representations of Language Models

cs.CL · 2026-04-15 · unverdicted · novelty 5.0

H-probes locate low-dimensional subspaces encoding hierarchy in LLM activations for synthetic tree tasks, show causal importance and generalization, and detect weaker signals in mathematical reasoning traces.

citing papers explorer

Showing 7 of 7 citing papers after filters.

FLIPS: Instance-Fingerprinting for LLMs via Pseudo-random Sequences
cs.LG · 2026-06-02 · unverdicted · none · ref 39
FLIPS identifies LLM instances with 96% closed-set and 90% open-set accuracy by exploiting biases in generated binary random sequences across 237 instances.

Riemannian-Manifold Steering: Geometry-Aware Generative Autoencoders for Label-Free Steering
cs.LG · 2026-05-24 · unverdicted · none · ref 5
A Riemannian geodesic framework for label-free manifold steering in language models via a schema-supervised encoder approximating output Hellinger distance on activations.

Tensor Product Representation Probes Reveal Shared Structure Across Linear Directions
cs.LG · 2026-05-11 · unverdicted · none · ref 10
Linear probes for Othello board states factor into tensor-product structure with square and color embeddings composed by a binding matrix, from which the linear probes can be directly recovered.

Do Models Read What They Write? Causal Registers in Scratchpad Reasoning
cs.LG · 2026-06-28 · unverdicted · none · ref 3
State-writing models causally use edited scratchpad states in a controlled task at 80-91% accuracy on held-out examples, unlike final-answer-only and pretrained controls.

Closure-Validated Circuit Discovery in Attention Heads: Co-activation Proposes, Ablation Disposes
cs.LG · 2026-06-08 · unverdicted · none · ref 6
Co-activation clustering of attention heads proposes candidate circuits that pass causal closure validation in dense 1B models but fail in a Mixture-of-Experts model, where ablation can improve loss.

When and How Long? The Readout-Mediator Angle in Temporal Reasoning
cs.LG · 2026-05-27 · unverdicted · none · ref 4
Linear probes recover day-of-year from LM activations for temporal reasoning but are orthogonal to the model's causal 4D subspace identified by DAS, with the angle matching the Haar-uniform random null, replicated across scales and families.

Temporal Preference Concepts and their Functions in a Large Language Model
cs.LG · 2026-05-11 · unverdicted · none · ref 56
Causal localization via attribution and patching identifies a temporal preference subgraph in mid-to-upper layers of Qwen3-4B-Instruct-2507, with time-horizon geometry in the residual stream and initial evidence for steering-vector control.
