Dissecting Jet-Tagger Through Mechanistic Interpretability

A Particle Transformer jet tagger contains a sparse six-head circuit whose source-relay-readout structure recovers most of the tagger's performance, and whose residual stream preferentially encodes 2-prong energy correlators.
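For context on the readout claim: the 2-point energy correlation function, ECF2 = sum_{i<j} pT_i * pT_j * (DeltaR_ij)^beta, is the standard 2-prong substructure observable. Below is a minimal sketch in Python, assuming per-particle (pT, eta, phi) inputs; the function name, interface, and beta value are illustrative assumptions, not details from the paper.

```python
import numpy as np

def ecf2(pt, eta, phi, beta=1.0):
    """2-point energy correlation function of a jet:
    ECF2 = sum_{i<j} pT_i * pT_j * (DeltaR_ij)^beta,
    a standard 2-prong substructure observable."""
    pt, eta, phi = map(np.asarray, (pt, eta, phi))
    total = 0.0
    for i in range(len(pt)):
        for j in range(i + 1, len(pt)):
            dphi = np.abs(phi[i] - phi[j])
            dphi = min(dphi, 2 * np.pi - dphi)    # wrap azimuthal angle
            dr = np.hypot(eta[i] - eta[j], dphi)  # DeltaR in (eta, phi)
            total += pt[i] * pt[j] * dr ** beta
    return total

# Toy jet with two hard prongs and one soft particle
print(ecf2(pt=[120.0, 95.0, 6.0], eta=[0.10, -0.05, 0.30], phi=[1.0, 1.4, 0.8]))
```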
8 representative citing papers (2026):
Indirect elicitation via triplet comparisons recovers meaningful association structures from LLMs and supports conservative causal candidate links across prompted subpopulations (a toy elicitation sketch follows this list).
ACT*ONOMY is a Grounded-Theory-derived hierarchical taxonomy and open repository that enables systematic comparison and characterization of autonomous agent behavior across trajectories.
Pretrained base models exhibit higher yield to peer disagreement than RLHF instruct variants, with the effect localized to mid-layer attention and mitigated by structured dissent rather than prompt defenses.
A latent mediation framework with sparse autoencoders enables non-additive token-level influence attribution in LLMs by learning orthogonal features and back-propagating attributions (a sketch of the orthogonality idea follows this list).
In Dyck-language transformers, attention patterns causally use top-of-stack information while residual-stream depth and distance signals are decodable yet causally inert.
Co-locating tests with implementation code yields substantially higher preservation and correctness in foundation-model-generated programs than keeping test syntax separate from the implementation.
STEAR reduces spatial and temporal hallucinations in Video-LLMs via layer-aware evidence intervention from middle decoder layers in a single-encode pass.
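For the triplet-comparison entry above, a minimal sketch of indirect elicitation, assuming each query asks the model "is A more similar to B or to C?"; the `judge` function is a toy character-overlap stand-in for the LLM call, and aggregation into a win-count matrix is an illustrative choice, not the paper's method.

```python
from itertools import combinations
import numpy as np

def judge(anchor: str, left: str, right: str) -> str:
    """Hypothetical stand-in for an LLM call answering the triplet
    question 'Is ANCHOR more similar to LEFT or to RIGHT?'.
    Here: a toy heuristic based on shared characters."""
    overlap = lambda a, b: len(set(a) & set(b))
    return left if overlap(anchor, left) >= overlap(anchor, right) else right

def triplet_similarity(items):
    """Aggregate triplet judgments into a symmetric win-count matrix:
    sim[a, b] counts how often b was chosen as closer to a."""
    idx = {w: i for i, w in enumerate(items)}
    sim = np.zeros((len(items), len(items)))
    for anchor in items:
        for left, right in combinations([w for w in items if w != anchor], 2):
            winner = judge(anchor, left, right)
            sim[idx[anchor], idx[winner]] += 1
    return (sim + sim.T) / 2  # symmetrize

print(triplet_similarity(["cat", "car", "cart", "dog"]))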
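For the latent-mediation entry above, a sketch of one way to push sparse-autoencoder features toward orthogonality, assuming an L1 sparsity term plus a penalty on off-diagonal entries of the decoder's Gram matrix; the architecture and penalty weights are assumptions, not the paper's design.

```python
import torch
import torch.nn as nn

class OrthSAE(nn.Module):
    """Sparse autoencoder whose decoder directions are regularized
    toward orthogonality, so each latent can act as a distinct mediator."""
    def __init__(self, d_model=256, d_latent=1024):
        super().__init__()
        self.enc = nn.Linear(d_model, d_latent)
        self.dec = nn.Linear(d_latent, d_model, bias=False)

    def forward(self, x):
        z = torch.relu(self.enc(x))  # sparse nonnegative codes
        return self.dec(z), z

def loss(model, x, l1=1e-3, ortho=1e-2):
    x_hat, z = model(x)
    recon = (x - x_hat).pow(2).mean()
    sparsity = z.abs().mean()
    W = nn.functional.normalize(model.dec.weight, dim=0)  # columns = unit latent directions
    gram = W.T @ W
    off_diag = gram - torch.diag(torch.diag(gram))        # zero out the diagonal
    return recon + l1 * sparsity + ortho * off_diag.pow(2).mean()

model = OrthSAE()
print(loss(model, torch.randn(32, 256)).item())
```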