2 Pith papers cite this work.
years: 2026 (2)
representative citing papers
citing papers explorer
- Dissecting Jet-Tagger Through Mechanistic Interpretability
A Particle Transformer jet tagger contains a sparse six-head circuit whose source-relay-readout structure recovers most performance and whose residual stream preferentially encodes 2-prong energy correlators.
- Galactic Amnesia: The Information Washout of the Milky Way Merger History
Mutual information analysis of TNG50 simulations shows gravitational potential and total energy retain merger mass and infall time information longest, while radial velocity loses it within ~5 Gyr, with washout depending on radius, merger age, and mass.