VFUSE applies sparse autoencoders to diffusion-transformer activations in RoseTTAFold3 and RFDiffusion3 to find monosemantic features that detect hazardous protein designs with AUROC up to 0.84.
Aaron Maiwald, Piotr Jedryszek, Florent Draye, Garrett M
2 Pith papers cite this work.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers
- VFUSE: Virulent Feature Understanding with Sparse autoEncoders
VFUSE applies sparse autoencoders to diffusion-transformer activations in RoseTTAFold3 and RFDiffusion3 to find monosemantic features that detect hazardous protein designs with AUROC up to 0.84.
- Retrieval and competition: how a protein foundation model starts a protein
ESM2 predicts N-terminal methionine via retrieval of a positional prior from the BOS token through distributed attention circuits rather than direct recognition, revealed by a norm-direction decomposition of rotary attention scores.