FoldSAE: Learning to Steer Protein Folding Through Sparse Representations

Ewa Szczurek; Kamil Deja; Paulina Szymczak; Wojciech Zarzecki

arxiv: 2511.22519 · v2 · pith:TS7A6YPZ · submitted 2025-11-27 · 🧬 q-bio.QM



keywords: protein, representations, rfdiffusion, features, internal, interpretable, control, precise


Abstract

RFdiffusion is a popular and well-established model for generating protein structures. However, its generative process offers limited insight into its internal representations and how they contribute to the final protein structure. Concurrently, recent work in mechanistic interpretability has successfully used Sparse Autoencoders (SAEs) to discover interpretable features within neural networks. We combine these concepts by applying SAEs to the internal representations of RFdiffusion to uncover secondary structure-specific features and establish a relationship between them and the generated protein structures. Building on these insights, we introduce a novel steering mechanism that enables precise control of secondary structure formation through a tunable hyperparameter, while simultaneously revealing interpretable block- and neuron-level representations within RFdiffusion. Our work pioneers a new framework for making RFdiffusion more interpretable, demonstrating how understanding internal features can be directly translated into precise control over the protein design process.
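
The abstract does not spell out the architecture or the steering rule, so the following is only a minimal sketch of the general idea, under stated assumptions: a ReLU sparse autoencoder over a block's activations, and steering done by shifting activations along one feature's decoder direction, scaled by a tunable strength `alpha`. The sizes, the randomly initialised weights, the feature index, and the helper names (`sae_encode`, `sae_decode`, `steer`) are all hypothetical and are not taken from the paper or from the RFdiffusion codebase.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the real activation and dictionary widths are not given here.
d_model, d_sae = 16, 64

# Randomly initialised weights stand in for a trained sparse autoencoder.
W_enc = rng.normal(scale=0.1, size=(d_model, d_sae))
b_enc = np.zeros(d_sae)
W_dec = rng.normal(scale=0.1, size=(d_sae, d_model))
b_dec = np.zeros(d_model)


def sae_encode(x):
    """ReLU encoder: map activations to non-negative feature codes."""
    return np.maximum(0.0, (x - b_dec) @ W_enc + b_enc)


def sae_decode(z):
    """Linear decoder: reconstruct activations from feature codes."""
    return z @ W_dec + b_dec


def steer(x, feature_idx, alpha):
    """Shift activations along one feature's unit decoder direction.

    alpha is the tunable steering strength (assumed form of the hyperparameter).
    """
    direction = W_dec[feature_idx]
    direction = direction / np.linalg.norm(direction)
    return x + alpha * direction


# Stand-in for per-residue activations captured from one model block.
x = rng.normal(size=(5, d_model))
z = sae_encode(x)
x_hat = sae_decode(z)
x_steered = steer(x, feature_idx=3, alpha=2.0)
```

In a setup like this, the steered activations would be written back into the model's forward pass in place of `x`. Which feature to steer, the sign of `alpha`, and the block to intervene on would have to come from the feature analysis itself.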


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

VFUSE: Virulent Feature Understanding with Sparse autoEncoders
cs.LG · 2026-06 · unverdicted · novelty 7.0

VFUSE applies sparse autoencoders to diffusion-transformer activations in RoseTTAFold3 and RFDiffusion3 to find monosemantic features that detect hazardous protein designs with AUROC up to 0.84.