Variational inference of disentangled latent concepts from unlabeled observations
2 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.LG 2
verdicts
UNVERDICTED 2
representative citing papers
Tabular VAEs show ~50% lower causal circuit modularity than image VAEs, with beta-VAE CES collapsing to 0.043 versus 0.133 due to reconstruction degradation, challenging direct transfer of image interpretability techniques.
citing papers explorer
- A Multi-Level Causal Intervention Framework for Mechanistic Interpretability in Variational Autoencoders
  Introduces a causal intervention framework with new metrics for mechanistic interpretability of VAEs and reports empirical findings from extensive experiments on multiple models and datasets.
- Posterior-Calibrated Causal Circuits in Variational Autoencoders: Why Image-Domain Interpretability Fails on Tabular Data
  Tabular VAEs show ~50% lower causal circuit modularity than image VAEs, with beta-VAE CES collapsing to 0.043 versus 0.133 due to reconstruction degradation, challenging direct transfer of image interpretability techniques.