Supervised Vector Quantized Variational Autoencoder for Learning Interpretable Global Representations

Michael Ding; Xinghua Lu; Yifan Xue

arxiv: 1909.11124 · v2 · pith:CYSSLQF3new · submitted 2019-09-24 · 💻 cs.LG · stat.ML

Yifan Xue, Michael Ding, Xinghua Lu

keywords data, learning, global, class, interpretable, model, representation, representations

Learning interpretable representations of data remains a central challenge in deep learning. When training a deep generative model, the observed data are often associated with categorical labels. Alongside learning to regenerate data and simulate new data, learning an interpretable representation of each class of data is itself a way of acquiring knowledge. Here, we present a novel generative model, the Supervised Vector Quantized Variational AutoEncoder (S-VQ-VAE), which combines supervised and unsupervised learning to obtain a unique, interpretable global representation for each class of data. Compared with conventional generative models, our model has three key advantages: first, it is an integrative model that simultaneously learns a feature representation for individual data points and a global representation for each class of data; second, the learning of global representations with embedding codes is guided by supervised information, which clearly defines the interpretation of each code; and third, the global representations capture crucial characteristics of different classes, revealing similarities and differences in the statistical structures underlying different groups of data. We evaluated the utility of S-VQ-VAE on the MNIST machine learning benchmark dataset and on gene expression data from the Library of Integrated Network-Based Cellular Signatures (LINCS). We showed that S-VQ-VAE was able to learn the global genetic characteristics of samples perturbed by the same class of perturbagen (PCL), and that it further revealed mechanistic correlations between PCLs. Such knowledge is crucial for promoting new drug development for complex diseases like cancer.
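
The idea of one label-guided embedding code per class can be illustrated with a toy sketch. The abstract does not specify the encoder, the loss, or the update rule, so everything below is an assumption made for illustration: the "encoder outputs" are synthetic 2-D points, `update_codebook` is a hypothetical helper that moves each class code toward the mean encoding of its own class, and `nearest_code` is a plain nearest-neighbour lookup. It is not the authors' training procedure.

```python
import numpy as np


def nearest_code(z, codebook):
    """Return, for each encoding in z, the index of the closest code."""
    # Squared Euclidean distance from every encoding to every class code.
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return d.argmin(axis=1)


def update_codebook(z, labels, codebook, lr=0.5):
    """Hypothetical supervised update: pull code k toward class k's encodings."""
    new = codebook.copy()
    for k in range(codebook.shape[0]):
        mask = labels == k
        if mask.any():
            # Step a fraction lr of the way toward the class mean.
            new[k] += lr * (z[mask].mean(axis=0) - new[k])
    return new


rng = np.random.default_rng(0)
# Toy stand-in for encoder outputs: two well-separated classes in 2-D.
z = np.concatenate([
    rng.normal(-3.0, 0.5, size=(50, 2)),
    rng.normal(3.0, 0.5, size=(50, 2)),
])
labels = np.repeat([0, 1], 50)

codebook = np.zeros((2, 2))  # one code per class (assumed layout)
for _ in range(20):
    codebook = update_codebook(z, labels, codebook)

# After the supervised updates, each code acts as a global class representative.
pred = nearest_code(z, codebook)
accuracy = (pred == labels).mean()
print(codebook, accuracy)
```

Because each code is tied to a label during the update, row `k` of `codebook` can be read directly as a summary of class `k`, which is the sense in which supervision "defines the interpretation of each code" in this sketch.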

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Physically Interpretable World Models via Weakly Supervised Representation Learning
cs.LG · 2024-12 · unverdicted · novelty 6.0

PIWM aligns latent states in image-based world models with physical variables and constrains their dynamics to known equations via weak distribution supervision, yielding accurate long-horizon predictions and paramete...