Fixing a Broken ELBO

Alexander A. Alemi; Ben Poole; Ian Fischer; Joshua V. Dillon; Kevin Murphy; Rif A. Saurous

arxiv: 1711.00464 · v3 · pith:YGMVSLAQnew · submitted 2017-11-01 · 💻 cs.LG · stat.ML

classification 💻 cs.LG stat.ML

keywords latent · models · elbo · bounds · demonstrate · derive · evidence · framework

Recent work in unsupervised representation learning has focused on learning deep directed latent-variable models. Fitting these models by maximizing the marginal likelihood or evidence is typically intractable, thus a common approximation is to maximize the evidence lower bound (ELBO) instead. However, maximum likelihood training (whether exact or approximate) does not necessarily result in a good latent representation, as we demonstrate both theoretically and empirically. In particular, we derive variational lower and upper bounds on the mutual information between the input and the latent variable, and use these bounds to derive a rate-distortion curve that characterizes the tradeoff between compression and reconstruction accuracy. Using this framework, we demonstrate that there is a family of models with identical ELBO, but different quantitative and qualitative characteristics. Our framework also suggests a simple new method to ensure that latent variable models with powerful stochastic decoders do not ignore their latent code.
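The claim that models with identical ELBO can behave very differently can be illustrated with a small numeric sketch. It assumes the usual decomposition of the ELBO into a distortion term D (negative reconstruction log-likelihood) and a rate term R (KL divergence from the encoder to the prior), so that ELBO = -(D + R). The function names, the two example models, and their (D, R) values are hypothetical, and the weighted objective D + beta * R is shown only as one plausible way to pick a point on the rate-distortion curve, not necessarily the authors' exact procedure.

```python
import math


def gaussian_rate(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)) in nats: the rate R for one input."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))


def elbo(distortion, rate):
    """ELBO written as the negative sum of distortion D and rate R."""
    return -(distortion + rate)


def weighted_objective(distortion, rate, beta):
    """Hypothetical selection rule D + beta * R (lower is better)."""
    return distortion + beta * rate


# Made-up (D, R) values in nats for two hypothetical models.
ignores_latent = {"distortion": 80.0, "rate": 0.0}  # decoder does all the work
uses_latent = {"distortion": 5.0, "rate": 75.0}     # latent code carries the input

# Both models reach the same ELBO even though one never uses its latent code.
same_elbo = elbo(**ignores_latent) == elbo(**uses_latent)


def preferred(beta):
    """Name of the model with the lower weighted objective at this beta."""
    a = weighted_objective(**uses_latent, beta=beta)
    b = weighted_objective(**ignores_latent, beta=beta)
    return "uses_latent" if a < b else "ignores_latent"


if __name__ == "__main__":
    print("identical ELBO:", same_elbo)
    for beta in (0.5, 2.0):
        print(f"beta={beta}: prefers {preferred(beta)}")
```

In this sketch a weight below 1 on the rate favors the model that encodes information in the latent, while a weight above 1 favors the model that ignores it. The ELBO alone cannot tell the two apart.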

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Posterior Collapse as Automatic Spectral Pruning
cs.LG · 2026-05 · unverdicted · novelty 6.0
Posterior collapse in β-VAEs is derived as automatic spectral pruning via Landau stability analysis, with collapse thresholds matching normalized PCA spectra in the linear Gaussian case and tested on WorldClim data.

Taming Audio VAEs via Target-KL Regularization
cs.SD · 2026-05 · unverdicted · novelty 6.0
The paper introduces target-KL regularization to train audio VAEs at specific bitrates, enabling rate-distortion curves and comparison to discrete audio codecs for improved text-to-sound generation.

Shaping Belief States with Generative Environment Models for RL
cs.LG · 2019-06 · unverdicted · novelty 5.0
Multi-step predictive generative models form stable belief states capturing environment layout and agent pose, yielding higher data efficiency on RL tasks than model-free agents.

Query-based Deep Improvisation
cs.SD · 2019-06 · unverdicted · novelty 3.0
A VAE trained on one music style is queried with different-style input and uses rate-distortion noise to produce blended output with longer-term structure.