Understanding disentangling in $\beta$-VAE

Alexander Lerchner; Arka Pal; Christopher P. Burgess; Guillaume Desjardins; Irina Higgins; Loic Matthey; Nick Watters

arxiv: 1804.03599 · v1 · pith:S6ZCNJLXnew · submitted 2018-04-10 · stat.ML · cs.AI · cs.LG

Christopher P. Burgess, Irina Higgins, Arka Pal, Loic Matthey, Nick Watters, Guillaume Desjardins, Alexander Lerchner

keywords: beta, training, disentangled, modification, representations, accuracy, aligned, assessments

We present new intuitions and theoretical assessments of the emergence of disentangled representations in variational autoencoders. Taking a rate-distortion theory perspective, we show the circumstances under which representations aligned with the underlying generative factors of variation of data emerge, as training progresses, when optimising the modified ELBO in $\beta$-VAE. From these insights, we propose a modification to the training regime of $\beta$-VAE that progressively increases the information capacity of the latent code during training. This modification facilitates the robust learning of disentangled representations in $\beta$-VAE, without the previous trade-off in reconstruction accuracy.
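
The abstract describes the training modification only in words. One way to picture it is a sketch in Python that assumes a capacity-controlled objective of the form `recon_loss + gamma * |KL - C|`, where the target capacity `C` (in nats) grows linearly during training. The function names and the default values of `gamma`, `c_max`, and `anneal_steps` are illustrative assumptions, not settings quoted on this page.

```python
import math


def capacity_at(step, c_max=25.0, anneal_steps=100_000):
    """Target capacity C in nats, raised linearly from 0 to c_max.

    c_max and anneal_steps are placeholder values chosen for illustration.
    """
    frac = min(max(step / anneal_steps, 0.0), 1.0)
    return c_max * frac


def gaussian_kl(mu, logvar):
    """KL(N(mu, diag(exp(logvar))) || N(0, I)), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, logvar))


def capacity_controlled_loss(recon_loss, kl, step, gamma=1000.0,
                             c_max=25.0, anneal_steps=100_000):
    """Loss to minimise: reconstruction term plus gamma * |KL - C(step)|.

    Early in training C is near 0, so the latent code is pushed to carry
    almost no information; as C grows, more information is allowed through.
    gamma is an assumed penalty weight, not a value taken from this page.
    """
    c = capacity_at(step, c_max, anneal_steps)
    return recon_loss + gamma * abs(kl - c)
```

Under these assumptions the penalty vanishes whenever the encoder's KL term matches the current capacity, so the schedule, rather than a fixed $\beta$, decides how much information the latent code may carry at each stage of training.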

Forward citations

Cited by 20 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Inference-Time Refinement Closes the Synthetic-Real Gap in Tabular Diffusion
cs.LG 2026-05 unverdicted novelty 8.0

Inference-time refinement of pre-trained tabular diffusion models via Bidirectional Chamfer Refinement achieves median 8.6% better downstream performance than real data across 15 benchmarks while preserving fidelity a...

Gradient-Based Program Synthesis with Neurally Interpreted Languages
cs.LG 2026-04 unverdicted novelty 8.0

NLI autonomously discovers a vocabulary of primitive operations and interprets variable-length programs via a neural executor, allowing end-to-end training and gradient-based test-time adaptation that outperforms prio...

StrADiff: A Structured Source-Wise Adaptive Diffusion Framework for Linear and Nonlinear Blind Source Separation
stat.ML 2026-04 unverdicted novelty 7.0

StrADiff recovers latent source trajectories from linear and nonlinear mixtures via source-wise adaptive diffusion and a Gaussian process prior in a single unsupervised end-to-end objective.

Score-based Membership Inference on Diffusion Models
cs.LG 2025-09 unverdicted novelty 7.0

Presents SimA, a score-based single-query membership inference attack for diffusion models and LDMs that uses denoiser output norm to reveal training set proximity and outperforms multi-query baselines on eight datasets.

Posterior Collapse as Automatic Spectral Pruning
cs.LG 2026-05 unverdicted novelty 6.0

Posterior collapse in β-VAEs is derived as automatic spectral pruning via Landau stability analysis, with collapse thresholds matching normalized PCA spectra in the linear Gaussian case and tested on WorldClim data.

Winner-Take-All bottlenecks enforce disentangled symbolic representations in multi-task learning
cs.LG 2026-05 unverdicted novelty 6.0

WTA bottlenecks enforce highly symbolic, disentangled categorical representations of latent factors under defined conditions in multi-task DNNs, shown via theorem and experiments on two datasets.

Vision Foundation Models as Generalist Tokenizers for Image Generation
cs.CV 2026-05 unverdicted novelty 6.0

VFMTok builds a generalist image tokenizer on frozen VFMs using adaptive quantization and semantic alignment, delivering gFID 1.36 for autoregressive and 1.25 for continuous generation on ImageNet with 3x faster convergence.

Unsupervised learning of acquisition variability in structural connectomes via hybrid latent space modeling
cs.LG 2026-05 unverdicted novelty 6.0

A hybrid VAE with architectural annealing learns discrete clusters aligned with scanner and protocol differences in a dataset of 7416 structural connectomes spanning 13 studies.

A renormalization-group inspired lattice-based framework for piecewise generalized linear models
stat.ME 2026-05 unverdicted novelty 6.0

RG-inspired lattice models for piecewise GLMs provide explicit interpretable partitions and a replica-analysis-derived scaling law for regularization that allows increasing complexity without expected rise in generali...

Discovering quantum phenomena with Interpretable Machine Learning
quant-ph 2026-04 unverdicted novelty 6.0

Variational autoencoders combined with symbolic regression extract physically meaningful representations and order parameters from raw quantum measurement data, revealing new phenomena such as corner-ordering in Rydbe...

Cross-Modal Generation: From Commodity WiFi to High-Fidelity mmWave and RFID Sensing
cs.LG 2026-04 unverdicted novelty 6.0

RF-CMG synthesizes high-quality mmWave and RFID signals from WiFi using a diffusion model with Modality-Guided Embedding for high-frequency details and Low-Frequency Modality Consistency to preserve physical structure.

From Unsupervised to Guided Clustering: A Variational Implementation
stat.ME 2026-04 unverdicted novelty 6.0

GCVAE is a variational autoencoder that structures its latent space as a Gaussian mixture and optimizes a variational objective to make the representation maximally informative about a user-chosen guiding variable, en...

Cyclic Adaptive Private Synthesis for Sharing Real-World Data in Education
cs.CY 2026-02 unverdicted novelty 6.0

CAPS provides an iterative differentially private synthesis method that outperforms one-shot baselines on authentic educational real-world data.

Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
cs.LG 2024-03 unverdicted novelty 6.0

Sparse feature circuits are introduced as interpretable causal subnetworks in language models, supporting unsupervised discovery of thousands of circuits and a method called SHIFT to improve classifier generalization ...

Deep Attention Reweighting: Post-Hoc Attention-Based Feature Aggregation in CNNs for Disentangling Core and Spurious Features under Spurious Correlations
cs.CV 2026-05 unverdicted novelty 5.0

DAR replaces GAP with an attention-based aggregation module retrained jointly with the classifier head to disentangle core from spurious features and outperforms DFR on multiple datasets.

To Use AI as Dice of Possibilities with Timing Computation
cs.AI 2026-05 unverdicted novelty 5.0

Proposes verb-based paradigm with timing computation to enable data-driven discovery of patient trajectories and counterfactual timing from EHR data without domain knowledge.

Exploring Time Conditioning in Diffusion Generative Models from Disjoint Noisy Data Manifolds
cs.LG 2026-04 unverdicted novelty 5.0

Aligning the DDIM forward diffusion process with flow-matching manifold evolution enables high-quality generation without time conditioning, and class-conditional synthesis is possible with an unconditional denoiser b...

A Systematic Framework for Tabular Data Disentanglement
cs.LG 2026-04 unverdicted novelty 5.0

A systematic framework modularizes tabular data disentanglement into data extraction, modeling, analysis, and latent extrapolation, with a case study on synthetic data generation.

Approximately Equivariant Recurrent Generative Models for Quasi-Periodic Time Series with a Progressive Training Scheme
cs.LG 2025-05 unverdicted novelty 5.0

AEQ-RVAE-ST combines approximate equivariance and progressive sequence lengthening in a recurrent VAE to match or exceed prior generative models on quasi-periodic time series benchmarks.

Learning Disentangled Representations for Generalized Multi-view Clustering
cs.CV 2026-05 unverdicted novelty 4.0

GMAE learns disentangled view-specific and view-common embeddings via dual-path autoencoders and cross-view adversarial training to boost performance on complete and incomplete multi-view clustering tasks.