A note on the evaluation of generative models
12 Pith papers cite this work.
abstract
Probabilistic generative models can be used for compression, denoising, inpainting, texture synthesis, semi-supervised learning, unsupervised feature learning, and other tasks. Given this wide range of applications, it is not surprising that a lot of heterogeneity exists in the way these models are formulated, trained, and evaluated. As a consequence, direct comparison between models is often difficult. This article reviews mostly known but often underappreciated properties relating to the evaluation and interpretation of generative models with a focus on image models. In particular, we show that three of the currently most commonly used criteria---average log-likelihood, Parzen window estimates, and visual fidelity of samples---are largely independent of each other when the data is high-dimensional. Good performance with respect to one criterion therefore need not imply good performance with respect to the other criteria. Our results show that extrapolation from one criterion to another is not warranted and generative models need to be evaluated directly with respect to the application(s) they were intended for. In addition, we provide examples demonstrating that Parzen window estimates should generally be avoided.
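Since the abstract's argument turns on what a Parzen window estimate actually computes, here is a minimal sketch of the standard construction, assuming an isotropic Gaussian kernel fit to model samples; the function name and interface are illustrative, not from the paper.

```python
import numpy as np
from scipy.special import logsumexp

def parzen_log_likelihood(test_x, samples, sigma):
    """Average log-likelihood of test points under a Gaussian Parzen
    window (kernel density estimate) fit to samples from a model.

    test_x:  (n, d) array of held-out data points
    samples: (m, d) array of samples drawn from the generative model
    sigma:   kernel bandwidth, typically tuned on a validation set
    """
    n, d = test_x.shape
    m = samples.shape[0]
    # Squared distance between every test point and every model sample: (n, m)
    sq_dists = ((test_x[:, None, :] - samples[None, :, :]) ** 2).sum(-1)
    # log N(x; s_i, sigma^2 I) for every pair, then average the mixture
    log_kernel = -sq_dists / (2 * sigma**2) - 0.5 * d * np.log(2 * np.pi * sigma**2)
    return (logsumexp(log_kernel, axis=1) - np.log(m)).mean()
```

In high dimensions the resulting number is dominated by the bandwidth and the number of samples rather than by the model being evaluated, which is one reason the paper argues such estimates should generally be avoided.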
citing papers
- Density estimation using Real NVP
  Real NVP uses affine coupling layers to create invertible transformations that support exact density estimation, sampling, and latent inference without approximations (see the coupling-layer sketch after this list).
- Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
  DCGANs with architectural constraints learn a hierarchy of representations, from object parts to scenes, in both generator and discriminator across image datasets.
- Deep Unsupervised Learning using Nonequilibrium Thermodynamics
  A forward diffusion process iteratively adds noise to data until all structure is destroyed, and a neural network learns the reverse process to generate new samples from the original distribution (see the forward-process sketch after this list).
- Bayesian Rain Field Reconstruction using Commercial Microwave Links and Diffusion Model Priors
  Diffusion model priors enable training-free Bayesian sampling that reconstructs rain fields from path-integrated commercial microwave link measurements more accurately than Gaussian process baselines.
- Large Scale GAN Training for High Fidelity Natural Image Synthesis
  BigGANs achieve state-of-the-art class-conditional synthesis on ImageNet 128×128, with an Inception Score of 166.5 and an FID of 7.4, by scaling GANs and applying orthogonal regularization plus truncation (see the truncation sketch after this list).
- Mix, Don't Tune: Bilingual Pre-Training Outperforms Hyperparameter Search in Data-Constrained Settings
  Mixing auxiliary high-resource language data outperforms hyperparameter tuning in data-constrained bilingual pre-training, with gains equivalent to 2-13 times more unique target data.
- Coupling Models for One-Step Discrete Generation
  Coupling Models enable single-step discrete sequence generation via learned couplings to Gaussian latents and outperform prior one-step baselines on text perplexity, biological FBD, and image FID metrics.
- Learning to Theorize the World from Observation
  NEO induces compositional latent programs as world theories from observations and executes them to enable explanation-driven generalization.
- GazeVaLM: A Multi-Observer Eye-Tracking Benchmark for Evaluating Clinical Realism in AI-Generated X-Rays
  GazeVaLM provides 960 gaze recordings from 16 radiologists on 60 chest X-rays (half synthetic), plus LLM predictions, for diagnostic accuracy and real-fake detection under matched conditions.
- Generative Frontiers: Why Evaluation Matters for Diffusion Language Models
  Generative perplexity and entropy are shown to be the two additive components of KL divergence to a reference distribution, motivating generative frontiers as a principled evaluation method for diffusion language models (see the decomposition after this list).
- Supersampling Stable Diffusion and More: An Approach for Interpolating Neural Networks Using Common Interpolation Methods
  Kernel interpolation with a constant scaling factor enables Stable Diffusion to produce higher-resolution images without training, and extends to general neural networks with small accuracy drops.
- Synthesizing real-world distributions from high-dimensional Gaussian Noise with Fully Connected Neural Network
  A fully connected neural network with a randomized loss synthesizes real-world tabular data distributions from Gaussian noise faster than state-of-the-art deep generative models.
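For the Real NVP entry, here is a minimal NumPy sketch of a single affine coupling layer, assuming toy tanh/linear conditioners in place of the paper's deep networks; all names and dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, half = 6, 3
# Placeholder conditioners: Real NVP uses deep nets here; fixed random
# linear maps are enough to demonstrate exact invertibility.
Ws = 0.1 * rng.normal(size=(half, half))
Wt = 0.1 * rng.normal(size=(half, half))
s = lambda x1: np.tanh(x1 @ Ws)  # log-scale, bounded for stability
t = lambda x1: x1 @ Wt           # translation

def forward(x):
    x1, x2 = x[:half], x[half:]
    y2 = x2 * np.exp(s(x1)) + t(x1)  # affine map of x2, conditioned on x1
    log_det = s(x1).sum()            # exact log |det Jacobian|, no matrix inversion
    return np.concatenate([x1, y2]), log_det

def inverse(y):
    y1, y2 = y[:half], y[half:]
    x2 = (y2 - t(y1)) * np.exp(-s(y1))  # closed-form inverse
    return np.concatenate([y1, x2])

x = rng.normal(size=d)
y, log_det = forward(x)
assert np.allclose(inverse(y), x)  # invertibility is exact
```

Because the Jacobian is triangular, its log-determinant is just the sum of the predicted log-scales, which is what makes exact density evaluation cheap.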
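For the nonequilibrium-thermodynamics entry, here is a minimal sketch of the Gaussian forward (noising) process; the linear schedule and its constants are assumptions for illustration, and the learned reverse process is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000
betas = np.linspace(1e-4, 0.02, T)   # assumed linear noise schedule
alpha_bar = np.cumprod(1.0 - betas)  # cumulative signal retention

def q_sample(x0, t):
    """Draw x_t from the forward process q(x_t | x_0): a Gaussian that
    mixes ever more noise into the data as t grows."""
    eps = rng.normal(size=x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

x0 = rng.normal(size=8)
x_mid, x_end = q_sample(x0, T // 2), q_sample(x0, T - 1)
print(np.sqrt(alpha_bar[[0, T // 2, T - 1]]))  # remaining signal fraction per step
```

By the final step almost no signal remains, so generation can start from pure noise and invert the chain with the learned reverse model.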
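For the BigGAN entry, the truncation trick replaces the standard normal latent with a truncated normal at sampling time; BigGAN describes this as resampling out-of-range components, sketched below with an illustrative latent dimension and threshold.

```python
import numpy as np

rng = np.random.default_rng(0)

def truncated_z(dim, tau):
    """Latent from a standard normal truncated to [-tau, tau], obtained
    by resampling any component whose magnitude exceeds the threshold."""
    z = rng.normal(size=dim)
    while True:
        mask = np.abs(z) > tau
        if not mask.any():
            return z
        z[mask] = rng.normal(size=mask.sum())

z = truncated_z(128, tau=0.5)  # smaller tau: higher fidelity, less variety
```

The threshold lets a single trained model trade sample fidelity against variety at test time, without retraining.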
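For the Generative Frontiers entry, the tagline is consistent with the standard identity below, stated for a model q and a reference distribution p_ref; the paper's exact definitions and sign conventions may differ.

```latex
\mathrm{KL}(q \,\|\, p_{\mathrm{ref}})
  = \underbrace{\mathbb{E}_{x \sim q}\!\left[-\log p_{\mathrm{ref}}(x)\right]}_{\text{log generative perplexity}}
  \;-\; \underbrace{H(q)}_{\text{model entropy}}
```

The first term is the cross-entropy of model samples under the reference (the log of generative perplexity), so low perplexity alone can be bought by collapsing entropy, which is presumably why the paper evaluates both quantities jointly as a frontier.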