A Style-Based Generator Architecture for Generative Adversarial Networks
We propose an alternative generator architecture for generative adversarial networks, borrowing from style transfer literature. The new architecture leads to an automatically learned, unsupervised separation of high-level attributes (e.g., pose and identity when trained on human faces) and stochastic variation in the generated images (e.g., freckles, hair), and it enables intuitive, scale-specific control of the synthesis. The new generator improves the state-of-the-art in terms of traditional distribution quality metrics, leads to demonstrably better interpolation properties, and also better disentangles the latent factors of variation. To quantify interpolation quality and disentanglement, we propose two new, automated methods that are applicable to any generator architecture. Finally, we introduce a new, highly varied and high-quality dataset of human faces.
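The scale-specific control described in the abstract comes from injecting a mapped latent code into each generator layer via adaptive instance normalization (AdaIN): each feature map is normalized per channel and then re-scaled and re-shifted by style-derived coefficients. A minimal NumPy sketch of the AdaIN operation follows; the function name, shapes, and random stand-ins for the style coefficients are illustrative, not taken from the paper's implementation (which uses learned affine transforms of the mapped latent `w`):

```python
import numpy as np

def adain(x, style_scale, style_bias, eps=1e-8):
    """Adaptive instance normalization.

    Normalizes each feature map of x per channel (instance norm),
    then modulates it with a style-derived scale and bias.
    x: (channels, height, width); style_scale, style_bias: (channels,)."""
    mean = x.mean(axis=(1, 2), keepdims=True)
    std = x.std(axis=(1, 2), keepdims=True)
    normalized = (x - mean) / (std + eps)
    return style_scale[:, None, None] * normalized + style_bias[:, None, None]

# Hypothetical usage: in the paper, a mapping network sends latent z to w,
# and per-layer learned affine transforms of w produce (scale, bias).
# Here we just use random stand-ins to exercise the operation.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8, 8))       # stand-in feature maps
scale = rng.normal(size=4) + 1.0     # stand-in style scales
bias = rng.normal(size=4)            # stand-in style biases
out = adain(x, scale, bias)
```

Because the normalization zeroes each channel's mean and unit-scales its variance before the style is applied, the style coefficients fully determine each channel's output statistics at that layer, which is what gives the architecture its per-scale control.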
Forward citations
Cited by 10 Pith papers
- What Time Is It? How Data Geometry Makes Time Conditioning Optional for Flow Matching
  Data geometry makes time identifiable from noisy interpolants at rate O(1/sqrt(d-k)), rendering the time-blindness gap asymptotically negligible relative to coupling variance.
- GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
  A 3.5-billion-parameter diffusion model with classifier-free guidance generates images preferred over DALL-E by human raters and can be fine-tuned for text-guided inpainting.
- Diffusion Models Beat GANs on Image Synthesis
  Diffusion models with architecture improvements and classifier guidance achieve superior FID scores to GANs on unconditional and conditional ImageNet image synthesis.
- What Matters for Diffusion-Friendly Latent Manifold? Prior-Aligned Autoencoders for Latent Diffusion
  Prior-Aligned AutoEncoders shape latent manifolds with spatial coherence, local continuity, and global semantics to improve latent diffusion, achieving SOTA gFID 1.03 on ImageNet 256x256 with up to 13x faster convergence.
- LatRef-Diff: Latent and Reference-Guided Diffusion for Facial Attribute Editing and Style Manipulation
  LatRef-Diff replaces semantic directions in diffusion models with latent and reference-guided style codes, uses a hierarchical style modulation module, and applies forward-backward consistency training to achieve stat...
- Deepfake Detection Generalization with Diffusion Noise
  ANL uses diffusion noise prediction and attention to regularize deepfake detectors for better generalization to unseen synthesis methods without added inference cost.
- Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling
  Visual generation models are evolving from passive renderers to interactive agentic world modelers, but current systems lack spatial reasoning, temporal consistency, and causal understanding, with evaluations overemph...
- AttDiff-GAN: A Hybrid Diffusion-GAN Framework for Facial Attribute Editing
  AttDiff-GAN decouples attribute manipulation via feature-level adversarial learning and guides diffusion generation with the edited features, plus PriorMapper and RefineExtractor modules, to achieve more accurate edit...
- A Wasserstein GAN-based climate scenario generator for risk management and insurance: the case of soil subsidence
  A conditional Wasserstein GAN generates plausible future SWI drought trajectories for French insurance risk management under climate change.
- SAGE-GAN: Towards Realistic and Robust Segmentation of Spatially Ordered Nanoparticles via Attention-Guided GANs
  SAGE-GAN integrates a self-attention U-Net into a CycleGAN framework to generate realistic synthetic electron microscopy image-mask pairs that augment training data for nanoparticle segmentation without human labeling.