pith. machine review for the scientific record.

arxiv: 1706.08500 · v6 · submitted 2017-06-26 · 💻 cs.LG · stat.ML

Recognition: unknown

GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium

Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, Sepp Hochreiter

Authors on Pith: no claims yet
classification: 💻 cs.LG · stat.ML
keywords: gans · ttur · training · convergence · equilibrium · images · inception · learning
0 comments
read the original abstract

Generative Adversarial Networks (GANs) excel at creating realistic images with complex models for which maximum likelihood is infeasible. However, the convergence of GAN training has still not been proved. We propose a two time-scale update rule (TTUR) for training GANs with stochastic gradient descent on arbitrary GAN loss functions. TTUR has an individual learning rate for both the discriminator and the generator. Using the theory of stochastic approximation, we prove that the TTUR converges under mild assumptions to a stationary local Nash equilibrium. The convergence carries over to the popular Adam optimization, for which we prove that it follows the dynamics of a heavy ball with friction and thus prefers flat minima in the objective landscape. For the evaluation of the performance of GANs at image generation, we introduce the "Fréchet Inception Distance" (FID) which captures the similarity of generated images to real ones better than the Inception Score. In experiments, TTUR improves learning for DCGANs and Improved Wasserstein GANs (WGAN-GP) outperforming conventional GAN training on CelebA, CIFAR-10, SVHN, LSUN Bedrooms, and the One Billion Word Benchmark.
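The FID introduced in the abstract is the Fréchet (2-Wasserstein) distance between two Gaussians fitted to Inception activations of real and generated images. A minimal NumPy/SciPy sketch of that computation, assuming the activation statistics are already extracted (function and variable names are illustrative, not taken from the paper's code):

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Fréchet distance between N(mu1, sigma1) and N(mu2, sigma2):
    ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 (sigma1 sigma2)^{1/2})."""
    diff = mu1 - mu2
    # matrix square root of the product of the covariances
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    if np.iscomplexobj(covmean):  # discard tiny imaginary parts from sqrtm
        covmean = covmean.real
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))

# Toy usage: stand-ins for Inception activations of real vs. generated images.
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(1000, 8))
fake = rng.normal(0.5, 1.0, size=(1000, 8))
mu_r, sigma_r = real.mean(axis=0), np.cov(real, rowvar=False)
mu_f, sigma_f = fake.mean(axis=0), np.cov(fake, rowvar=False)
fid = frechet_distance(mu_r, sigma_r, mu_f, sigma_f)
```

TTUR itself needs no extra machinery beyond this: it simply gives the discriminator and the generator separate learning rates (the paper typically trains the discriminator on the faster time scale) in otherwise standard SGD or Adam training.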

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 22 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. iTRIALSPACE: Programmable Virtual Lesion Trials for Controlled Evaluation of Lung CT Models

    cs.CV 2026-05 unverdicted novelty 7.0

    iTRIALSPACE generates realistic virtual lesion trials on lung CTs that isolate performance drivers and show strong transfer of model rankings to real clinical data (ρ=0.93).

  2. Dream-Cubed: Controllable Generative Modeling in Minecraft by Training on Billions of Cubes

    cs.CV 2026-04 unverdicted novelty 7.0

    Dream-Cubed releases a billion-scale voxel dataset and 3D diffusion models that generate controllable Minecraft worlds by operating directly on blocks.

  3. ExpertEdit: Learning Skill-Aware Motion Editing from Expert Videos

    cs.CV 2026-04 unverdicted novelty 7.0

    ExpertEdit edits novice motions to expert skill levels by learning a motion prior from unpaired videos and infilling masked skill-critical spans.

  4. Efficient Unlearning through Maximizing Relearning Convergence Delay

    cs.LG 2026-04 unverdicted novelty 7.0

    The Influence Eliminating Unlearning framework maximizes relearning convergence delay via weight decay and noise injection to remove the influence of a forgetting set while preserving accuracy on retained data.

  5. DiV-INR: Extreme Low-Bitrate Diffusion Video Compression with INR Conditioning

    eess.IV 2026-04 unverdicted novelty 7.0

    DiV-INR integrates implicit neural representations as conditioning signals for diffusion models to achieve better perceptual quality than HEVC, VVC, and prior neural codecs at extremely low bitrates under 0.05 bpp.

  6. Imagen Video: High Definition Video Generation with Diffusion Models

    cs.CV 2022-10 unverdicted novelty 7.0

    Imagen Video generates high-definition text-conditional videos via a cascade of base and super-resolution diffusion models, achieving high fidelity and controllability.

  7. Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding

    cs.CV 2022-05 accept novelty 7.0

    Imagen achieves state-of-the-art photorealistic text-to-image generation by scaling a text-only pretrained T5 language model within a diffusion framework, reaching FID 7.27 on COCO without training on it.

  8. Ensemble Distributionally Robust Bayesian Optimisation

    cs.LG 2026-05 unverdicted novelty 6.0

    A tractable ensemble distributionally robust Bayesian optimization method achieves improved sublinear regret bounds under context uncertainty.

  9. CASCADE: Context-Aware Relaxation for Speculative Image Decoding

    cs.CV 2026-05 unverdicted novelty 6.0

    CASCADE formalizes semantic interchangeability and convergence in target model representations to enable context-aware acceptance relaxation in tree-based speculative decoding, delivering up to 3.6x speedup on text-to...

  10. Defining Robust Ultrasound Quality Metrics via an Ultrasound Foundation Model

    eess.IV 2026-04 unverdicted novelty 6.0

    TinyUSFM-uLPIPS and TinyUSFM-NRQ provide task-linked, cross-organ, and clinically predictive quality assessment for ultrasound images that outperforms conventional metrics in calibration with segmentation performance ...

  11. Evaluating AI-Generated Images of Cultural Artifacts with Community-Informed Rubrics

    cs.CY 2026-04 unverdicted novelty 6.0

    Community members from the UK blind community, Kerala, and Tamil Nadu helped define what counts as culturally appropriate depictions of artifacts, and the authors tested whether those definitions can be turned into re...

  12. MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework

    cs.AI 2023-08 unverdicted novelty 6.0

    MetaGPT embeds human SOPs into LLM prompts to create role-specialized agent teams that produce more coherent solutions on collaborative software engineering tasks than prior chat-based multi-agent systems.

  13. SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis

    cs.CV 2023-07 conditional novelty 6.0

    SDXL improves upon prior Stable Diffusion versions through a larger UNet backbone, dual text encoders, novel conditioning, and a refinement model, producing higher-fidelity images competitive with black-box state-of-t...

  14. Demystifying MMD GANs

    stat.ML 2018-01 accept novelty 6.0

    MMD GANs have unbiased critic gradients but biased generator gradients from sample-based learning, and the Kernel Inception Distance provides a practical new measure for GAN convergence and dynamic learning rate adaptation.

  15. CaloArt: Large-Patch x-Prediction Diffusion Transformers for High-Granularity Calorimeter Shower Generation

    physics.ins-det 2026-05 unverdicted novelty 5.0

    CaloArt achieves top FPD, high-level, and classifier metrics on CaloChallenge datasets 2 and 3 while keeping single-GPU generation at 9-11 ms per shower by combining large-patch tokenization, x-prediction, and conditi...

  16. Stability of the Monge Map in Semi-Dual Optimal Transport

    math.OC 2026-05 unverdicted novelty 5.0

    Semi-dual optimal transport has a degenerate saddle-point structure whose solution is a constrained optimization problem, giving necessary and sufficient conditions for Monge map convergence independent of dual optimality.

  17. LoRaQ: Optimized Low Rank Approximation for 4-bit Quantization

    cs.LG 2026-04 unverdicted novelty 5.0

    LoRaQ enables fully sub-16-bit quantized diffusion models by optimizing low-rank error compensation in a data-free way, outperforming prior methods at equal memory cost on Pixart-Σ and SANA while supporting mixed low-...

  18. UniMesh: Unifying 3D Mesh Understanding and Generation

    cs.CV 2026-04 unverdicted novelty 5.0

    UniMesh unifies 3D mesh generation and understanding in one model via a Mesh Head interface, Chain of Mesh iterative editing, and an Actor-Evaluator self-reflection loop.

  19. Protecting and Preserving Protest Dynamics for Responsible Analysis

    cs.CV 2026-04 unverdicted novelty 5.0

    A responsible computing framework substitutes real protest imagery with labeled synthetic reproductions from conditional image synthesis to enable privacy-aware analysis of collective action patterns.

  20. Generative Texture Diversification of 3D Pedestrians for Robust Autonomous Driving Perception

    cs.CV 2026-05 unverdicted novelty 4.0

    Generative texture synthesis from StyleGAN2 diversifies 3D pedestrian assets from a single base model, improving robustness in 2D object detection while exposing 3D perception models' sensitivity to geometric domain gaps.

  21. Discrete Meanflow Training Curriculum

    cs.LG 2026-04 unverdicted novelty 4.0

    A DMF curriculum initialized from pretrained flow models achieves one-step FID 3.36 on CIFAR-10 after only 2000 epochs by exploiting a discretized consistency property in the Meanflow objective.