A Style-Based Generator Architecture for Generative Adversarial Networks

Karras, T · 2018 · cs.NE · arXiv 1812.04948

Baseline reference. 50% of citing Pith papers use this work as a benchmark or comparison.

25 Pith papers citing it

abstract

We propose an alternative generator architecture for generative adversarial networks, borrowing from style transfer literature. The new architecture leads to an automatically learned, unsupervised separation of high-level attributes (e.g., pose and identity when trained on human faces) and stochastic variation in the generated images (e.g., freckles, hair), and it enables intuitive, scale-specific control of the synthesis. The new generator improves the state-of-the-art in terms of traditional distribution quality metrics, leads to demonstrably better interpolation properties, and also better disentangles the latent factors of variation. To quantify interpolation quality and disentanglement, we propose two new, automated methods that are applicable to any generator architecture. Finally, we introduce a new, highly varied and high-quality dataset of human faces.

citation-role summary

background 3 · dataset 2 · baseline 1

citation-polarity summary

background 2 · use dataset 2 · baseline 1 · unclear 1

representative citing papers

Denoising Diffusion Implicit Models

cs.LG · 2020-10-06 · unverdicted · novelty 8.0

DDIMs construct non-Markovian diffusion processes that share DDPM training objectives but allow much faster reverse sampling, demonstrated empirically at 10-50x wall-clock speedup.

What Time Is It? How Data Geometry Makes Time Conditioning Optional for Flow Matching

cs.LG · 2026-05-08 · unverdicted · novelty 8.0

Data geometry makes time identifiable from noisy interpolants at rate O(1/sqrt(d-k)), rendering the time-blindness gap asymptotically negligible relative to coupling variance.

Seeking the Unfamiliar but Memorable: Conceptual Creativity as Meta-Learning

cs.LG · 2026-05-15 · unverdicted · novelty 7.0

Creativity is defined as meta-learning where a frozen diffusion creator optimizes candidates for rapid improvement by an adapting appraiser such as an autoencoder or CLIP adapter.

Deep Learning for CMB Foreground Removal and Beam Deconvolution: A U-Net GAN Approach

astro-ph.IM · 2025-08-29 · unverdicted · novelty 7.0

A U-Net GAN reconstructs CMB T and E maps from Planck-like simulations with foregrounds and systematics, achieving under 1% error outside the Galactic region and demonstrating first-time correction for non-circular beams and asymmetric scans.

GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models

cs.CV · 2021-12-20 · accept · novelty 7.0

A 3.5-billion-parameter diffusion model with classifier-free guidance generates images preferred over DALL-E by human raters and can be fine-tuned for text-guided inpainting.

Diffusion Models Beat GANs on Image Synthesis

cs.LG · 2021-05-11 · accept · novelty 7.0

Diffusion models with architecture improvements and classifier guidance achieve superior FID scores to GANs on unconditional and conditional ImageNet image synthesis.

Diagnosing and Improving Diffusion Models by Estimating the Optimal Loss Value

cs.LG · 2025-06-16 · conditional · novelty 6.0

Derives closed-form optimal loss for unified diffusion models, provides variance-controlled estimators, and shows improved diagnosis, training schedules, and power-law scaling after subtracting the optimal value.

Generative Modeling by Estimating Gradients of the Data Distribution

cs.LG · 2019-07-12 · unverdicted · novelty 6.0

Score-based generative modeling via multi-noise-level score matching and annealed Langevin dynamics produces samples on par with GANs and sets a new inception score record on CIFAR-10.

Hiding Faces in Plain Sight: Disrupting AI Face Synthesis with Adversarial Perturbations

cs.CV · 2019-06-21 · unverdicted · novelty 6.0

Adversarial perturbations disrupt DNN-based face detectors under white-box, gray-box, and black-box settings to sabotage training data for AI face synthesis.

Adversarial Learning for Improved Onsets and Frames Music Transcription

cs.SD · 2019-06-20 · unverdicted · novelty 6.0

Adversarial training on time-frequency representations yields consistent gains in frame-level and note-level accuracy over the Onsets and Frames baseline for automatic music transcription.

Multiple-Identity Image Attacks Against Face-based Identity Verification

cs.CV · 2019-06-20 · unverdicted · novelty 6.0

The paper shows that multiple-identity image attacks succeed due to modest angular separation between matching (~90°) and non-matching (40-60°) face representations, with image morphing and representation inversion realizing effective attacks that transfer across comparators.

What Matters for Diffusion-Friendly Latent Manifold? Prior-Aligned Autoencoders for Latent Diffusion

cs.CV · 2026-05-08 · unverdicted · novelty 6.0

Prior-Aligned AutoEncoders shape latent manifolds with spatial coherence, local continuity, and global semantics to improve latent diffusion, achieving SOTA gFID 1.03 on ImageNet 256x256 with up to 13x faster convergence.

LatRef-Diff: Latent and Reference-Guided Diffusion for Facial Attribute Editing and Style Manipulation

cs.CV · 2026-04-23 · unverdicted · novelty 6.0

LatRef-Diff replaces semantic directions in diffusion models with latent and reference-guided style codes, uses a hierarchical style modulation module, and applies forward-backward consistency training to achieve state-of-the-art facial attribute editing and style manipulation on CelebA-HQ.

Deepfake Detection Generalization with Diffusion Noise

cs.CV · 2026-04-16 · unverdicted · novelty 6.0

ANL uses diffusion noise prediction and attention to regularize deepfake detectors for better generalization to unseen synthesis methods without added inference cost.

From Prompts to Context: An Ontology-Driven Framework for Human-Generative AI Collaboration

cs.HC · 2026-05-28 · unverdicted · novelty 5.0

Presents the CCAI ontology and SPARQL retrieval method to convert ephemeral Human-Generative AI prompt interactions into explicit, machine-readable collaboration traces, illustrated in a competency-profile software case study.

FRAMER: Frequency-Aligned Self-Distillation with Adaptive Modulation Leveraging Diffusion Priors for Real-World Image Super-Resolution

cs.CV · 2025-12-01 · unverdicted · novelty 5.0

FRAMER improves real-world super-resolution by decomposing features into low- and high-frequency bands via FFT, applying intra- and inter-contrastive losses with adaptive modulators, and using the final layer as teacher for intermediate layers during diffusion denoising.

NS-Net: Decoupling CLIP Semantic Information through NULL-Space for Generalizable AI-Generated Image Detection

cs.CV · 2025-08-02 · unverdicted · novelty 5.0

NS-Net uses null-space projection on CLIP features plus contrastive learning and patch selection to improve generalization of AI-generated image detectors across 40 unseen generative models.

CCNETS: A Modular Causal Learning Framework for Pattern Recognition in Imbalanced Datasets

cs.LG · 2024-01-07 · unverdicted · novelty 5.0

CCNETS is a new modular causal framework using three cooperative modules and a Zoint mechanism to align synthetic data generation with classifier needs on imbalanced pattern recognition tasks.

Correlation via synthesis: end-to-end nodule image generation and radiogenomic map learning based on generative adversarial network

cs.CV · 2019-07-08 · unverdicted · novelty 5.0

A conditional GAN fuses gene expression profiles with background images at multiple scales to generate synthetic nodule images and learn radiogenomic correlations end-to-end on NSCLC data.

Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling

cs.CV · 2026-04-30 · unverdicted · novelty 5.0

Visual generation models are evolving from passive renderers to interactive agentic world modelers, but current systems lack spatial reasoning, temporal consistency, and causal understanding, with evaluations overemphasizing perceptual quality.

AttDiff-GAN: A Hybrid Diffusion-GAN Framework for Facial Attribute Editing

cs.CV · 2026-04-23 · unverdicted · novelty 5.0

AttDiff-GAN decouples attribute manipulation via feature-level adversarial learning and guides diffusion generation with the edited features, plus PriorMapper and RefineExtractor modules, to achieve more accurate edits and better non-target preservation on CelebA-HQ.

Why we need an AI-resilient society

cs.CY · 2019-12-18 · unverdicted · novelty 4.0

Applies forensic psychology profiling to characterize AI risks via nine features and proposes cognitive sovereignty, measurable control, and partial autonomy as a framework for an AI-resilient society.

A Wasserstein GAN-based climate scenario generator for risk management and insurance: the case of soil subsidence

cs.LG · 2026-04-22 · unverdicted · novelty 4.0

A conditional Wasserstein GAN generates plausible future SWI drought trajectories for French insurance risk management under climate change.

SAGE-GAN: Towards Realistic and Robust Segmentation of Spatially Ordered Nanoparticles via Attention-Guided GANs

cs.CV · 2026-04-04 · unverdicted · novelty 4.0

SAGE-GAN integrates a self-attention U-Net into a CycleGAN framework to generate realistic synthetic electron microscopy image-mask pairs that augment training data for nanoparticle segmentation without human labeling.

citing papers explorer

Showing 25 of 25 citing papers.

Denoising Diffusion Implicit Models cs.LG · 2020-10-06 · unverdicted · none · ref 10 · internal anchor
What Time Is It? How Data Geometry Makes Time Conditioning Optional for Flow Matching cs.LG · 2026-05-08 · unverdicted · none · ref 16
Seeking the Unfamiliar but Memorable: Conceptual Creativity as Meta-Learning cs.LG · 2026-05-15 · unverdicted · none · ref 4 · internal anchor
Deep Learning for CMB Foreground Removal and Beam Deconvolution: A U-Net GAN Approach astro-ph.IM · 2025-08-29 · unverdicted · none · ref 27 · internal anchor
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models cs.CV · 2021-12-20 · accept · none · ref 13
Diffusion Models Beat GANs on Image Synthesis cs.LG · 2021-05-11 · accept · none · ref 27
Diagnosing and Improving Diffusion Models by Estimating the Optimal Loss Value cs.LG · 2025-06-16 · conditional · none · ref 23 · internal anchor
Generative Modeling by Estimating Gradients of the Data Distribution cs.LG · 2019-07-12 · unverdicted · none · ref 26 · internal anchor
Hiding Faces in Plain Sight: Disrupting AI Face Synthesis with Adversarial Perturbations cs.CV · 2019-06-21 · unverdicted · none · ref 1 · internal anchor
Adversarial Learning for Improved Onsets and Frames Music Transcription cs.SD · 2019-06-20 · unverdicted · none · ref 28 · internal anchor
Multiple-Identity Image Attacks Against Face-based Identity Verification cs.CV · 2019-06-20 · unverdicted · none · ref 39 · internal anchor
What Matters for Diffusion-Friendly Latent Manifold? Prior-Aligned Autoencoders for Latent Diffusion cs.CV · 2026-05-08 · unverdicted · none · ref 37
LatRef-Diff: Latent and Reference-Guided Diffusion for Facial Attribute Editing and Style Manipulation cs.CV · 2026-04-23 · unverdicted · none · ref 32
Deepfake Detection Generalization with Diffusion Noise cs.CV · 2026-04-16 · unverdicted · none · ref 22
From Prompts to Context: An Ontology-Driven Framework for Human-Generative AI Collaboration cs.HC · 2026-05-28 · unverdicted · none · ref 41 · internal anchor
FRAMER: Frequency-Aligned Self-Distillation with Adaptive Modulation Leveraging Diffusion Priors for Real-World Image Super-Resolution cs.CV · 2025-12-01 · unverdicted · none · ref 21 · internal anchor
NS-Net: Decoupling CLIP Semantic Information through NULL-Space for Generalizable AI-Generated Image Detection cs.CV · 2025-08-02 · unverdicted · none · ref 2 · internal anchor
CCNETS: A Modular Causal Learning Framework for Pattern Recognition in Imbalanced Datasets cs.LG · 2024-01-07 · unverdicted · none · ref 8 · internal anchor
Correlation via synthesis: end-to-end nodule image generation and radiogenomic map learning based on generative adversarial network cs.CV · 2019-07-08 · unverdicted · none · ref 5 · internal anchor
Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling cs.CV · 2026-04-30 · unverdicted · none · ref 38
AttDiff-GAN: A Hybrid Diffusion-GAN Framework for Facial Attribute Editing cs.CV · 2026-04-23 · unverdicted · none · ref 27
Why we need an AI-resilient society cs.CY · 2019-12-18 · unverdicted · none · ref 11 · internal anchor
A Wasserstein GAN-based climate scenario generator for risk management and insurance: the case of soil subsidence cs.LG · 2026-04-22 · unverdicted · none · ref 29
SAGE-GAN: Towards Realistic and Robust Segmentation of Spatially Ordered Nanoparticles via Attention-Guided GANs cs.CV · 2026-04-04 · unverdicted · none · ref 9
Generative Modeling of Bach-Style Symbolic Music: A Comparative Study of Autoregressive, Latent-Variable, and Adversarial Approaches cs.SD · 2026-06-11 · unverdicted · none · ref 20 · internal anchor
Autoregressive LSTM with attention yields the most coherent Bach-style samples; vector quantization improves VAE structure over standard recurrent VAEs while GANs struggle with training stability and style generalization.
