Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks

Alec Radford, Luke Metz, Soumith Chintala · 2015 · cs.LG · arXiv 1511.06434

Canonical reference. 51 Pith papers cite this work, and 71% of the classified citations cite it as background.

abstract

In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications. Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning. We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning. Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator. Additionally, we use the learned features for novel tasks - demonstrating their applicability as general image representations.

citation-role summary

background: 5 · method: 2
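The role counts above appear to account for the 71% background share quoted for classified citations. A minimal Python sketch of that arithmetic follows. It assumes the percentage is the background count divided by all role-classified citations. The page does not state the formula, and the variable names are illustrative.

```python
# Role counts as listed in the citation-role summary on this page.
role_counts = {"background": 5, "method": 2}

# Assumption: the denominator is the number of role-classified citations,
# not the full set of 51 citing papers.
classified_total = sum(role_counts.values())
background_share = role_counts["background"] / classified_total

# Format the share as a whole-number percentage.
print(f"background share: {background_share:.0%}")
```

Under this assumption, only 7 of the 51 citing papers carry a role classification, so the share describes that classified subset and not the full set of citing papers.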

citation-polarity summary

background: 5 · use method: 2

representative citing papers

Basic syntax from speech: Spontaneous concatenation in unsupervised deep neural networks

cs.CL · 2023-05-02 · unverdicted · novelty 8.0

ciwGAN and fiwGAN models trained on isolated words spontaneously generate concatenated multi-word outputs and display early compositionality precursors.

Toy Models of Superposition

cs.LG · 2022-09-21 · accept · novelty 8.0

Toy models demonstrate that polysemanticity arises when neural networks store more sparse features than neurons via superposition, producing a phase transition tied to polytope geometry and increased adversarial vulnerability.

Generative Language Modeling for Automated Theorem Proving

cs.LG · 2020-09-07 · unverdicted · novelty 8.0

GPT-f, a transformer-based prover for Metamath, generated new short proofs that were accepted into the main library—the first such contribution from a deep-learning system.

AGAN: Towards Automated Design of Generative Adversarial Networks

cs.LG · 2019-06-25 · unverdicted · novelty 8.0

AGAN is the first neural architecture search method for GANs that discovers architectures outperforming state-of-the-art on CIFAR-10 unsupervised image generation and competitive on supervised tasks.

Density estimation using Real NVP

cs.LG · 2016-05-27 · accept · novelty 8.0

Real NVP uses affine coupling layers to create invertible transformations that support exact density estimation, sampling, and latent inference without approximations.

Discriminative Span as a Predictor of Synthetic Data Utility via Classifier Reconstruction

cs.CV · 2026-05-10 · unverdicted · novelty 7.0 · 2 refs

A relative projection error metric in foundation-model embedding space predicts the downstream utility of synthetic positive samples for binary classifiers.

Active Learning for Conditional Generative Compressed Sensing

cs.LG · 2026-05-06 · unverdicted · novelty 7.0

Prompts can be split into separate roles for sampling design and recovery modeling in generative compressed sensing, with stable recovery bounds for matched prompts and an explicit penalty for mismatch, validated on Stable Diffusion.

Physics-informed, Generative Adversarial Design of Funicular Shells

cs.CE · 2026-04-17 · unverdicted · novelty 7.0

A modified DCGAN with an auxiliary discriminator using the membrane factor generates stable, previously unseen funicular shells optimized for pure compression in three dimensions.

SurFITR: A Dataset for Surveillance Image Forgery Detection and Localisation

cs.CV · 2026-04-08 · conditional · novelty 7.0

SurFITR is a new collection of 137k+ surveillance-style forged images that causes existing detectors to degrade while enabling substantial gains when used for training in both in-domain and cross-domain settings.

Toward Generative Quantum Utility via Correlation-Complexity Map

cs.LG · 2026-03-06 · unverdicted · novelty 7.0

A pre-training diagnostic map based on spectral correlation resemblance to IQP circuits and excess structural complexity identifies suitable datasets like turbulence data for quantum generative models, yielding competitive low-resource performance.

ASTRA: Let Arbitrary Subjects Transform in Video Editing

cs.CV · 2025-10-01 · unverdicted · novelty 7.0

ASTRA is a plug-and-play training-free method for precise multi-subject video editing that uses prompt-guided multimodal alignment and prior-based mask retargeting to avoid attention dilution and boundary issues.

Progressive Growing of GANs for Improved Quality, Stability, and Variation

cs.NE · 2017-10-27 · accept · novelty 7.0

Progressive growing stabilizes GAN training to produce high-resolution images of unprecedented quality and achieves a record unsupervised inception score of 8.80 on CIFAR10.

Mixed Precision Training

cs.AI · 2017-10-10 · accept · novelty 7.0

Mixed precision training uses FP16 for most computations, FP32 master weights for accumulation, and loss scaling to enable accurate training of large DNNs with halved memory usage.

Vision Foundation Models as Generalist Tokenizers for Image Generation

cs.CV · 2026-05-18 · unverdicted · novelty 6.0

VFMTok builds a generalist image tokenizer on frozen VFMs using adaptive quantization and semantic alignment, delivering gFID 1.36 for autoregressive and 1.25 for continuous generation on ImageNet with 3x faster convergence.

Neural Fields for NV-Center Inverse Sensing

cs.LG · 2026-05-13 · unverdicted · novelty 6.0

NeTMY neural fields with annealed encoding, multiscale optimization, and spectrum-fidelity losses achieve superior localization and distributional accuracy in NV-center inverse sensing by using a tensor power-summed dipolar operator that exposes and mitigates center-collapse failures.

Enabling Federated Inference via Unsupervised Consensus Embedding

cs.LG · 2026-05-07 · unverdicted · novelty 6.0

CE-FI maps heterogeneous model representations to a shared embedding space via unsupervised training on unlabeled data, enabling privacy-preserving federated inference that outperforms solo models on image classification benchmarks.

A Dual Perspective on Synthetic Trajectory Generators: Utility Framework and Privacy Vulnerabilities

cs.AI · 2026-04-21 · unverdicted · novelty 6.0

A new framework evaluates utility of synthetic mobility trajectories while a membership inference attack reveals privacy vulnerabilities in generative models thought to be safe.

Embedding Arithmetic: A Lightweight, Tuning-Free Framework for Post-hoc Bias Mitigation in Text-to-Image Models

cs.CV · 2026-04-20 · unverdicted · novelty 6.0

Embedding Arithmetic performs vector operations in the embedding space of T2I models to mitigate bias at inference time, outperforming baselines on diversity while preserving coherence via a new Concept Coherence Score.

FatigueFusion: Latent Space Fusion for Fatigue-Driven Motion Synthesis

cs.GR · 2026-04-11 · unverdicted · novelty 6.0

FatigueFusion fuses fatigue features in latent space using algorithmic, data-driven, and PINN modules to synthesize novel fatigued motions from non-fatigued joint sequences in an end-to-end pipeline.

gen2seg: Generative Models Enable Generalizable Instance Segmentation

cs.CV · 2025-05-21 · unverdicted · novelty 6.0

Finetuning generative models on limited instance segmentation data produces zero-shot generalization to unseen object categories and styles, matching or exceeding supervised baselines like SAM on ambiguous boundaries.

"Noisier" Noise Contrastive Estimation is (Almost) Maximum Likelihood

cs.LG · 2024-05-27 · unverdicted · novelty 6.0

Scaling noise magnitude in NCE aligns gradients with MLE, enabling a practical approximation that improves performance on CIFAR-10 and ImageNet image modeling with fewer training steps.

MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework

cs.AI · 2023-08-01 · unverdicted · novelty 6.0

MetaGPT embeds human SOPs into LLM prompts to create role-specialized agent teams that produce more coherent solutions on collaborative software engineering tasks than prior chat-based multi-agent systems.

VideoGPT: Video Generation using VQ-VAE and Transformers

cs.CV · 2021-04-20 · accept · novelty 6.0

VideoGPT generates competitive natural videos by learning discrete latents with VQ-VAE and modeling them autoregressively with a transformer.

Dual Adversarial Semantics-Consistent Network for Generalized Zero-Shot Learning

cs.CV · 2019-07-12 · unverdicted · novelty 6.0

DASCN uses a unified primal-dual GAN architecture to generate semantics-consistent visual features for generalized zero-shot learning, claiming state-of-the-art gains.

citing papers explorer

Showing 50 of 51 citing papers.

Basic syntax from speech: Spontaneous concatenation in unsupervised deep neural networks cs.CL · 2023-05-02 · unverdicted · none · ref 57 · internal anchor
ciwGAN and fiwGAN models trained on isolated words spontaneously generate concatenated multi-word outputs and display early compositionality precursors.
Toy Models of Superposition cs.LG · 2022-09-21 · accept · none · ref 3 · internal anchor
Toy models demonstrate that polysemanticity arises when neural networks store more sparse features than neurons via superposition, producing a phase transition tied to polytope geometry and increased adversarial vulnerability.
Generative Language Modeling for Automated Theorem Proving cs.LG · 2020-09-07 · unverdicted · none · ref 9 · internal anchor
GPT-f, a transformer-based prover for Metamath, generated new short proofs that were accepted into the main library—the first such contribution from a deep-learning system.
AGAN: Towards Automated Design of Generative Adversarial Networks cs.LG · 2019-06-25 · unverdicted · none · ref 11 · internal anchor
AGAN is the first neural architecture search method for GANs that discovers architectures outperforming state-of-the-art on CIFAR-10 unsupervised image generation and competitive on supervised tasks.
Density estimation using Real NVP cs.LG · 2016-05-27 · accept · none · ref 47 · internal anchor
Real NVP uses affine coupling layers to create invertible transformations that support exact density estimation, sampling, and latent inference without approximations.
Discriminative Span as a Predictor of Synthetic Data Utility via Classifier Reconstruction cs.CV · 2026-05-10 · unverdicted · none · ref 12 · 2 links · internal anchor
A relative projection error metric in foundation-model embedding space predicts the downstream utility of synthetic positive samples for binary classifiers.
Active Learning for Conditional Generative Compressed Sensing cs.LG · 2026-05-06 · unverdicted · none · ref 44 · internal anchor
Prompts can be split into separate roles for sampling design and recovery modeling in generative compressed sensing, with stable recovery bounds for matched prompts and an explicit penalty for mismatch, validated on Stable Diffusion.
Physics-informed, Generative Adversarial Design of Funicular Shells cs.CE · 2026-04-17 · unverdicted · none · ref 28 · internal anchor
A modified DCGAN with an auxiliary discriminator using the membrane factor generates stable, previously unseen funicular shells optimized for pure compression in three dimensions.
SurFITR: A Dataset for Surveillance Image Forgery Detection and Localisation cs.CV · 2026-04-08 · conditional · none · ref 36 · internal anchor
SurFITR is a new collection of 137k+ surveillance-style forged images that causes existing detectors to degrade while enabling substantial gains when used for training in both in-domain and cross-domain settings.
Toward Generative Quantum Utility via Correlation-Complexity Map cs.LG · 2026-03-06 · unverdicted · none · ref 44 · internal anchor
A pre-training diagnostic map based on spectral correlation resemblance to IQP circuits and excess structural complexity identifies suitable datasets like turbulence data for quantum generative models, yielding competitive low-resource performance.
ASTRA: Let Arbitrary Subjects Transform in Video Editing cs.CV · 2025-10-01 · unverdicted · none · ref 17 · internal anchor
ASTRA is a plug-and-play training-free method for precise multi-subject video editing that uses prompt-guided multimodal alignment and prior-based mask retargeting to avoid attention dilution and boundary issues.
Progressive Growing of GANs for Improved Quality, Stability, and Variation cs.NE · 2017-10-27 · accept · none · ref 41 · internal anchor
Progressive growing stabilizes GAN training to produce high-resolution images of unprecedented quality and achieves a record unsupervised inception score of 8.80 on CIFAR10.
Mixed Precision Training cs.AI · 2017-10-10 · accept · none · ref 26 · internal anchor
Mixed precision training uses FP16 for most computations, FP32 master weights for accumulation, and loss scaling to enable accurate training of large DNNs with halved memory usage.
Vision Foundation Models as Generalist Tokenizers for Image Generation cs.CV · 2026-05-18 · unverdicted · none · ref 62 · internal anchor
VFMTok builds a generalist image tokenizer on frozen VFMs using adaptive quantization and semantic alignment, delivering gFID 1.36 for autoregressive and 1.25 for continuous generation on ImageNet with 3x faster convergence.
Neural Fields for NV-Center Inverse Sensing cs.LG · 2026-05-13 · unverdicted · none · ref 58 · internal anchor
NeTMY neural fields with annealed encoding, multiscale optimization, and spectrum-fidelity losses achieve superior localization and distributional accuracy in NV-center inverse sensing by using a tensor power-summed dipolar operator that exposes and mitigates center-collapse failures.
Enabling Federated Inference via Unsupervised Consensus Embedding cs.LG · 2026-05-07 · unverdicted · none · ref 43 · internal anchor
CE-FI maps heterogeneous model representations to a shared embedding space via unsupervised training on unlabeled data, enabling privacy-preserving federated inference that outperforms solo models on image classification benchmarks.
A Dual Perspective on Synthetic Trajectory Generators: Utility Framework and Privacy Vulnerabilities cs.AI · 2026-04-21 · unverdicted · none · ref 100 · internal anchor
A new framework evaluates utility of synthetic mobility trajectories while a membership inference attack reveals privacy vulnerabilities in generative models thought to be safe.
Embedding Arithmetic: A Lightweight, Tuning-Free Framework for Post-hoc Bias Mitigation in Text-to-Image Models cs.CV · 2026-04-20 · unverdicted · none · ref 31 · internal anchor
Embedding Arithmetic performs vector operations in the embedding space of T2I models to mitigate bias at inference time, outperforming baselines on diversity while preserving coherence via a new Concept Coherence Score.
FatigueFusion: Latent Space Fusion for Fatigue-Driven Motion Synthesis cs.GR · 2026-04-11 · unverdicted · none · ref 50 · internal anchor
FatigueFusion fuses fatigue features in latent space using algorithmic, data-driven, and PINN modules to synthesize novel fatigued motions from non-fatigued joint sequences in an end-to-end pipeline.
gen2seg: Generative Models Enable Generalizable Instance Segmentation cs.CV · 2025-05-21 · unverdicted · none · ref 17 · internal anchor
Finetuning generative models on limited instance segmentation data produces zero-shot generalization to unseen object categories and styles, matching or exceeding supervised baselines like SAM on ambiguous boundaries.
"Noisier" Noise Contrastive Estimation is (Almost) Maximum Likelihood cs.LG · 2024-05-27 · unverdicted · none · ref 9 · internal anchor
Scaling noise magnitude in NCE aligns gradients with MLE, enabling a practical approximation that improves performance on CIFAR-10 and ImageNet image modeling with fewer training steps.
MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework cs.AI · 2023-08-01 · unverdicted · none · ref 228 · internal anchor
MetaGPT embeds human SOPs into LLM prompts to create role-specialized agent teams that produce more coherent solutions on collaborative software engineering tasks than prior chat-based multi-agent systems.
VideoGPT: Video Generation using VQ-VAE and Transformers cs.CV · 2021-04-20 · accept · none · ref 27 · internal anchor
VideoGPT generates competitive natural videos by learning discrete latents with VQ-VAE and modeling them autoregressively with a transformer.
Dual Adversarial Semantics-Consistent Network for Generalized Zero-Shot Learning cs.CV · 2019-07-12 · unverdicted · none · ref 21 · internal anchor
DASCN uses a unified primal-dual GAN architecture to generate semantics-consistent visual features for generalized zero-shot learning, claiming state-of-the-art gains.
Dual Adversarial Learning with Attention Mechanism for Fine-grained Medical Image Synthesis eess.IV · 2019-07-07 · unverdicted · none · ref 17 · internal anchor
Dual-discriminator GAN with adversarial attention improves fine-grained medical image synthesis, especially in hard-to-generate tumor regions, and outperforms prior methods on brain tumor and CT-to-MRI tasks.
RED: A ReRAM-based Deconvolution Accelerator cs.ET · 2019-07-05 · unverdicted · none · ref 14 · internal anchor
RED introduces pixel-wise mapping and zero-skipping dataflow for ReRAM deconvolution acceleration, reporting 1.15x-3.69x speedup and 8%-88.36% energy reduction versus prior ReRAM accelerators.
A Halo Merger Tree Generation and Evaluation Framework astro-ph.CO · 2019-06-22 · unverdicted · none · ref 32 · internal anchor
A GAN framework is trained on EAGLE simulation merger trees to generate new realistic trees for semi-analytic galaxy models at modest computational cost.
Demystifying MMD GANs stat.ML · 2018-01-04 · accept · none · ref 42 · internal anchor
MMD GANs have unbiased critic gradients but biased generator gradients from sample-based learning, and the Kernel Inception Distance provides a practical new measure for GAN convergence and dynamic learning rate adaptation.
Are Candidate Models Really Needed for Active Learning? cs.CV · 2026-05-14 · unverdicted · none · ref 64 · internal anchor
Active learning with randomly initialized models achieves comparable results to traditional candidate-model methods, with low-confidence sampling proving most effective.
Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling cs.CV · 2026-04-30 · unverdicted · none · ref 64 · internal anchor
Visual generation models are evolving from passive renderers to interactive agentic world modelers, but current systems lack spatial reasoning, temporal consistency, and causal understanding, with evaluations overemphasizing perceptual quality.
ACPO: Anchor-Constrained Perceptual Optimization for Diffusion Models with No-Reference Quality Guidance cs.CV · 2026-04-29 · unverdicted · none · ref 21 · internal anchor
ACPO uses anchor-based regularization with NR-IQA guidance to enable stable perceptual quality improvements in diffusion model fine-tuning.
Improving Diversity in Black-box Few-shot Knowledge Distillation cs.CV · 2026-04-28 · unverdicted · none · ref 59 · internal anchor
An adaptive high-confidence image selection scheme during GAN training expands diversity in the distillation set for black-box few-shot KD and yields SOTA student accuracy on seven image datasets.
A Geometric Algebra-informed NeRF Framework for Generalizable Wireless Channel Prediction cs.NI · 2026-04-13 · unverdicted · none · ref 44 · internal anchor
GAI-NeRF combines geometric algebra attention and an adaptive ray tracing module inside a NeRF model to deliver more accurate and generalizable wireless channel predictions across varied indoor environments.
Unsupervised Detection of Spatiotemporal Anomalies in PMU Data Using Transformer-Based BiGAN cs.LG · 2025-09-30 · unverdicted · none · ref 11 · internal anchor
T-BiGAN integrates window-attention Transformers in a BiGAN to achieve ROC-AUC 0.95 and average precision 0.996 for unsupervised spatiotemporal anomaly detection in PMU data.
Quantum generative modeling for financial time series with temporal correlations quant-ph · 2025-07-29 · unverdicted · none · ref 4 · internal anchor
QGANs with quantum generators and classical discriminators generate financial time series matching target distributions and desired temporal correlations, with quality varying by circuit depth, bond dimension, and simulation method.
CCNETS: A Modular Causal Learning Framework for Pattern Recognition in Imbalanced Datasets cs.LG · 2024-01-07 · unverdicted · none · ref 6 · internal anchor
CCNETS is a new modular causal framework using three cooperative modules and a Zoint mechanism to align synthetic data generation with classifier needs on imbalanced pattern recognition tasks.
Synthetic Augmentation and Feature-based Filtering for Improved Cervical Histopathology Image Classification eess.IV · 2019-07-24 · unverdicted · none · ref 13 · internal anchor
cGAN data augmentation with feature-based filtering improves ResNet18 CIN grading accuracy from 66.3% to 71.7% on segmented epithelium patches.
Affine Disentangled GAN for Interpretable and Robust AV Perception cs.CV · 2019-07-06 · unverdicted · none · ref 22 · internal anchor
ADIS-GAN disentangles affine transformations in a GAN to achieve over 98% classification accuracy on MNIST within 30 degrees rotation and over 90% under FGSM and PGD attacks while generating rotation and scaling factors.
Generative Counterfactual Introspection for Explainable Deep Learning cs.LG · 2019-07-06 · unverdicted · none · ref 30 · internal anchor
A generative-model-driven introspection method produces counterfactual image edits to explain deep neural network predictions on MNIST and CelebA.
Disentangled Makeup Transfer with Generative Adversarial Network cs.CV · 2019-07-02 · unverdicted · none · ref 28 · internal anchor
DMT uses identity and makeup encoders in a GAN to enable controllable makeup transfer from references and sampling of new styles from a prior distribution.
Enhancing the accuracy of under-resolved numerical simulations of atmospheric flows with super resolution physics.flu-dyn · 2026-04-10 · unverdicted · none · ref 64 · internal anchor
A multi-scale CNN super-resolution model outperforms baseline CNN, attention CNN, and diffusion-based approaches in reconstructing fine-scale features from under-resolved atmospheric flow simulations on standard benchmarks.
Improving conditional generative adversarial networks for inverse design of plasmonic structures physics.optics · 2025-11-14 · unverdicted · none · ref 3 · internal anchor
Adding label projection and a novel embedding network to cGANs cuts mean absolute error by up to an order of magnitude and makes training converge over three times faster for plasmonic inverse design.
Diving Deeper into Underwater Image Enhancement: A Survey cs.CV · 2019-07-17 · accept · none · ref 46 · internal anchor
A comprehensive survey of deep learning-based underwater image enhancement with systematic experimental comparison of algorithms on multiple datasets.
Mean Spectral Normalization of Deep Neural Networks for Embedded Automation cs.LG · 2019-07-09 · unverdicted · none · ref 28 · internal anchor
Proposes MSN reparameterization to address mean-drift in SN, claiming ~16% faster inference than BN with fewer parameters on CNNs and GANs.
MIDI-Sandwich: Multi-model Multi-task Hierarchical Conditional VAE-GAN networks for Symbolic Single-track Music Generation eess.AS · 2019-07-02 · unverdicted · none · ref 18 · internal anchor
MIDI-Sandwich is a hierarchical VAE-GAN architecture that generates structured 136-beat melodies by modeling local bars and global relationships on the Nottingham dataset.
GAN-Knowledge Distillation for one-stage Object Detection cs.CV · 2019-06-20 · unverdicted · none · ref 13 · internal anchor
A GAN-based adversarial training method distills knowledge from teacher to student networks by treating their feature maps as real and fake samples to boost one-stage object detector performance.
Synthetic data in cryptocurrencies using generative models cs.LG · 2026-04-17 · unverdicted · none · ref 2 · internal anchor
CGANs with LSTM generator can produce synthetic crypto price series that reproduce temporal patterns and preserve market trends and dynamics.
Federated Breast Cancer Detection Enhanced by Synthetic Ultrasound Image Augmentation eess.IV · 2025-06-29 · conditional · none · ref 11 · internal anchor
Balanced synthetic image augmentation via GANs and diffusion models raises average AUC from 0.9206 to 0.9362 for FedAvg and 0.9429 to 0.9574 for FedProx in federated breast ultrasound classification.
A Geometric Algebra-Informed 3DGS Framework for Wireless Channel Prediction cs.NI · 2026-05-18 · unreviewed · ref 25 · internal anchor
One-Step Generative Modeling via Wasserstein Gradient Flows cs.LG · 2026-05-12 · unreviewed · ref 48 · internal anchor
