LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop
26 papers cite this work.
abstract
While there has been remarkable progress in the performance of visual recognition algorithms, the state-of-the-art models tend to be exceptionally data-hungry. Large labeled training datasets, expensive and tedious to produce, are required to optimize millions of parameters in deep network models. Lagging behind the growth in model capacity, the available datasets are quickly becoming outdated in terms of size and density. To circumvent this bottleneck, we propose to amplify human effort through a partially automated labeling scheme, leveraging deep learning with humans in the loop. Starting from a large set of candidate images for each category, we iteratively sample a subset, ask people to label them, classify the others with a trained model, split the set into positives, negatives, and unlabeled based on the classification confidence, and then iterate with the unlabeled set. To assess the effectiveness of this cascading procedure and enable further progress in visual recognition research, we construct a new image dataset, LSUN. It contains around one million labeled images for each of 10 scene categories and 20 object categories. We experiment with training popular convolutional networks and find that they achieve substantial performance gains when trained on this dataset.
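The iterative labeling cascade described in the abstract (sample, human-label, train, split by confidence, recurse on the unlabeled remainder) can be sketched as follows. The function names `ask_humans`, `train`, and `predict_conf`, and the confidence thresholds, are illustrative placeholders, not the paper's actual interface:

```python
import random

def label_cascade(candidates, ask_humans, train, predict_conf,
                  pos_thresh=0.95, neg_thresh=0.05, batch=1000, rounds=5):
    """Human-in-the-loop labeling cascade (sketch, hypothetical interface):
    sample a subset, ask people to label it, train a model, split the rest
    into positives/negatives/unlabeled by confidence, iterate on the rest."""
    positives, negatives, labeled = [], [], []
    unlabeled = list(candidates)
    for _ in range(rounds):
        if not unlabeled:
            break
        # 1) sample a subset and ask people to label it
        sample = random.sample(unlabeled, min(batch, len(unlabeled)))
        labeled += [(x, ask_humans(x)) for x in sample]
        picked = set(sample)
        unlabeled = [x for x in unlabeled if x not in picked]
        # 2) train a model on everything labeled so far
        model = train(labeled)
        # 3) split the remainder by classification confidence
        still_unlabeled = []
        for x in unlabeled:
            p = predict_conf(model, x)  # model's P(x belongs to the category)
            if p >= pos_thresh:
                positives.append(x)
            elif p <= neg_thresh:
                negatives.append(x)
            else:
                still_unlabeled.append(x)  # too uncertain: next round
        unlabeled = still_unlabeled
    return positives, negatives, labeled, unlabeled
```

With confident classifiers and loose thresholds the loop terminates early because the unlabeled pool empties; with tight thresholds most items keep cycling back to human annotators, which is the trade-off the paper's scheme is designed to amortize.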
citing papers explorer
- Consistency Models
  Consistency models achieve fast one-step generation with SOTA FID of 3.55 on CIFAR-10 and 6.20 on ImageNet 64x64 by directly mapping noise to data, outperforming prior distillation techniques.
- Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow
  Rectified flow learns straight-path neural ODEs for distribution transport, yielding efficient generative models and domain transfers that work well even with a single simulation step.
- Denoising Diffusion Probabilistic Models
  Denoising diffusion probabilistic models generate high-quality images by learning to reverse a fixed forward diffusion process, achieving FID 3.17 on CIFAR10.
- Density estimation using Real NVP
  Real NVP uses affine coupling layers to create invertible transformations that support exact density estimation, sampling, and latent inference without approximations.
- Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
  DCGANs with architectural constraints learn a hierarchy of representations from object parts to scenes in both generator and discriminator across image datasets.
- Proximal-Based Generative Modeling for Bayesian Inverse Problems
  PGM replaces the intractable likelihood score in diffusion models with a closed-form Moreau score computed via proximal operators, enabling non-asymptotic sampling for inverse problems trained only on prior data.
- ImageAttributionBench: How Far Are We from Generalizable Attribution?
  ImageAttributionBench is a benchmark dataset demonstrating that state-of-the-art image attribution methods lack robustness to image degradation and fail to generalize to semantically disjoint domains.
- From Diffusion to Rectified Flow: Rethinking Text-Based Segmentation
  RLFSeg repurposes pretrained generative models via Rectified Flow for direct latent-space image-to-mask mapping in text-based segmentation, outperforming diffusion-based methods especially in zero-shot cases.
- GeoEdit: Local Frames for Fast, Training-Free On-Manifold Editing in Diffusion Models
  GeoEdit constructs local tangent frames from small perturbations to initial noise, enabling Jacobian-free on-manifold edits in diffusion models via alternating tangent steps and diffusion projections.
- Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
  Latent Consistency Models enable high-fidelity text-to-image generation in 2-4 steps by directly predicting solutions to the probability flow ODE in latent space, distilled from pre-trained LDMs.
- Diffusion Posterior Sampling for General Noisy Inverse Problems
  Diffusion models solve noisy (non)linear inverse problems via approximated posterior sampling that blends diffusion steps with manifold gradients without strict consistency projection.
- High-Resolution Image Synthesis with Latent Diffusion Models
  Latent diffusion models achieve state-of-the-art inpainting and competitive results on unconditional generation, scene synthesis, and super-resolution by performing the diffusion process in the latent space of pretrained autoencoders with cross-attention conditioning, while cutting computational costs relative to pixel-space diffusion.
- Diffusion Models Beat GANs on Image Synthesis
  Diffusion models with architecture improvements and classifier guidance achieve superior FID scores to GANs on unconditional and conditional ImageNet image synthesis.
- Progressive Growing of GANs for Improved Quality, Stability, and Variation
  Progressive growing stabilizes GAN training to produce high-resolution images of unprecedented quality and achieves a record unsupervised inception score of 8.80 on CIFAR10.
- Score-Based Generative Modeling through Anisotropic Stochastic Partial Differential Equations
  Anisotropic SPDEs preserve geometric data structure over longer timescales in score-based generative modeling, yielding better image quality than standard SDE baselines and flow matching in unconditional and conditional tasks.
- Improving Generative Adversarial Networks with Self-Distillation
  SD-GAN uses the EMA generator as a teacher to distill perceptual knowledge to the training generator, improving FID scores, stabilizing training, and providing guidance uncorrelated with standard adversarial loss.
- Conditional Diffusion Under Linear Constraints: Langevin Mixing and Information-Theoretic Guarantees
  Error in approximating the tangent conditional score by the unconditional score in diffusion models is bounded by dimension-free conditional mutual information, with a projected-Langevin method outperforming baselines in inpainting and super-resolution.
- TTL: Test-time Textual Learning for OOD Detection with Pretrained Vision-Language Models
  TTL dynamically learns OOD textual semantics from unlabeled test streams via prompt updates, purification, and a knowledge bank to improve detection performance in pretrained VLMs.
- Combating Pattern and Content Bias: Adversarial Feature Learning for Generalized AI-Generated Image Detection
  MAFL uses adversarial training to suppress pattern and content biases, guiding models to learn shared generative features for better cross-model generalization in detecting AI images.
- Detecting Diffusion-generated Images via Dynamic Assembly Forests
  DAF is a novel deep forest-based detector for diffusion-generated images that uses fewer parameters and less computation than DNN methods while matching their performance.
- Variational Encoder-Multi-Decoder (VE-MD) for Privacy-by-functional-design (Group) Emotion Recognition
  VE-MD uses a shared variational latent space jointly optimized for group affect classification and structural body/face decoding, delivering SOTA results on GAF-3.0 and VGAF while never producing individual emotion or identity outputs.
- Depth Anything V2
  Depth Anything V2 delivers finer, more robust monocular depth predictions by replacing real labeled images with synthetic data, scaling the teacher model, and using large-scale pseudo-labeled real images for student training.
- Micro-Defects Expose Macro-Fakes: Detecting AI-Generated Images via Local Distributional Shifts
  MDMF detects AI-generated images by learning patch-level forensic signatures and quantifying their distributional discrepancies with MMD, yielding larger separation than global methods when micro-defects are present.
- HiMix: Hierarchical Artifact-aware Mixup for Generalized Synthetic Image Detection
  HiMix combines mixup augmentation to create transitional real-fake samples with hierarchical global-local artifact feature fusion to achieve better generalization in detecting AI-generated images from unseen generators.
- ACPO: Anchor-Constrained Perceptual Optimization for Diffusion Models with No-Reference Quality Guidance
  ACPO uses anchor-based regularization with NR-IQA guidance to enable stable perceptual quality improvements in diffusion model fine-tuning.
- Elucidating the SNR-t Bias of Diffusion Probabilistic Models
  Diffusion models have an SNR-timestep mismatch during inference that the authors mitigate with per-frequency differential correction, raising generation quality across IDDPM, ADM, DDIM and others.
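As a note on the MDMF entry above: the MMD statistic it mentions, used there to quantify distributional discrepancies between patch-level features, can be computed in closed form from two sample sets. The sketch below uses an RBF kernel; the feature extraction step is elided and all names are illustrative, not MDMF's actual code:

```python
import numpy as np

def rbf_mmd2(X, Y, sigma=1.0):
    """Squared maximum mean discrepancy between samples X (n,d) and Y (m,d)
    under an RBF kernel: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]."""
    def k(A, B):
        # Pairwise squared distances via broadcasting, then Gaussian kernel.
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()
```

A detector in this style would score an image by the MMD between its patch-feature distribution and a reference distribution of real-image patches; a larger value indicates a larger distributional shift.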