StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators
Fields: cs.CV

Representative citing papers (4):
- Prompt-to-Prompt Image Editing with Cross Attention Control
  Cross-attention control in text-conditioned models enables localized and global image edits by editing only the input text prompt.
- An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
  Textual Inversion learns a single embedding vector from a few images to represent personal concepts inside the text embedding space of a frozen text-to-image model, enabling their composition in natural language prompts.
- Hierarchical Text-Conditional Image Generation with CLIP Latents
  A hierarchical prior-decoder model using CLIP latents generates more diverse text-conditional images than direct methods while preserving photorealism and caption fidelity.
- GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
  A 3.5-billion-parameter diffusion model with classifier-free guidance generates images preferred over DALL-E by human raters and can be fine-tuned for text-guided inpainting.
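The Prompt-to-Prompt entry above hinges on reusing a source prompt's cross-attention maps during a second pass with an edited prompt, so the spatial layout follows the original image while the new words steer the content. A minimal NumPy sketch of that injection idea; the single-head setup, shapes, and function names are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def cross_attention(queries, keys, values, injected_attn=None):
    """Scaled dot-product cross-attention.

    If injected_attn is given (attention maps saved from a pass over the
    source prompt), it is used in place of the freshly computed maps --
    the core mechanism behind prompt-to-prompt editing.
    """
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
    if injected_attn is not None:
        attn = injected_attn  # reuse the source prompt's attention maps
    return attn @ values, attn

rng = np.random.default_rng(0)
q = rng.standard_normal((4, 8))      # 4 image-patch queries, dim 8
k_src = rng.standard_normal((3, 8))  # keys for 3 source-prompt tokens
v_src = rng.standard_normal((3, 8))  # values for the source prompt

# Pass 1: run the source prompt and save its attention maps.
_, src_maps = cross_attention(q, k_src, v_src)

# Pass 2: edited prompt of the same length -- inject the saved maps so
# the layout tracks the source while values carry the edited words.
v_edit = rng.standard_normal((3, 8))
out, used_maps = cross_attention(q, k_src, v_edit, injected_attn=src_maps)
```

In the actual diffusion setting this swap happens inside every cross-attention layer at each denoising step; the sketch only shows the map-saving and map-injection mechanics on a single toy attention call.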