High-resolution image synthesis with latent diffusion models

Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, Björn Ommer · 2022

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

OmniSonic: Towards Universal and Holistic Audio Generation from Video and Text

cs.SD · 2026-04-06 · unverdicted · novelty 7.0

OmniSonic introduces a TriAttn-DiT architecture with MoE gating to jointly generate on-screen, off-screen, and speech audio from video and text, outperforming prior models on a new UniHAGen-Bench.

DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation

cs.CV · 2025-11-24 · conditional · novelty 6.0

DeCo decouples high- and low-frequency generation in pixel diffusion via a DiT plus lightweight decoder and a frequency-aware flow-matching loss, reaching FID 1.62 at 256x256 and 2.22 at 512x512 on ImageNet while closing the gap to latent diffusion methods.

ProGIC: Progressive and Lightweight Generative Image Compression with Residual Vector Quantization

cs.CV · 2026-03-03 · unverdicted · novelty 5.0

ProGIC applies residual vector quantization with a lightweight CNN-attention backbone to deliver progressive generative image compression with claimed perceptual gains and over 10x faster encoding/decoding versus MS-ILLM.

citing papers explorer

Showing 3 of 3 citing papers.

OmniSonic: Towards Universal and Holistic Audio Generation from Video and Text
cs.SD · 2026-04-06 · unverdicted · none · ref 38
DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation
cs.CV · 2025-11-24 · conditional · none · ref 46
ProGIC: Progressive and Lightweight Generative Image Compression with Residual Vector Quantization
cs.CV · 2026-03-03 · unverdicted · none · ref 46
