ILVR: Conditioning method for denoising diffusion probabilistic models

ILVR: Conditioning method for denoising diffusion probabilistic models · 2021 · arXiv 2108.02938

20 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow

cs.LG · 2022-09-07 · unverdicted · novelty 8.0

Rectified flow learns straight-path neural ODEs for distribution transport, yielding efficient generative models and domain transfers that work well even with a single simulation step.

An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion

cs.CV · 2022-08-02 · unverdicted · novelty 8.0

Textual Inversion learns a single embedding vector from a few images to represent personal concepts inside the text embedding space of a frozen text-to-image model, enabling their composition in natural language prompts.

Spectral Guidance for Flexible and Efficient Control of Diffusion Models

cs.LG · 2026-05-27 · unverdicted · novelty 7.0

Spectral Guidance learns singular functions via a self-supervised objective to project guidance signals onto diffusion sampling trajectories, enabling stable control without retraining or backpropagation and improving CIFAR-10 accuracy by 37 points with 4x faster sampling.

Latent Fourier Transform

cs.SD · 2026-04-20 · unverdicted · novelty 7.0

LatentFT uses latent-space Fourier transforms and frequency masking in diffusion autoencoders to enable timescale-specific manipulation of musical structure in generative models.

Conflated Inverse Modeling to Generate Diverse and Temperature-Change Inducing Urban Vegetation Patterns

cs.CV · 2026-04-14 · unverdicted · novelty 7.0

A diffusion generative inverse model conditioned on temperature targets produces diverse, physically plausible urban vegetation patterns that achieve specified regional temperature shifts.

LPNSR: Optimal Noise-Guided Diffusion Image Super-Resolution Via Learnable Noise Prediction

cs.CV · 2026-03-22 · conditional · novelty 7.0

LPNSR derives optimal intermediate noise for diffusion SR via MLE and implements it with an LR-guided noise predictor, reaching SOTA perceptual quality in 4 steps without text priors.

LooseRoPE: Content-aware Attention Manipulation for Semantic Harmonization

cs.GR · 2026-01-08 · unverdicted · novelty 7.0

LooseRoPE modulates RoPE in diffusion attention maps to continuously trade off between preserving a pasted object's identity and harmonizing it with its new surroundings.

SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations

cs.CV · 2021-08-02 · conditional · novelty 7.0

SDEdit performs guided image synthesis and editing by adding noise to inputs and refining them via denoising with a diffusion model's SDE prior, outperforming GAN methods in human studies without task-specific training.

Conditional Diffusion Under Linear Constraints: Langevin Mixing and Information-Theoretic Guarantees

cs.LG · 2026-05-06 · unverdicted · novelty 6.0

Error in approximating the tangent conditional score by the unconditional score in diffusion models is bounded by dimension-free conditional mutual information, with a projected-Langevin method outperforming baselines in inpainting and super-resolution.

Cross-Modal Generation: From Commodity WiFi to High-Fidelity mmWave and RFID Sensing

cs.LG · 2026-04-17 · unverdicted · novelty 6.0

RF-CMG synthesizes high-quality mmWave and RFID signals from WiFi using a diffusion model with Modality-Guided Embedding for high-frequency details and Low-Frequency Modality Consistency to preserve physical structure.

StructDiff: A Structure-Preserving and Spatially Controllable Diffusion Model for Single-Image Generation

cs.CV · 2026-04-14 · unverdicted · novelty 6.0

StructDiff adds adaptive receptive fields and 3D positional encoding to a single-scale diffusion model to preserve structure and enable spatial control in single-image generation.

Analyzing and Guiding Zero-Shot Posterior Sampling in Diffusion Models

cs.LG · 2026-02-07 · unverdicted · novelty 6.0

Under a Gaussian prior assumption, zero-shot diffusion posterior samplers for inverse problems admit closed-form spectral representations that enable a new parameter-selection framework balancing perceptual quality and signal fidelity.

Regional climate risk assessment from climate models using probabilistic machine learning

cs.LG · 2024-12-11 · unverdicted · novelty 6.0

GenFocal uses probabilistic ML to downscale coarse climate projections to fine-scale weather events without paired training data and samples rare high-impact events more accurately than prior methods.

DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory

cs.CV · 2023-08-16 · unverdicted · novelty 6.0

DragNUWA integrates text, image, and trajectory controls into a diffusion video model using a Trajectory Sampler, Multiscale Fusion, and Adaptive Training to enable fine-grained open-domain video generation.

TopoStyle: Supporting Iterative Design with Generative AI for 2.5D Topology Optimization

cs.HC · 2026-04-23 · unverdicted · novelty 5.0

TopoStyle provides an interactive system using 2D diffusion models for 2.5D topology optimization that supports hand-drawn and point-based edits plus masking to enable iterative customization balancing performance and aesthetics.

SocialMirror: Reconstructing 3D Human Interaction Behaviors from Monocular Videos with Semantic and Geometric Guidance

cs.CV · 2026-04-15 · unverdicted · novelty 5.0

SocialMirror reconstructs 3D meshes of closely interacting humans from monocular videos using semantic guidance from vision-language models and geometric constraints in a diffusion model to handle occlusions and maintain temporal and spatial consistency.

Dual Ascent Diffusion for Inverse Problems

cs.CV · 2025-05-23 · unverdicted · novelty 5.0

A dual ascent optimization framework is introduced for MAP estimation with diffusion priors, claimed to outperform prior methods on image restoration in quality, noise robustness, speed, and data fidelity.

SOWing Information: Cultivating Contextual Coherence with MLLMs in Image Generation

cs.CV · 2024-11-28 · unverdicted · novelty 5.0

SOW uses MLLMs and attention to selectively control unidirectional diffusion for pixel-level fidelity and contextual coherence in text-vision-to-image tasks.

Breaking the Lock-in: Diversifying Text-to-Image Generation via Representation Modulation

cs.CV · 2026-06-05 · unverdicted · novelty 4.0

Early DC component convergence in text-to-image Transformer features causes output homogeneity; selective early attenuation via DAVE improves diversity without retraining or extra cost.

MENO: MeanFlow-Enhanced Neural Operators for Dynamical Systems

cs.LG · 2026-04-08

citing papers explorer

Showing 20 of 20 citing papers.

Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow · cs.LG · 2022-09-07 · unverdicted · none · ref 8
An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion · cs.CV · 2022-08-02 · unverdicted · none · ref 5
Spectral Guidance for Flexible and Efficient Control of Diffusion Models · cs.LG · 2026-05-27 · unverdicted · none · ref 2
Latent Fourier Transform · cs.SD · 2026-04-20 · unverdicted · none · ref 7
Conflated Inverse Modeling to Generate Diverse and Temperature-Change Inducing Urban Vegetation Patterns · cs.CV · 2026-04-14 · unverdicted · none · ref 10
LPNSR: Optimal Noise-Guided Diffusion Image Super-Resolution Via Learnable Noise Prediction · cs.CV · 2026-03-22 · conditional · none · ref 35
LooseRoPE: Content-aware Attention Manipulation for Semantic Harmonization · cs.GR · 2026-01-08 · unverdicted · none · ref 6
SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations · cs.CV · 2021-08-02 · conditional · none · ref 2
Conditional Diffusion Under Linear Constraints: Langevin Mixing and Information-Theoretic Guarantees · cs.LG · 2026-05-06 · unverdicted · none · ref 1
Cross-Modal Generation: From Commodity WiFi to High-Fidelity mmWave and RFID Sensing · cs.LG · 2026-04-17 · unverdicted · none · ref 11
StructDiff: A Structure-Preserving and Spatially Controllable Diffusion Model for Single-Image Generation · cs.CV · 2026-04-14 · unverdicted · none · ref 49
Analyzing and Guiding Zero-Shot Posterior Sampling in Diffusion Models · cs.LG · 2026-02-07 · unverdicted · none · ref 4
Regional climate risk assessment from climate models using probabilistic machine learning · cs.LG · 2024-12-11 · unverdicted · none · ref 63
DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory · cs.CV · 2023-08-16 · unverdicted · none · ref 151
TopoStyle: Supporting Iterative Design with Generative AI for 2.5D Topology Optimization · cs.HC · 2026-04-23 · unverdicted · none · ref 14
SocialMirror: Reconstructing 3D Human Interaction Behaviors from Monocular Videos with Semantic and Geometric Guidance · cs.CV · 2026-04-15 · unverdicted · none · ref 7
Dual Ascent Diffusion for Inverse Problems · cs.CV · 2025-05-23 · unverdicted · none · ref 6
SOWing Information: Cultivating Contextual Coherence with MLLMs in Image Generation · cs.CV · 2024-11-28 · unverdicted · none · ref 44
Breaking the Lock-in: Diversifying Text-to-Image Generation via Representation Modulation · cs.CV · 2026-06-05 · unverdicted · none · ref 10
MENO: MeanFlow-Enhanced Neural Operators for Dynamical Systems · cs.LG · 2026-04-08 · unreviewed · ref 2
