Rectified flow learns straight-path neural ODEs for distribution transport, yielding efficient generative models and domain transfers that work well even with a single simulation step.
hub
Ilvr: Conditioning method for denoising diffusion probabilistic models
18 Pith papers cite this work. Polarity classification is still indexing.
hub tools
citation-role summary
citation-polarity summary
roles
background 1polarities
background 1representative citing papers
Textual Inversion learns a single embedding vector from a few images to represent personal concepts inside the text embedding space of a frozen text-to-image model, enabling their composition in natural language prompts.
LatentFT uses latent-space Fourier transforms and frequency masking in diffusion autoencoders to enable timescale-specific manipulation of musical structure in generative models.
A diffusion generative inverse model conditioned on temperature targets produces diverse, physically plausible urban vegetation patterns that achieve specified regional temperature shifts.
LPNSR derives optimal intermediate noise for diffusion SR via MLE and implements it with an LR-guided noise predictor, reaching SOTA perceptual quality in 4 steps without text priors.
LooseRoPE modulates RoPE in diffusion attention maps to continuously trade off between preserving a pasted object's identity and harmonizing it with its new surroundings.
SDEdit performs guided image synthesis and editing by adding noise to inputs and refining them via denoising with a diffusion model's SDE prior, outperforming GAN methods in human studies without task-specific training.
Error in approximating the tangent conditional score by the unconditional score in diffusion models is bounded by dimension-free conditional mutual information, with a projected-Langevin method outperforming baselines in inpainting and super-resolution.
RF-CMG synthesizes high-quality mmWave and RFID signals from WiFi using a diffusion model with Modality-Guided Embedding for high-frequency details and Low-Frequency Modality Consistency to preserve physical structure.
StructDiff adds adaptive receptive fields and 3D positional encoding to a single-scale diffusion model to preserve structure and enable spatial control in single-image generation.
Under a Gaussian prior assumption, zero-shot diffusion posterior samplers for inverse problems admit closed-form spectral representations that enable a new parameter-selection framework balancing perceptual quality and signal fidelity.
GenFocal uses probabilistic ML to downscale coarse climate projections to fine-scale weather events without paired training data and samples rare high-impact events more accurately than prior methods.
DragNUWA integrates text, image, and trajectory controls into a diffusion video model using a Trajectory Sampler, Multiscale Fusion, and Adaptive Training to enable fine-grained open-domain video generation.
TopoStyle provides an interactive system using 2D diffusion models for 2.5D topology optimization that supports hand-drawn and point-based edits plus masking to enable iterative customization balancing performance and aesthetics.
SocialMirror reconstructs 3D meshes of closely interacting humans from monocular videos using semantic guidance from vision-language models and geometric constraints in a diffusion model to handle occlusions and maintain temporal and spatial consistency.
A dual ascent optimization framework is introduced for MAP estimation with diffusion priors, claimed to outperform prior methods on image restoration in quality, noise robustness, speed, and data fidelity.
SOW uses MLLMs and attention to selectively control unidirectional diffusion for pixel-level fidelity and contextual coherence in text-vision-to-image tasks.
citing papers explorer
-
Dual Ascent Diffusion for Inverse Problems
A dual ascent optimization framework is introduced for MAP estimation with diffusion priors, claimed to outperform prior methods on image restoration in quality, noise robustness, speed, and data fidelity.