hub Canonical reference

SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations

Chenlin Meng, Yutong He, Yang Song, Jiaming Song, Jiajun Wu, Jun-Yan Zhu · 2021 · cs.CV · arXiv 2108.01073

Canonical reference. 82% of citing Pith papers cite this work as background.

61 Pith papers citing it

Background 82% of classified citations

open full Pith review browse 61 citing papers arXiv PDF

abstract

Guided image synthesis enables everyday users to create and edit photo-realistic images with minimum effort. The key challenge is balancing faithfulness to the user input (e.g., hand-drawn colored strokes) and realism of the synthesized image. Existing GAN-based methods attempt to achieve such balance using either conditional GANs or GAN inversions, which are challenging and often require additional training data or loss functions for individual applications. To address these issues, we introduce a new image synthesis and editing method, Stochastic Differential Editing (SDEdit), based on a diffusion model generative prior, which synthesizes realistic images by iteratively denoising through a stochastic differential equation (SDE). Given an input image with user guide of any type, SDEdit first adds noise to the input, then subsequently denoises the resulting image through the SDE prior to increase its realism. SDEdit does not require task-specific training or inversions and can naturally achieve the balance between realism and faithfulness. SDEdit significantly outperforms state-of-the-art GAN-based methods by up to 98.09% on realism and 91.72% on overall satisfaction scores, according to a human perception study, on multiple tasks, including stroke-based image synthesis and editing as well as image compositing.

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 9 baseline 1 method 1

citation-polarity summary

background 9 baseline 1 use method 1

representative citing papers

Consistency Models

cs.LG · 2023-03-02 · conditional · novelty 8.0

Consistency models achieve fast one-step generation with SOTA FID of 3.55 on CIFAR-10 and 6.20 on ImageNet 64x64 by directly mapping noise to data, outperforming prior distillation techniques.

Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow

cs.LG · 2022-09-07 · unverdicted · novelty 8.0

Rectified flow learns straight-path neural ODEs for distribution transport, yielding efficient generative models and domain transfers that work well even with a single simulation step.

Chameleon: Style-Content Disentangled Framework for Cross-Domain Object Compositing

cs.CV · 2026-05-31 · unverdicted · novelty 7.0

Chameleon proposes the first large-scale cross-domain compositing dataset and a disentangled encoder plus gated diffusion transformer that outperforms prior in-domain and cross-domain methods on plausibility and fidelity.

StableHand: Quality-Aware Flow Matching for World-Space Dual-Hand Motion Estimation from Egocentric Video

cs.CV · 2026-05-18 · unverdicted · novelty 7.0

StableHand introduces a quality-aware flow matching framework conditioned on predicted four-channel per-frame hand observation quality to estimate dual-hand world-space motion from egocentric video, achieving SOTA results with 20-25% W-MPJPE reduction on HOT3D and ARCTIC benchmarks.

Functionalization via Structure Completion and Motion Rectification

cs.CV · 2026-05-18 · unverdicted · novelty 7.0

Object functionalization is cast as neural graph completion over a functional graph of parts, contacts, and motions, followed by geometry realization that also rectifies erroneous motions, demonstrated on furniture with a new paired dataset.

From Plans to Pixels: Learning to Plan and Orchestrate for Open-Ended Image Editing

cs.CV · 2026-05-14 · unverdicted · novelty 7.0

A planner-orchestrator system learns long-horizon image editing by maximizing outcome-based rewards from a vision-language judge and refining plans from successful trajectories.

Amortized Guidance for Image Inpainting with Pretrained Diffusion Models

cs.CV · 2026-05-13 · unverdicted · novelty 7.0

AID amortizes guidance for diffusion inpainting by training a reusable module via an auxiliary Gaussian formulation and continuous-time actor-critic algorithm, improving quality-speed trade-off with under 1% overhead.

RevealLayer: Disentangling Hidden and Visible Layers via Occlusion-Aware Image Decomposition

cs.CV · 2026-05-12 · unverdicted · novelty 7.0

RevealLayer decomposes natural images into multiple RGBA layers using diffusion models with region-aware attention, occlusion-guided adaptation, and a composite loss, outperforming prior methods on a new benchmark dataset.

Geometrically Constrained Stenosis Editing in Coronary Angiography via Entropic Optimal Transport

cs.CV · 2026-05-09 · unverdicted · novelty 7.0 · 2 refs

OT-Bridge Editor uses geometrically constrained entropic optimal transport to synthesize CAG images with precise stenosis, improving downstream detection by 27.8% on ARCADE and 23.0% on a multi-center dataset.

A Call to Lagrangian Action: Learning Population Mechanics from Temporal Snapshots

cs.LG · 2026-05-08 · unverdicted · novelty 7.0 · 3 refs

Wasserstein Lagrangian Mechanics formalizes second-order dynamics in Wasserstein space and provides an algorithm to learn them from observed marginals without specifying the Lagrangian, outperforming gradient flows on various dynamics.

ResetEdit: Precise Text-guided Editing of Generated Image via Resettable Starting Latent

cs.CV · 2026-04-28 · unverdicted · novelty 7.0

ResetEdit embeds a recoverable discrepancy signal during image generation in diffusion models to reconstruct an approximate original latent for high-fidelity text-guided editing.

GeoEdit: Local Frames for Fast, Training-Free On-Manifold Editing in Diffusion Models

cs.LG · 2026-04-27 · unverdicted · novelty 7.0

GeoEdit constructs local tangent frames from small perturbations to initial noise, enabling Jacobian-free on-manifold edits in diffusion models via alternating tangent steps and diffusion projections.

StyleID: A Perception-Aware Dataset and Metric for Stylization-Agnostic Facial Identity Recognition

cs.GR · 2026-04-23 · unverdicted · novelty 7.0

StyleID supplies human-perception-aligned benchmarks and fine-tuned encoders that improve facial identity recognition robustness across stylization types and strengths.

Latent Fourier Transform

cs.SD · 2026-04-20 · unverdicted · novelty 7.0

LatentFT uses latent-space Fourier transforms and frequency masking in diffusion autoencoders to enable timescale-specific manipulation of musical structure in generative models.

UniEditBench: A Unified and Cost-Effective Benchmark for Image and Video Editing via Distilled MLLMs

cs.CV · 2026-04-17 · unverdicted · novelty 7.0

UniEditBench unifies image and video editing evaluation with a nine-plus-eight operation taxonomy and cost-effective 4B/8B distilled MLLM evaluators that align with human judgments.

Your Pre-trained Diffusion Model Secretly Knows Restoration

cs.CV · 2026-04-06 · unverdicted · novelty 7.0

Pre-trained diffusion models inherently support image restoration that can be unlocked by optimizing prompt embeddings at the text encoder output using a diffusion bridge formulation, achieving competitive results on models like WAN and FLUX without fine-tuning.

In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer

cs.CV · 2025-04-29 · unverdicted · novelty 7.0

ICEdit achieves state-of-the-art instructional image editing in Diffusion Transformers via in-context generation, requiring only 0.1% of prior training data and 1% trainable parameters.

VIPaint: Image Inpainting with Pre-Trained Diffusion Models via Variational Inference

cs.CV · 2024-11-28 · unverdicted · novelty 7.0

VIPaint uses hierarchical variational inference to optimize a non-Gaussian Markov approximation of the diffusion posterior, enabling better inpainting and inverse problems with pre-trained and latent diffusion models.

DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation

cs.CV · 2023-09-28 · unverdicted · novelty 7.0

DreamGaussian creates high-quality textured 3D meshes from single-view images in 2 minutes via generative Gaussian Splatting with mesh extraction and UV refinement.

High-Resolution Image Synthesis with Latent Diffusion Models

cs.CV · 2021-12-20 · conditional · novelty 7.0

Latent diffusion models achieve state-of-the-art inpainting and competitive results on unconditional generation, scene synthesis, and super-resolution by performing the diffusion process in the latent space of pretrained autoencoders with cross-attention conditioning, while cutting computational and

GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models

cs.CV · 2021-12-20 · accept · novelty 7.0

A 3.5-billion-parameter diffusion model with classifier-free guidance generates images preferred over DALL-E by human raters and can be fine-tuned for text-guided inpainting.

Scaling Parallel Sequence Models to Foundation-Scale Vision Encoders

cs.CV · 2026-05-30 · unverdicted · novelty 6.0

C-GSPN scales 2D spatial propagation to foundation vision encoders via a fast CUDA kernel, compressed blocks, and two-stage distillation, matching ViT performance with 15% fewer parameters and 4x block speedup at 2K resolution.

Variance Reduction for Expectations with Diffusion Teachers

cs.LG · 2026-05-20 · unverdicted · novelty 6.0 · 2 refs

CARV amortizes upstream diffusion teacher costs over noise resamples with timestep importance sampling and stratified-inverse-CDF sampling, delivering 2-3x effective compute gains in text-to-3D experiments and order-of-magnitude variance cuts in single-step distillation.

Are Watermarked Images Editable? SafeMark for Watermark-Preserving Text-Guided Image Editing

cs.CV · 2026-05-19 · unverdicted · novelty 6.0

SafeMark integrates a thresholded watermark-decoding loss into diffusion editors to enable text-guided edits that preserve embedded watermarks with high bit accuracy.

citing papers explorer

Showing 50 of 61 citing papers.

Consistency Models cs.LG · 2023-03-02 · conditional · none · ref 40 · internal anchor
Consistency models achieve fast one-step generation with SOTA FID of 3.55 on CIFAR-10 and 6.20 on ImageNet 64x64 by directly mapping noise to data, outperforming prior distillation techniques.
Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow cs.LG · 2022-09-07 · unverdicted · none · ref 50 · internal anchor
Rectified flow learns straight-path neural ODEs for distribution transport, yielding efficient generative models and domain transfers that work well even with a single simulation step.
Chameleon: Style-Content Disentangled Framework for Cross-Domain Object Compositing cs.CV · 2026-05-31 · unverdicted · none · ref 21 · internal anchor
Chameleon proposes the first large-scale cross-domain compositing dataset and a disentangled encoder plus gated diffusion transformer that outperforms prior in-domain and cross-domain methods on plausibility and fidelity.
StableHand: Quality-Aware Flow Matching for World-Space Dual-Hand Motion Estimation from Egocentric Video cs.CV · 2026-05-18 · unverdicted · none · ref 26 · internal anchor
StableHand introduces a quality-aware flow matching framework conditioned on predicted four-channel per-frame hand observation quality to estimate dual-hand world-space motion from egocentric video, achieving SOTA results with 20-25% W-MPJPE reduction on HOT3D and ARCTIC benchmarks.
Functionalization via Structure Completion and Motion Rectification cs.CV · 2026-05-18 · unverdicted · none · ref 239 · internal anchor
Object functionalization is cast as neural graph completion over a functional graph of parts, contacts, and motions, followed by geometry realization that also rectifies erroneous motions, demonstrated on furniture with a new paired dataset.
From Plans to Pixels: Learning to Plan and Orchestrate for Open-Ended Image Editing cs.CV · 2026-05-14 · unverdicted · none · ref 28 · internal anchor
A planner-orchestrator system learns long-horizon image editing by maximizing outcome-based rewards from a vision-language judge and refining plans from successful trajectories.
Amortized Guidance for Image Inpainting with Pretrained Diffusion Models cs.CV · 2026-05-13 · unverdicted · none · ref 28 · internal anchor
AID amortizes guidance for diffusion inpainting by training a reusable module via an auxiliary Gaussian formulation and continuous-time actor-critic algorithm, improving quality-speed trade-off with under 1% overhead.
RevealLayer: Disentangling Hidden and Visible Layers via Occlusion-Aware Image Decomposition cs.CV · 2026-05-12 · unverdicted · none · ref 53 · internal anchor
RevealLayer decomposes natural images into multiple RGBA layers using diffusion models with region-aware attention, occlusion-guided adaptation, and a composite loss, outperforming prior methods on a new benchmark dataset.
Geometrically Constrained Stenosis Editing in Coronary Angiography via Entropic Optimal Transport cs.CV · 2026-05-09 · unverdicted · none · ref 8 · 2 links · internal anchor
OT-Bridge Editor uses geometrically constrained entropic optimal transport to synthesize CAG images with precise stenosis, improving downstream detection by 27.8% on ARCADE and 23.0% on a multi-center dataset.
A Call to Lagrangian Action: Learning Population Mechanics from Temporal Snapshots cs.LG · 2026-05-08 · unverdicted · none · ref 207 · 3 links · internal anchor
Wasserstein Lagrangian Mechanics formalizes second-order dynamics in Wasserstein space and provides an algorithm to learn them from observed marginals without specifying the Lagrangian, outperforming gradient flows on various dynamics.
ResetEdit: Precise Text-guided Editing of Generated Image via Resettable Starting Latent cs.CV · 2026-04-28 · unverdicted · none · ref 12 · internal anchor
ResetEdit embeds a recoverable discrepancy signal during image generation in diffusion models to reconstruct an approximate original latent for high-fidelity text-guided editing.
GeoEdit: Local Frames for Fast, Training-Free On-Manifold Editing in Diffusion Models cs.LG · 2026-04-27 · unverdicted · none · ref 17 · internal anchor
GeoEdit constructs local tangent frames from small perturbations to initial noise, enabling Jacobian-free on-manifold edits in diffusion models via alternating tangent steps and diffusion projections.
StyleID: A Perception-Aware Dataset and Metric for Stylization-Agnostic Facial Identity Recognition cs.GR · 2026-04-23 · unverdicted · none · ref 13 · internal anchor
StyleID supplies human-perception-aligned benchmarks and fine-tuned encoders that improve facial identity recognition robustness across stylization types and strengths.
Latent Fourier Transform cs.SD · 2026-04-20 · unverdicted · none · ref 29 · internal anchor
LatentFT uses latent-space Fourier transforms and frequency masking in diffusion autoencoders to enable timescale-specific manipulation of musical structure in generative models.
UniEditBench: A Unified and Cost-Effective Benchmark for Image and Video Editing via Distilled MLLMs cs.CV · 2026-04-17 · unverdicted · none · ref 36 · internal anchor
UniEditBench unifies image and video editing evaluation with a nine-plus-eight operation taxonomy and cost-effective 4B/8B distilled MLLM evaluators that align with human judgments.
Your Pre-trained Diffusion Model Secretly Knows Restoration cs.CV · 2026-04-06 · unverdicted · none · ref 36 · internal anchor
Pre-trained diffusion models inherently support image restoration that can be unlocked by optimizing prompt embeddings at the text encoder output using a diffusion bridge formulation, achieving competitive results on models like WAN and FLUX without fine-tuning.
In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer cs.CV · 2025-04-29 · unverdicted · none · ref 38 · internal anchor
ICEdit achieves state-of-the-art instructional image editing in Diffusion Transformers via in-context generation, requiring only 0.1% of prior training data and 1% trainable parameters.
VIPaint: Image Inpainting with Pre-Trained Diffusion Models via Variational Inference cs.CV · 2024-11-28 · unverdicted · none · ref 17 · internal anchor
VIPaint uses hierarchical variational inference to optimize a non-Gaussian Markov approximation of the diffusion posterior, enabling better inpainting and inverse problems with pre-trained and latent diffusion models.
DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation cs.CV · 2023-09-28 · unverdicted · none · ref 124 · internal anchor
DreamGaussian creates high-quality textured 3D meshes from single-view images in 2 minutes via generative Gaussian Splatting with mesh extraction and UV refinement.
High-Resolution Image Synthesis with Latent Diffusion Models cs.CV · 2021-12-20 · conditional · none · ref 53 · internal anchor
Latent diffusion models achieve state-of-the-art inpainting and competitive results on unconditional generation, scene synthesis, and super-resolution by performing the diffusion process in the latent space of pretrained autoencoders with cross-attention conditioning, while cutting computational and
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models cs.CV · 2021-12-20 · accept · none · ref 15 · internal anchor
A 3.5-billion-parameter diffusion model with classifier-free guidance generates images preferred over DALL-E by human raters and can be fine-tuned for text-guided inpainting.
Scaling Parallel Sequence Models to Foundation-Scale Vision Encoders cs.CV · 2026-05-30 · unverdicted · none · ref 139 · internal anchor
C-GSPN scales 2D spatial propagation to foundation vision encoders via a fast CUDA kernel, compressed blocks, and two-stage distillation, matching ViT performance with 15% fewer parameters and 4x block speedup at 2K resolution.
Variance Reduction for Expectations with Diffusion Teachers cs.LG · 2026-05-20 · unverdicted · none · ref 47 · 2 links · internal anchor
CARV amortizes upstream diffusion teacher costs over noise resamples with timestep importance sampling and stratified-inverse-CDF sampling, delivering 2-3x effective compute gains in text-to-3D experiments and order-of-magnitude variance cuts in single-step distillation.
Are Watermarked Images Editable? SafeMark for Watermark-Preserving Text-Guided Image Editing cs.CV · 2026-05-19 · unverdicted · none · ref 16 · internal anchor
SafeMark integrates a thresholded watermark-decoding loss into diffusion editors to enable text-guided edits that preserve embedded watermarks with high bit accuracy.
MIRAGE: Robust multi-modal architectures translate fMRI-to-image models from vision to mental imagery q-bio.NC · 2026-05-16 · unverdicted · none · ref 73 · internal anchor
MIRAGE achieves state-of-the-art mental image reconstruction from fMRI on the NSD-Imagery benchmark by using a linear backbone with multi-modal text and image features fed to a diffusion model.
Contrastive-SDXL: Annotation-Preserving Night-Time Augmentation for Pedestrian Detection cs.CV · 2026-05-13 · unverdicted · none · ref 2 · internal anchor
Contrastive-SDXL augments daytime images into realistic night-time versions using SDXL-Turbo with LoRA and multi-level DINOv2 contrastive losses, yielding 6-7% lower miss rate on pedestrian detection versus daytime-only training.
Score-Based Generative Modeling through Anisotropic Stochastic Partial Differential Equations cs.CE · 2026-05-09 · unverdicted · none · ref 19 · internal anchor
Anisotropic SPDEs preserve geometric data structure over longer timescales in score-based generative modeling, yielding better image quality than standard SDE baselines and flow matching in unconditional and conditional tasks.
Conservative Flows: A New Paradigm of Generative Models cs.LG · 2026-05-07 · unverdicted · none · ref 49 · internal anchor
Conservative flows generate by running probability-preserving stochastic dynamics initialized at data points rather than noise, using corrected Langevin or predictor-corrector mechanisms on top of any pretrained flow model and showing gains on Swiss-roll, ImageNet-256 and Oxford Flowers-102.
Physical Fidelity Reconstruction via Improved Consistency-Distilled Flow Matching for Dynamical Systems cs.LG · 2026-05-07 · unverdicted · none · ref 22 · internal anchor
Distilled one-step consistency model from optimal-transport flow-matching teacher reconstructs high-fidelity dynamical system flows from low-fidelity data with 12x speedup, half the parameters, and 23.1% better SSIM than scratch-trained baselines.
MooD: Perception-Enhanced Efficient Affective Image Editing via Continuous Valence-Arousal Modeling cs.CV · 2026-05-04 · unverdicted · none · ref 21 · 2 links · internal anchor
MooD introduces continuous valence-arousal modeling with VA-aware retrieval and perception-enhanced guidance for efficient, controllable affective image editing, plus a new AffectSet dataset.
REVIVE 3D: Refinement via Encoded Voluminous Inflated prior for Volume Enhancement cs.CV · 2026-04-30 · unverdicted · none · ref 30 · internal anchor
REVIVE 3D generates voluminous 3D assets from flat 2D images via an inflated prior construction followed by latent-space refinement, plus new metrics for volume and flatness validated by user study.
FluSplat: Sparse-View 3D Editing without Test-Time Optimization cs.CV · 2026-04-21 · unverdicted · none · ref 30 · internal anchor
FluSplat trains a model with geometric alignment constraints on multi-view edits to produce consistent 3D scene edits from sparse views in a single forward pass without test-time optimization.
MPDiT: Multi-Patch Global-to-Local Transformer Architecture For Efficient Flow Matching and Diffusion Model cs.CV · 2026-03-27 · unverdicted · none · ref 41 · internal anchor
MPDiT uses a hierarchical multi-patch design in transformers to lower computation in diffusion models by handling coarse global features first then fine local details, plus faster-converging embeddings.
Uniworld-V2: Reinforce Image Editing with Diffusion Negative-aware Finetuning and MLLM Implicit Feedback cs.CV · 2025-10-19 · unverdicted · none · ref 14 · internal anchor
UniWorld-V2 applies policy optimization via DiffusionNFT and MLLM logit feedback with group filtering to reach state-of-the-art scores of 4.49 on ImgEdit and 7.83 on GEdit-Bench while remaining model-agnostic.
Training-Free Reward-Guided Image Editing via Trajectory Optimal Control cs.CV · 2025-09-30 · unverdicted · none · ref 10 · internal anchor
A trajectory optimal control framework for reward-guided image editing in diffusion models that balances reward maximization with source fidelity better than prior inversion-based baselines.
EditVerse: Unifying Image and Video Editing and Generation with In-Context Learning cs.CV · 2025-09-24 · unverdicted · none · ref 19 · internal anchor
EditVerse unifies image and video editing and generation in one transformer model via unified token sequences and in-context learning, trained jointly on curated video editing data plus image/video corpora and evaluated on a new instruction-based benchmark.
LeakyCLIP: Extracting Training Data from CLIP cs.CR · 2025-08-01 · conditional · none · ref 30 · internal anchor
LeakyCLIP reconstructs images from CLIP embeddings with over 258% SSIM gain versus baselines and enables membership inference from reconstruction metrics on LAION-2B data.
NullFace: Training-Free Localized Face Anonymization cs.CV · 2025-03-11 · unverdicted · none · ref 51 · internal anchor
NullFace performs training-free localized face anonymization by inverting images to noise and denoising with modified identity embeddings from a pre-trained diffusion model.
Semantics Disentanglement and Composition for Universal Image Coding with Efficiently LLM Reasoning and Generative Diffusion cs.CV · 2024-12-24 · unverdicted · none · ref 57 · internal anchor
UniCodec uses LLM-driven semantic disentanglement at the encoder and diffusion-based compositional generation at the decoder to enable one codec for both human perception and machine vision tasks without task-specific retraining.
Autoregressive Video Generation without Vector Quantization cs.CV · 2024-12-18 · unverdicted · none · ref 17 · internal anchor
NOVA reformulates video generation as non-quantized autoregressive frame-by-frame temporal prediction combined with set-by-set spatial prediction, outperforming prior AR video models and some diffusion models in efficiency and quality.
VideoPoet: A Large Language Model for Zero-Shot Video Generation cs.CV · 2023-12-21 · unverdicted · none · ref 24 · internal anchor
VideoPoet is a large language model that performs zero-shot video generation with audio from diverse multimodal conditioning signals.
IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models cs.CV · 2023-08-13 · unverdicted · none · ref 50 · internal anchor
IP-Adapter adds effective image prompting to text-to-image diffusion models using a lightweight decoupled cross-attention adapter that works alongside text prompts and other controls.
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis cs.CV · 2023-07-04 · conditional · none · ref 28 · internal anchor
SDXL improves upon prior Stable Diffusion versions through a larger UNet backbone, dual text encoders, novel conditioning, and a refinement model, producing higher-fidelity images competitive with black-box state-of-the-art generators.
MagicVideo: Efficient Video Generation With Latent Diffusion Models cs.CV · 2022-11-20 · unverdicted · none · ref 24 · internal anchor
MagicVideo generates 256x256 text-conditioned video clips via latent diffusion with a custom 3D U-Net, achieving roughly 64 times lower compute than prior video diffusion models.
One-Step Distillation of Discrete Diffusion Image Generators via Fixed-Point Iteration cs.CV · 2026-05-20 · unverdicted · none · ref 33 · internal anchor
Fixed-Point Distillation constructs one-step correction targets for discrete diffusion generators via partial corruption and single teacher refinement, lifted into continuous features with a multi-bandwidth drift loss and straight-through estimation.
Stable and Near-Reversible Diffusion ODE Solvers for Image Editing cs.CV · 2026-05-12 · unverdicted · none · ref 43 · internal anchor
Near-reversible Runge-Kutta diffusion ODE solvers with vector-field smoothing improve stability and edit fidelity for large changes in text-guided image editing compared to exactly reversible alternatives.
Structured Diffusion Bridges: Inductive Bias for Denoising Diffusion Bridges cs.LG · 2026-05-03 · unverdicted · none · ref 109 · 2 links · internal anchor
A structured diffusion bridge method achieves near fully-paired modality translation quality using alignment constraints even in unpaired or semi-paired regimes.
Uncertainty-Aware Distribution-to-Distribution Flow Matching for Scientific Imaging cs.LG · 2026-03-23 · unverdicted · none · ref 39 · 2 links · internal anchor
SFM improves generalization under distribution shift for scientific imaging tasks while AVUQ supplies sample-efficient epistemic and aleatoric uncertainty estimates plus anomaly scores.
Diffusion Models are Secretly Zero-Shot 3DGS Harmonizers cs.CV · 2025-03-09 · unverdicted · none · ref 19 · internal anchor
D3DR optimizes inserted 3DGS objects with a DDS-inspired diffusion objective plus a new personalization step to match scene lighting, reporting 2 dB PSNR gain over prior methods.
RectifiedHR: Enable Efficient High-Resolution Synthesis via Energy Rectification cs.CV · 2025-03-04 · unverdicted · none · ref 39 · internal anchor
RectifiedHR is a training-free method that uses noise refresh and latent energy analysis to enable efficient high-resolution synthesis in diffusion models.

SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations

hub tools

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer