Improved denoising diffusion probabilistic models

Alexander Quinn Nichol, Prafulla Dhariwal

13 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Instance-level Visual Active Tracking with Occlusion-Aware Planning

cs.CV · 2026-04-23 · unverdicted · novelty 7.0

OA-VAT improves visual active tracking by combining instance-level prototype discrimination with occlusion-aware diffusion planning, reporting gains over prior SOTA on simulated and real drone benchmarks.

HP-Edit: A Human-Preference Post-Training Framework for Image Editing

cs.CV · 2026-04-21 · unverdicted · novelty 7.0

HP-Edit introduces a post-training framework and RealPref-50K dataset that uses a VLM-based HP-Scorer to align diffusion image editing models with human preferences, improving outputs on Qwen-Image-Edit-2509.

From Synthetic Data to Real Restorations: Diffusion Model for Patient-specific Dental Crown Completion

cs.CV · 2026-03-27 · unverdicted · novelty 7.0

A diffusion model trained on synthetically damaged teeth from public datasets completes crowns with 81.8% IoU and 0.00034 Chamfer distance, and produces real-world restorations with minimal opposing-tooth interference.

MultiAnimate: Pose-Guided Image Animation Made Extensible

cs.CV · 2026-02-25 · unverdicted · novelty 7.0

MultiAnimate adds Identifier Assigner and Identifier Adapter modules to diffusion video models so they can handle multiple characters without identity mix-ups, generalizing from two-character training data to more characters.

PhysiGen: Integrating Collision-Aware Physical Constraints for High-Fidelity Human-Human Interaction Generation

cs.CV · 2026-05-01 · unverdicted · novelty 6.0

PhysiGen reduces interpenetration in text-driven 3D human interaction generation by simplifying meshes to geometric primitives for fast collision detection and guiding optimization with collision regions.

Denoising, Fast and Slow: Difficulty-Aware Adaptive Sampling for Image Generation

cs.CV · 2026-04-21 · unverdicted · novelty 6.0

Patch Forcing enables diffusion models to denoise image patches at varying rates based on predicted difficulty, advancing easier regions first to improve context and achieve better generation quality on ImageNet while scaling to text-to-image tasks.

Bias at the End of the Score

cs.CV · 2026-04-14 · unverdicted · novelty 6.0

Reward models used as quality scorers in text-to-image generation encode demographic biases that cause reward-guided training to sexualize female subjects, reinforce stereotypes, and reduce diversity.

EGLOCE: Training-Free Energy-Guided Latent Optimization for Concept Erasure

cs.CV · 2026-04-10 · unverdicted · novelty 6.0

EGLOCE erases target concepts in diffusion models at inference time by optimizing latents with dual energy guidance that repels unwanted concepts while retaining prompt alignment.

Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation

cs.CV · 2025-12-04 · conditional · novelty 6.0

Reward Forcing combines EMA-Sink tokens and Rewarded Distribution Matching Distillation to deliver state-of-the-art streaming video generation at 23.1 FPS without copying initial frames.

Fast Kernel-Space Diffusion for Remote Sensing Pansharpening

cs.CV · 2025-05-25 · unverdicted · novelty 6.0

KSDiff generates convolutional kernels in kernel space using low-rank core tensor and factor generators with multi-head attention for fast, high-quality pansharpening.

Hard to See, Hard to Label: Generative and Symbolic Acquisition for Subtle Visual Phenomena

cs.CV · 2026-04-24 · unverdicted · novelty 5.0

GSAL combines diffusion-based visual difficulty scoring with hierarchical semantic coverage to improve active learning retrieval of subtle and rare visual anomalies over standard uncertainty and diversity methods.

SocialMirror: Reconstructing 3D Human Interaction Behaviors from Monocular Videos with Semantic and Geometric Guidance

cs.CV · 2026-04-15 · unverdicted · novelty 5.0

SocialMirror reconstructs 3D meshes of closely interacting humans from monocular videos using semantic guidance from vision-language models and geometric constraints in a diffusion model to handle occlusions and maintain temporal and spatial consistency.

Degradation-Aware and Structure-Preserving Diffusion for Real-World Image Super-Resolution

cs.CV · 2026-04-13 · unverdicted · novelty 4.0

Two new lightweight modules for diffusion-based real-world image super-resolution deliver competitive perceptual quality and better structure preservation on DIV2K and RealSR datasets.

citing papers explorer

Showing 13 of 13 citing papers.

Instance-level Visual Active Tracking with Occlusion-Aware Planning
cs.CV · 2026-04-23 · unverdicted · none · ref 31

HP-Edit: A Human-Preference Post-Training Framework for Image Editing
cs.CV · 2026-04-21 · unverdicted · none · ref 33

From Synthetic Data to Real Restorations: Diffusion Model for Patient-specific Dental Crown Completion
cs.CV · 2026-03-27 · unverdicted · none · ref 15

MultiAnimate: Pose-Guided Image Animation Made Extensible
cs.CV · 2026-02-25 · unverdicted · none · ref 20

PhysiGen: Integrating Collision-Aware Physical Constraints for High-Fidelity Human-Human Interaction Generation
cs.CV · 2026-05-01 · unverdicted · none · ref 27

Denoising, Fast and Slow: Difficulty-Aware Adaptive Sampling for Image Generation
cs.CV · 2026-04-21 · unverdicted · none · ref 43

Bias at the End of the Score
cs.CV · 2026-04-14 · unverdicted · none · ref 46

EGLOCE: Training-Free Energy-Guided Latent Optimization for Concept Erasure
cs.CV · 2026-04-10 · unverdicted · none · ref 35

Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation
cs.CV · 2025-12-04 · conditional · none · ref 52

Fast Kernel-Space Diffusion for Remote Sensing Pansharpening
cs.CV · 2025-05-25 · unverdicted · none · ref 36

Hard to See, Hard to Label: Generative and Symbolic Acquisition for Subtle Visual Phenomena
cs.CV · 2026-04-24 · unverdicted · none · ref 18

SocialMirror: Reconstructing 3D Human Interaction Behaviors from Monocular Videos with Semantic and Geometric Guidance
cs.CV · 2026-04-15 · unverdicted · none · ref 38

Degradation-Aware and Structure-Preserving Diffusion for Real-World Image Super-Resolution
cs.CV · 2026-04-13 · unverdicted · none · ref 24
