DiffuEraser: A diffusion model for video inpainting

2025 · arXiv 2501.10018

8 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

PROVE: A Perceptual RemOVal cohErence Benchmark for Visual Media

cs.CV · 2026-05-14 · unverdicted · novelty 7.0

PROVE proposes RC metrics for perceptual removal coherence and releases PROVE-Bench to better align automatic scores with human judgments on object removal tasks.

Sparkle: Realizing Lively Instruction-Guided Video Background Replacement via Decoupled Guidance

cs.CV · 2026-05-07 · unverdicted · novelty 7.0

Sparkle supplies a large-scale dataset and benchmark for instruction-driven video background replacement, enabling models that generate more natural and temporally consistent new scenes than earlier approaches.

YOSE: You Only Select Essential Tokens for Efficient DiT-based Video Object Removal

cs.CV · 2026-04-30 · unverdicted · novelty 7.0

YOSE accelerates DiT video object removal up to 2.5x by using BVI for adaptive token selection and DiffSim to simulate unmasked token effects, while preserving visual quality.

Physics-Aware Video Instance Removal Benchmark

cs.CV · 2026-04-07 · unverdicted · novelty 7.0

The PVIR benchmark tests video object removal on physical consistency using 95 annotated videos and shows that existing methods struggle with complex interactions like lingering shadows.

Tube-Structured Incremental Semantic HARQ for Generative Video Receivers

eess.IV · 2026-05-11 · unverdicted · novelty 6.0

Tube-structured incremental semantic HARQ reduces time-weighted recovery cost and enables earlier stabilization in generative video reconstruction compared to block-based methods under matched budgets and channel conditions.

When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models

cs.CV · 2026-04-09 · unverdicted · novelty 6.0

NUMINA improves counting accuracy in text-to-video diffusion models by up to 7.4% via a training-free identify-then-guide framework on the new CountBench dataset.

GA-GS: Generation-Assisted Gaussian Splatting for Static Scene Reconstruction

cs.CV · 2026-04-06 · unverdicted · novelty 6.0

GA-GS uses motion segmentation, diffusion-based inpainting for pseudo-ground-truth, and per-Gaussian authenticity scalars to achieve SOTA static scene reconstruction from videos with dynamic occlusions.

CLEAR: Context-Aware Learning with End-to-End Mask-Free Inference for Adaptive Video Subtitle Removal

cs.CV · 2026-03-23 · unverdicted · novelty 6.0

CLEAR achieves end-to-end mask-free video subtitle removal via dual-encoder self-supervised orthogonality and LoRA-based generation feedback, delivering +6.77 dB PSNR gains and strong zero-shot multilingual performance.

citing papers explorer

PROVE: A Perceptual RemOVal cohErence Benchmark for Visual Media · cs.CV · 2026-05-14 · unverdicted · none · ref 18
Sparkle: Realizing Lively Instruction-Guided Video Background Replacement via Decoupled Guidance · cs.CV · 2026-05-07 · unverdicted · none · ref 12
YOSE: You Only Select Essential Tokens for Efficient DiT-based Video Object Removal · cs.CV · 2026-04-30 · unverdicted · none · ref 10
Physics-Aware Video Instance Removal Benchmark · cs.CV · 2026-04-07 · unverdicted · none · ref 12
Tube-Structured Incremental Semantic HARQ for Generative Video Receivers · eess.IV · 2026-05-11 · unverdicted · none · ref 7
When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models · cs.CV · 2026-04-09 · unverdicted · none · ref 35
GA-GS: Generation-Assisted Gaussian Splatting for Static Scene Reconstruction · cs.CV · 2026-04-06 · unverdicted · none · ref 32
CLEAR: Context-Aware Learning with End-to-End Mask-Free Inference for Adaptive Video Subtitle Removal · cs.CV · 2026-03-23 · unverdicted · none · ref 3
