arXiv preprint arXiv:2306.02018

Wang, X · 2023 · arXiv 2306.02018

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

iTryOn: Mastering Interactive Video Virtual Try-On with Spatial-Semantic Guidance

cs.CV · 2026-05-20 · unverdicted · novelty 7.0

iTryOn is a diffusion-based framework that adds spatial 3D hand guidance and semantic action-aware embeddings to handle complex garment deformations during human-clothing interactions in videos.

CameraNoise: Enabling Faithful Camera Control in Video Diffusion through Geometry-Flow-Guided Noise Warping

cs.CV · 2026-05-29 · unverdicted · novelty 6.0

CameraNoise embeds camera motion into the noise space of video diffusion via Geometry-guided Reprojection Flow and noise warping to achieve faithful trajectory control while preserving the diffusion prior.

VideoCrafter1: Open Diffusion Models for High-Quality Video Generation

cs.CV · 2023-10-30 · unverdicted · novelty 6.0

Open-source text-to-video and image-to-video diffusion models generate high-quality 1024x576 videos, with the I2V variant claimed as the first to strictly preserve reference image content.

InfiniVerse: Occupancy Guided Unbounded Scene Generation for Autonomous Driving

cs.CV · 2026-06-30 · unverdicted · novelty 5.0

InfiniVerse reconstructs 3D occupancy from one frame, extends scenes autoregressively, converts to video via diffusion, and uses re-projection feedback to achieve SOTA FID 6.4 and FVD 67.97 on Waymo and nuScenes.

ModelScope Text-to-Video Technical Report

cs.CV · 2023-08-12 · unverdicted · novelty 4.0

ModelScopeT2V is a 1.7-billion-parameter text-to-video model built on Stable Diffusion that adds temporal modeling and outperforms prior methods on three evaluation metrics.

citing papers explorer

Showing 1 of 1 citing paper after filters.

ModelScope Text-to-Video Technical Report

cs.CV · 2023-08-12 · unverdicted · none · ref 59
