AnimateLCM: Computation-Efficient Personalized Style Video Generation without Personalized Video Data
read the original abstract
This paper introduces an effective method for computation-efficient personalized style video generation without requiring access to any personalized video data. It reduces the necessary generation time of similarly sized video diffusion models from 25 seconds to around 1 second while maintaining the same level of performance. The method's effectiveness lies in its dual-level decoupling learning approach: 1) separating the learning of video style from video generation acceleration, which allows for personalized style video generation without any personalized style video data, and 2) separating the acceleration of image generation from the acceleration of video motion generation, enhancing training efficiency and mitigating the negative effects of low-quality video data.
This paper has not been read by Pith yet.
Forward citations
Cited by 3 Pith papers
-
From Competition to Coopetition: Coopetitive Training-Free Image Editing Based on Text Guidance
CoEdit is a zero-shot coopetitive framework for text-guided image editing that uses dual-entropy attention manipulation and entropic latent refinement to improve editing harmony and structural preservation.
-
FIS-DiT: Breaking the Few-Step Video Inference Barrier via Training-Free Frame Interleaved Sparsity
FIS-DiT achieves 2.11-2.41x speedup on video DiT models in few-step regimes with negligible quality loss by exploiting frame-wise sparsity and consistency through a training-free interleaved execution strategy.
-
DOLLAR: Few-Step Video Generation via Distillation and Latent Reward Optimization
DOLLAR combines variational score and consistency distillation for few-step video generation plus latent reward optimization, reporting 82.57 VBench score and up to 278x speedup over the teacher diffusion model for 12...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.