DIRECT uses a three-level multi-agent framework to solve video mashup creation as a multimodal coherency problem, outperforming baselines on a new benchmark.
Prompt-driven agentic video editing system: Autonomous comprehension of long-form, story-driven media
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.CV 2years
2026 2verdicts
UNVERDICTED 2representative citing papers
BEAT introduces MuVA encoder and Bar-DP algorithm for rhythm-elastic music-guided trailer generation, claiming SOTA results on the new TrailerArena benchmark.
citing papers explorer
-
DIRECT: Video Mashup Creation via Hierarchical Multi-Agent Planning and Intent-Guided Editing
DIRECT uses a three-level multi-agent framework to solve video mashup creation as a multimodal coherency problem, outperforming baselines on a new benchmark.
-
BEAT: Rhythm-Elastic Alignment for Agentic Music-guided Movie Trailer Generation
BEAT introduces MuVA encoder and Bar-DP algorithm for rhythm-elastic music-guided trailer generation, claiming SOTA results on the new TrailerArena benchmark.