Video Analysis and Generation via a Semantic Progress Function

· 2026 · cs.CV · arXiv 2604.22554

1 Pith paper cites this work. Polarity classification is still indexing.



abstract

Transformations produced by image and video generation models often evolve in a highly non-linear manner: long stretches where the content barely changes are followed by sudden, abrupt semantic jumps. To analyze and correct this behavior, we introduce a Semantic Progress Function, a one-dimensional representation that captures how the meaning of a given sequence evolves over time. For each frame, we compute distances between semantic embeddings and fit a smooth curve that reflects the cumulative semantic shift across the sequence. Departures of this curve from a straight line reveal uneven semantic pacing. Building on this insight, we propose a semantic linearization procedure that reparameterizes (or retimes) the sequence so that semantic change unfolds at a constant rate, yielding smoother and more coherent transitions. Beyond linearization, our framework provides a model-agnostic foundation for identifying temporal irregularities, comparing semantic pacing across different generators, and steering both generated and real-world video sequences toward arbitrary target pacing.
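As a rough illustration of the idea described in the abstract, the sketch below builds a progress curve from per-frame embeddings and then retimes the sequence so that semantic change is spread evenly. It is not the paper's implementation: the function names `semantic_progress` and `linearize` are hypothetical, the embeddings are assumed to be precomputed by some encoder, the distance is assumed to be Euclidean between consecutive frames, and a plain moving average stands in for whatever curve fitting the authors use.

```python
import numpy as np

def semantic_progress(embeddings, smooth_window=1):
    """Cumulative semantic shift per frame, normalized to [0, 1].

    Assumption: distances are Euclidean and taken between consecutive frames.
    """
    emb = np.asarray(embeddings, dtype=float)
    # Size of the semantic step between each pair of neighbouring frames.
    steps = np.linalg.norm(np.diff(emb, axis=0), axis=1)
    if smooth_window > 1:
        # Simple moving average as a stand-in for a fitted smooth curve.
        kernel = np.ones(smooth_window) / smooth_window
        steps = np.convolve(steps, kernel, mode="same")
    progress = np.concatenate([[0.0], np.cumsum(steps)])
    total = progress[-1]
    if total == 0:
        # No semantic change at all: fall back to uniform pacing.
        return np.linspace(0.0, 1.0, len(emb))
    return progress / total

def linearize(progress, num_out=None):
    """Fractional source-frame indices at which progress is evenly spaced."""
    n = len(progress)
    num_out = n if num_out is None else num_out
    targets = np.linspace(0.0, 1.0, num_out)
    # Invert the non-decreasing progress curve by linear interpolation.
    return np.interp(targets, progress, np.arange(n))

if __name__ == "__main__":
    # Toy 1-D "embeddings": two small steps followed by one large jump.
    emb = [[0.0], [0.1], [0.2], [1.0]]
    p = semantic_progress(emb)
    print(p)
    print(linearize(p, num_out=3))
```

A straight-line progress curve would give back evenly spaced indices. In the toy input, most of the change sits in the last step, so the retimed indices cluster there. Turning fractional indices into actual frames (nearest frame, blending, or regenerating at the new timestep) is left open here.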

representative citing papers

Token-to-Token Alignment of Text Embeddings for Semantic Blending

cs.CV · 2026-06-22 · unverdicted · novelty 4.0

Token-to-Token alignment rephrases prompts into shared structure then matches token embeddings by semantic similarity, making linear interpolation a meaningful operation for blending in text-to-image models.

citing papers explorer


Token-to-Token Alignment of Text Embeddings for Semantic Blending

cs.CV · 2026-06-22 · unverdicted · none · ref 41 · internal anchor
