{"paper":{"title":"TIE: Time Interval Encoding for Video Generation over Events","license":"http://creativecommons.org/licenses/by-nc-sa/4.0/","headline":"TIE encodes time as intervals rather than points inside diffusion transformers, allowing overlapping events to be represented natively in attention.","cross_cats":[],"primary_cat":"cs.CV","authors_text":"Bo Ye, Fan Cheng, Jian Zhao, Qianyu Peng, Ruili Feng, Shangwen Zhu, Xiaofan Li, Xinyu Cui, Yang Cao, Yiming Li, Zheng-Jun Zha, Zhilei Shu, Zihang Liang","submitted_at":"2026-05-11T13:23:14Z","abstract_excerpt":"Director-style prompting, robotic action prediction, and interactive video agents demand temporal grounding over concurrent events -- a regime in which 68% of general clips and over 99% of robotics/gameplay clips contain overlapping events, yet existing multi-event generators rest on a single-active-prompt assumption. However, modern video generators, such as Diffusion Transformers (DiT), represent time as discrete points through point-wise positional encodings. This formulation creates a fundamental dimension mismatch: temporally extended intervals and overlapping events are mathematically un"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"TIE is characterized by two basic principles: Temporal Integrability, which requires an event to aggregate positional evidence over its full duration, and Duration Invariance, which removes the trivial bias toward longer intervals. Under a uniform kernel, this characterization yields an efficient closed-form sinc-based solution that preserves the standard attention interface.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That the uniform kernel plus the two stated principles are sufficient to produce a general, artifact-free interval encoding that works across diverse video domains without post-hoc tuning or loss of visual fidelity.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"TIE derives a sinc-based interval encoding from temporal integrability and duration invariance principles, raising temporal constraint satisfaction from 77% to 96% on the OmniEvents dataset while preserving visual quality.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"TIE encodes time as intervals rather than points inside diffusion transformers, allowing overlapping events to be represented natively in attention.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"c93fcf778260cd109400dc5f3c2efc5c905cc02e72074b34143155865a2c7cfa"},"source":{"id":"2605.10543","kind":"arxiv","version":2},"verdict":{"id":"9d9d6084-e39f-4cd7-bcfd-a6cd2c149c21","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-12T04:22:06.763369Z","strongest_claim":"TIE is characterized by two basic principles: Temporal Integrability, which requires an event to aggregate positional evidence over its full duration, and Duration Invariance, which removes the trivial bias toward longer intervals. Under a uniform kernel, this characterization yields an efficient closed-form sinc-based solution that preserves the standard attention interface.","one_line_summary":"TIE derives a sinc-based interval encoding from temporal integrability and duration invariance principles, raising temporal constraint satisfaction from 77% to 96% on the OmniEvents dataset while preserving visual quality.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That the uniform kernel plus the two stated principles are sufficient to produce a general, artifact-free interval encoding that works across diverse video domains without post-hoc tuning or loss of visual fidelity.","pith_extraction_headline":"TIE encodes time as intervals rather than points inside diffusion transformers, allowing overlapping events to be represented natively in attention."},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2605.10543/integrity.json","findings":[],"available":true,"detectors_run":[{"name":"claim_evidence","ran_at":"2026-05-20T05:42:00.965604Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"ai_meta_artifact","ran_at":"2026-05-19T14:41:49.786984Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"doi_title_agreement","ran_at":"2026-05-19T11:01:17.805441Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"doi_compliance","ran_at":"2026-05-19T09:12:00.483992Z","status":"completed","version":"1.0.0","findings_count":0}],"snapshot_sha256":"00009e8e70502bbf985401820299746df5c5f257e642d1b48495a4ea9e3cad23"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":2,"snapshot_sha256":"53b3a569e4629df5ca1b1d0c4e78fcb73c66e0bcceae0141b3a879ec769f9ada"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}