Video-t1: Test-time scaling for video generation.arXiv preprint arXiv:2503.18942, 2025a

Video-T1: Test-Time Scaling for Video Generation , author= · 2025 · arXiv 2503.18942

5 Pith papers cite this work. Polarity classification is still indexing.

5 Pith papers citing it

read on arXiv browse 5 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Through the PRISM: Preference Representation in Intermediate States of Video Diffusion Models

cs.CV · 2026-06-18 · unverdicted · novelty 7.0

PRISM shows video diffusion models inherently encode preference information in noisy latents, achieving SOTA accuracy and enabling noise-robust early-stage sampling with a correlation to generative performance.

TokenGS: Decoupling 3D Gaussian Prediction from Pixels with Learnable Tokens

cs.CV · 2026-04-16 · unverdicted · novelty 7.0

TokenGS uses learnable Gaussian tokens in an encoder-decoder architecture to regress 3D means directly, achieving SOTA feed-forward reconstruction on static and dynamic scenes with better robustness.

GEOPHYS: The Geometry of Physical Plausibility

cs.CV · 2026-06-15 · unverdicted · novelty 6.0

GEOPHYS defines five geometric properties of per-frame embeddings from image encoders that detect physical implausibility in videos with SOTA accuracy and serve as an efficient verifier.

Enhancing Train-Free Infinite-Frame Generation for Consistent Long Videos

cs.CV · 2026-05-18 · unverdicted · novelty 6.0

MIGA introduces two-stage alignment to close train-inference gaps and dual consistency enhancement via self-reflection and long-range guidance to achieve SOTA temporal consistency in infinite-frame video generation on VBench and NarrLV.

Test-Time Scaling in Multimodal Foundation Models: A Comprehensive Survey of Generation and Reasoning

cs.CV · 2026-06-06 · unverdicted · novelty 5.0

A survey of test-time scaling for multimodal foundation models that introduces a three-way taxonomy of sampling, feedback, and search approaches along with applications and benchmarks.

citing papers explorer

Showing 1 of 1 citing paper after filters.

TokenGS: Decoupling 3D Gaussian Prediction from Pixels with Learnable Tokens cs.CV · 2026-04-16 · unverdicted · none · ref 26
TokenGS uses learnable Gaussian tokens in an encoder-decoder architecture to regress 3D means directly, achieving SOTA feed-forward reconstruction on static and dynamic scenes with better robustness.

Video-t1: Test-time scaling for video generation.arXiv preprint arXiv:2503.18942, 2025a

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer