TokenGS uses learnable Gaussian tokens in an encoder-decoder architecture to regress 3D means directly, achieving SOTA feed-forward reconstruction on static and dynamic scenes with better robustness.
Video-t1: Test-time scaling for video generation.arXiv preprint arXiv:2503.18942, 2025a
3 Pith papers cite this work. Polarity classification is still indexing.
3
Pith papers citing it
citation-role summary
background 1
citation-polarity summary
fields
cs.CV 3years
2026 3verdicts
UNVERDICTED 3roles
background 1polarities
background 1representative citing papers
MIGA introduces two-stage alignment to close train-inference gaps and dual consistency enhancement via self-reflection and long-range guidance to achieve SOTA temporal consistency in infinite-frame video generation on VBench and NarrLV.
A survey of test-time scaling for multimodal foundation models that introduces a three-way taxonomy of sampling, feedback, and search approaches along with applications and benchmarks.
citing papers explorer
-
TokenGS: Decoupling 3D Gaussian Prediction from Pixels with Learnable Tokens
TokenGS uses learnable Gaussian tokens in an encoder-decoder architecture to regress 3D means directly, achieving SOTA feed-forward reconstruction on static and dynamic scenes with better robustness.