Recognition: unknown
ClipGStream: Clip-Stream Gaussian Splatting for Any Length and Any Motion Multi-View Dynamic Scene Reconstruction
Pith reviewed 2026-05-10 13:39 UTC · model grok-4.3
The pith
ClipGStream divides long multi-view dynamic videos into short clips and optimizes Gaussian splatting with inherited anchors to achieve scalable, flicker-free 3D reconstruction.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ClipGStream performs stream optimization at the clip level by modeling dynamic motion through clip-independent spatio-temporal fields and residual anchor compensation for local variations, while inter-clip inherited anchors and decoders maintain structural consistency across the full sequence, enabling scalable, flicker-free reconstruction of long dynamic videos with high temporal coherence and reduced memory overhead.
What carries the argument
Clip-Stream Gaussian Splatting, which combines clip-level spatio-temporal fields and residual anchors with inter-clip inherited anchors to balance local motion capture and global consistency.
If this is right
- Reconstruction of videos of any length becomes feasible without memory growing linearly with duration.
- Temporal coherence improves enough to eliminate flicker in scenes with large or complex motion.
- Memory overhead drops relative to methods that optimize entire clips or full sequences at once.
- State-of-the-art reconstruction quality holds while processing efficiency rises on multi-view dynamic data.
- The approach handles arbitrary motion patterns by localizing optimization within each clip.
Where Pith is reading between the lines
- The clip-boundary inheritance mechanism could be extended to streaming scenarios where new clips arrive continuously.
- Combining this with other Gaussian variants might allow hybrid static-dynamic scene handling without full re-optimization.
- Experiments on sequences longer than those tested could reveal practical upper limits on clip size choices.
Load-bearing premise
That splitting the sequence into clips and passing anchors and decoders between them preserves full structural consistency without losing local motion details or creating new boundary artifacts.
What would settle it
Reconstructed long sequences that display visible flickering, discontinuities at clip boundaries, or memory consumption that does not scale sublinearly with video length.
Figures
read the original abstract
Dynamic 3D scene reconstruction is essential for immersive media such as VR, MR, and XR, yet remains challenging for long multi-view sequences with large-scale motion. Existing dynamic Gaussian approaches are either Frame-Stream, offering scalability but poor temporal stability, or Clip, achieving local consistency at the cost of high memory and limited sequence length. We propose ClipGStream, a hybrid reconstruction framework that performs stream optimization at the clip level rather than the frame level. The sequence is divided into short clips, where dynamic motion is modeled using clip-independent spatio-temporal fields and residual anchor compensation to capture local variations efficiently, while inter-clip inherited anchors and decoders maintain structural consistency across clips. This Clip-Stream design enables scalable, flicker-free reconstruction of long dynamic videos with high temporal coherence and reduced memory overhead. Extensive experiments demonstrate that ClipGStream achieves state-of-the-art reconstruction quality and efficiency. The project page is available at: https://liangjie1999.github.io/ClipGStreamWeb/
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes ClipGStream, a hybrid Clip-Stream Gaussian Splatting framework for reconstructing long multi-view dynamic 3D scenes. Sequences are partitioned into short clips; within each clip, motion is modeled via clip-independent spatio-temporal fields and residual anchor compensation, while inter-clip inherited anchors and decoders are used to propagate structural consistency. The design is claimed to deliver scalable, flicker-free reconstruction of arbitrary-length videos with large motions, high temporal coherence, and lower memory footprint than prior clip-based methods, with experiments asserted to demonstrate state-of-the-art quality and efficiency.
Significance. If the central claims are substantiated, the work would meaningfully advance dynamic scene reconstruction by offering a practical middle ground between memory-efficient but temporally unstable frame-stream methods and locally consistent but length-limited clip methods. Successful validation of drift-free inheritance across clips could enable reliable long-sequence modeling for VR/MR/XR applications.
major comments (3)
- [Abstract and §3] Abstract and §3 (Method): The inter-clip inheritance of anchors and decoders is presented only at a high level as the mechanism for global consistency and flicker-free output. No explicit alignment loss, shared canonical space, drift-correction term, or error-propagation bound is defined, leaving open whether residual compensation remains purely local and therefore permits cumulative misalignment on long sequences with large motions. This directly underpins the central 'flicker-free' and 'high temporal coherence' guarantees.
- [§4] §4 (Experiments): The abstract asserts state-of-the-art reconstruction quality and efficiency, yet the provided text supplies no quantitative metrics, baseline comparisons, ablation studies on clip length or inheritance, or error analysis across clip boundaries. Without these, the empirical support for the Clip-Stream design cannot be evaluated.
- [§3.1–3.2] §3.1–3.2: The claim that clip-independent spatio-temporal fields plus residual anchor compensation capture local variations 'efficiently' while inheritance preserves global structure requires a concrete formulation (e.g., the precise definition of the residual term and how inherited decoders are initialized or updated) to confirm it does not trade one form of artifact for another at clip transitions.
minor comments (2)
- [§3] Notation for 'clip-independent spatio-temporal fields' and 'residual anchor compensation' should be introduced with explicit equations or pseudocode on first use to improve readability.
- [Abstract] The project page URL is given but no supplementary video or long-sequence qualitative results are referenced in the text; adding these would help illustrate the claimed temporal coherence.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments. We agree that the current presentation of the inter-clip inheritance mechanism and the experimental validation require expansion to fully support the central claims. We will revise the manuscript to provide more explicit formulations, quantitative results, and analyses. Below we respond to each major comment.
read point-by-point responses
-
Referee: [Abstract and §3] Abstract and §3 (Method): The inter-clip inheritance of anchors and decoders is presented only at a high level as the mechanism for global consistency and flicker-free output. No explicit alignment loss, shared canonical space, drift-correction term, or error-propagation bound is defined, leaving open whether residual compensation remains purely local and therefore permits cumulative misalignment on long sequences with large motions. This directly underpins the central 'flicker-free' and 'high temporal coherence' guarantees.
Authors: We acknowledge that the abstract and §3 describe the inheritance at a high level. In the method, anchors and decoder parameters optimized at the end of one clip are directly inherited as initialization for the subsequent clip, while residual anchor compensation optimizes only local deviations within the current clip. This design aims to propagate global structure while allowing efficient local motion modeling. However, the comment correctly identifies the absence of an explicit alignment loss or error-propagation bound. In the revision we will expand §3 with a precise description of the inheritance process, the residual term definition, and a discussion of temporal coherence at clip boundaries (including qualitative evidence from our results). We will also consider adding a lightweight consistency regularizer if it improves stability without harming efficiency. revision: yes
-
Referee: [§4] §4 (Experiments): The abstract asserts state-of-the-art reconstruction quality and efficiency, yet the provided text supplies no quantitative metrics, baseline comparisons, ablation studies on clip length or inheritance, or error analysis across clip boundaries. Without these, the empirical support for the Clip-Stream design cannot be evaluated.
Authors: The full experiments section contains quantitative comparisons (PSNR, SSIM, LPIPS) against prior dynamic Gaussian methods on multiple multi-view datasets, together with runtime and memory measurements. We agree, however, that ablations specifically on clip length, the effect of inheritance, and error metrics at clip boundaries are not presented with sufficient detail or tables. We will add these elements in the revised §4, including new ablation tables, boundary-specific error plots, and direct comparisons that isolate the contribution of the Clip-Stream components. revision: yes
-
Referee: [§3.1–3.2] §3.1–3.2: The claim that clip-independent spatio-temporal fields plus residual anchor compensation capture local variations 'efficiently' while inheritance preserves global structure requires a concrete formulation (e.g., the precise definition of the residual term and how inherited decoders are initialized or updated) to confirm it does not trade one form of artifact for another at clip transitions.
Authors: The spatio-temporal fields are modeled independently per clip, and the residual compensation is realized by optimizing anchor offsets relative to the inherited anchors from the prior clip. Inherited decoders are initialized with the converged parameters of the previous clip and then fine-tuned at a lower learning rate. We will make these definitions explicit with equations in the revised §3.1–3.2, clarify the initialization and update rules, and add discussion showing that the design avoids introducing new transition artifacts (supported by our visual results across clip boundaries). revision: yes
Circularity Check
No circularity: architectural description with no self-referential equations or fitted predictions.
full rationale
The abstract and provided text describe a hybrid Clip-Stream framework that divides sequences into clips, uses clip-independent spatio-temporal fields plus residual anchor compensation for local motion, and employs inter-clip inherited anchors/decoders for consistency. No equations, parameter-fitting steps, or derivation chains are exhibited that would reduce any claimed performance metric (e.g., flicker-free reconstruction or temporal coherence) to quantities defined by the inputs themselves. The central claims are presented as consequences of the design choices rather than tautological re-statements. No self-citations, uniqueness theorems, or ansatzes are invoked in the given material. This is a standard non-circular finding for a methods paper whose contributions are algorithmic rather than derived from closed-form identities.
Axiom & Free-Parameter Ledger
free parameters (1)
- clip_length
axioms (1)
- domain assumption Gaussian splatting can be extended to dynamic scenes via spatio-temporal fields and anchor mechanisms
invented entities (3)
-
clip-independent spatio-temporal fields
no independent evidence
-
residual anchor compensation
no independent evidence
-
inter-clip inherited anchors and decoders
no independent evidence
Reference graph
Works this paper leans on
-
[1]
What is xr? towards a framework for augmented and virtual reality.Computers in human be- havior, 133:107289, 2022
Philipp A Rauschnabel, Reto Felix, Chris Hinsch, Hamza Shahab, and Florian Alt. What is xr? towards a framework for augmented and virtual reality.Computers in human be- havior, 133:107289, 2022. 2
2022
-
[2]
John Wiley & Sons, 2003
Grigore C Burdea and Philippe Coiffet.Virtual reality tech- nology. John Wiley & Sons, 2003
2003
-
[3]
What is mixed reality? InProceedings of the 2019 CHI conference on human factors in computing systems, pages 1–15, 2019
Maximilian Speicher, Brian D Hall, and Michael Nebeling. What is mixed reality? InProceedings of the 2019 CHI conference on human factors in computing systems, pages 1–15, 2019. 2
2019
-
[4]
Mip-nerf: A multiscale representation for anti-aliasing neu- ral radiance fields
Jonathan T Barron, Ben Mildenhall, Matthew Tancik, Peter Hedman, Ricardo Martin-Brualla, and Pratul P Srinivasan. Mip-nerf: A multiscale representation for anti-aliasing neu- ral radiance fields. InProceedings of the IEEE/CVF inter- national conference on computer vision, pages 5855–5864,
-
[5]
Tri-miprf: Tri-mip represen- tation for efficient anti-aliasing neural radiance fields
Wenbo Hu, Yuling Wang, Lin Ma, Bangbang Yang, Lin Gao, Xiao Liu, and Yuewen Ma. Tri-miprf: Tri-mip represen- tation for efficient anti-aliasing neural radiance fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 19774–19783, 2023
2023
-
[6]
Cl-mvsnet: Unsupervised multi-view stereo with dual-level contrastive learning, 2025
Kaiqiang Xiong, Rui Peng, Zhe Zhang, Tianxing Feng, Jianbo Jiao, Feng Gao, and Ronggang Wang. Cl-mvsnet: Unsupervised multi-view stereo with dual-level contrastive learning, 2025
2025
-
[7]
Zip-nerf: Anti-aliased grid-based neural radiance fields
Jonathan T Barron, Ben Mildenhall, Dor Verbin, Pratul P Srinivasan, and Peter Hedman. Zip-nerf: Anti-aliased grid-based neural radiance fields. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 19697–19705, 2023
2023
-
[8]
Tensorf: Tensorial radiance fields
Anpei Chen, Zexiang Xu, Andreas Geiger, Jingyi Yu, and Hao Su. Tensorf: Tensorial radiance fields. InEuropean con- ference on computer vision, pages 333–350. Springer, 2022
2022
-
[9]
Structure consistent gaussian splatting with matching prior for few-shot novel view syn- thesis, 2024
Rui Peng, Wangze Xu, Luyang Tang, Liwei Liao, Jianbo Jiao, and Ronggang Wang. Structure consistent gaussian splatting with matching prior for few-shot novel view syn- thesis, 2024
2024
-
[10]
Yang Deng, Zhanke Wang, Jiahao Wu, Jie Liang, Jingui Ma, Yang Hu, and Ronggang Wang. Pano-gs: Perception- aware gaussian optimization with gradient consistency and multi-criteria densification for high-quality rendering.Pro- ceedings of the AAAI Conference on Artificial Intelligence, 40(5):3560–3568, Mar. 2026
2026
-
[11]
Mip-nerf 360: Unbounded anti-aliased neural radiance fields
Jonathan T Barron, Ben Mildenhall, Dor Verbin, Pratul P Srinivasan, and Peter Hedman. Mip-nerf 360: Unbounded anti-aliased neural radiance fields. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5470–5479, 2022. 2
2022
-
[12]
From capture to display: A survey on volumetric video
Yili Jin, Kaiyuan Hu, Junhua Liu, Fangxin Wang, and Xue Liu. From capture to display: A survey on volumetric video. arXiv preprint arXiv:2309.05658, 2023. 2
-
[13]
Mixed neural voxels for fast multi-view video synthesis.arXiv preprint arXiv:2212.00190, 2022
Feng Wang, Sinan Tan, Xinghang Li, Zeyue Tian, and Huap- ing Liu. Mixed neural voxels for fast multi-view video syn- thesis.arXiv preprint arXiv:2212.00190, 2022. 6
-
[14]
Neural 3d video synthesis from multi-view video
Tianye Li, Mira Slavcheva, Michael Zollhoefer, Simon Green, Christoph Lassner, Changil Kim, Tanner Schmidt, Steven Lovegrove, Michael Goesele, Richard Newcombe, et al. Neural 3d video synthesis from multi-view video. In Proceedings of the IEEE/CVF Conference on Computer Vi- sion and Pattern Recognition, pages 5521–5531, 2022. 6, 7
2022
-
[15]
4d visualization of dynamic events from unconstrained multi-view videos
Aayush Bansal, Minh V o, Yaser Sheikh, Deva Ramanan, and Srinivasa Narasimhan. 4d visualization of dynamic events from unconstrained multi-view videos. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5366–5375, 2020
2020
-
[16]
Neural vol- umes: Learning dynamic renderable volumes from images
Stephen Lombardi, Tomas Simon, Jason Saragih, Gabriel Schwartz, Andreas Lehrmann, and Yaser Sheikh. Neural vol- umes: Learning dynamic renderable volumes from images. arXiv preprint arXiv:1906.07751, 2019
-
[17]
High-quality video view interpolation using a layered representation.ACM transactions on graphics (TOG), 23(3):600–608, 2004
C Lawrence Zitnick, Sing Bing Kang, Matthew Uyttendaele, Simon Winder, and Richard Szeliski. High-quality video view interpolation using a layered representation.ACM transactions on graphics (TOG), 23(3):600–608, 2004. 2
2004
-
[18]
Instant gaussian stream: Fast and generalizable streaming of dy- namic scene reconstruction via gaussian splatting
Jinbo Yan, Rui Peng, Zhiyan Wang, Luyang Tang, Jiayu Yang, Jie Liang, Jiahao Wu, and Ronggang Wang. Instant gaussian stream: Fast and generalizable streaming of dy- namic scene reconstruction via gaussian splatting. InPro- ceedings of the Computer Vision and Pattern Recognition Conference, pages 16520–16531, 2025. 2
2025
-
[19]
Plenoxels: Radiance fields without neural networks
Sara Fridovich-Keil, Alex Yu, Matthew Tancik, Qinhong Chen, Benjamin Recht, and Angjoo Kanazawa. Plenoxels: Radiance fields without neural networks. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5501–5510, 2022. 2
2022
-
[20]
Stopthepop: Sorted gaussian splatting for view-consistent real-time rendering.ACM Transactions on Graphics (TOG), 43(4):1–17, 2024
Lukas Radl, Michael Steiner, Mathias Parger, Alexan- der Weinrauch, Bernhard Kerbl, and Markus Steinberger. Stopthepop: Sorted gaussian splatting for view-consistent real-time rendering.ACM Transactions on Graphics (TOG), 43(4):1–17, 2024
2024
-
[21]
Zehao Yu, Torsten Sattler, and Andreas Geiger. Gaussian opacity fields: Efficient and compact surface reconstruc- tion in unbounded scenes.arXiv preprint arXiv:2404.10772,
-
[22]
Kerui Ren, Lihan Jiang, Tao Lu, Mulin Yu, Linning Xu, Zhangkai Ni, and Bo Dai. Octree-gs: Towards consistent real-time rendering with lod-structured 3d gaussians.arXiv preprint arXiv:2403.17898, 2024. 2
-
[23]
Binbin Huang, Zehao Yu, Anpei Chen, Andreas Geiger, and Shenghua Gao. 2d gaussian splatting for geometrically accu- rate radiance fields.arXiv preprint arXiv:2403.17888, 2024. 2, 6
-
[24]
arXiv preprint arXiv:2406.01467 (2024)
Baowen Zhang, Chuan Fang, Rakesh Shrestha, Yixun Liang, Xiaoxiao Long, and Ping Tan. Rade-gs: Rasterizing depth in gaussian splatting.arXiv preprint arXiv:2406.01467, 2024
-
[25]
Intrinsic geometry-appearance consistency optimization for sparse-view gaussian splatting, 2026
Kaiqiang Xiong, Rui Peng, Jiahao Wu, Zhanke Wang, Jie Liang, Xiaoyun Zheng, Feng Gao, and Ronggang Wang. Intrinsic geometry-appearance consistency optimization for sparse-view gaussian splatting, 2026. 2
2026
-
[26]
Dynamic 3d gaussians: Tracking by per- sistent dynamic view synthesis
Jonathon Luiten, Georgios Kopanas, Bastian Leibe, and Deva Ramanan. Dynamic 3d gaussians: Tracking by per- sistent dynamic view synthesis. In3DV, 2024. 2, 7
2024
-
[27]
3dgstream: On-the-fly training of 3d gaussians for efficient streaming of photo-realistic free- viewpoint videos
Jiakai Sun, Han Jiao, Guangyuan Li, Zhanjie Zhang, Lei Zhao, and Wei Xing. 3dgstream: On-the-fly training of 3d gaussians for efficient streaming of photo-realistic free- viewpoint videos. InProceedings of the IEEE/CVF Con- ference on Computer Vision and Pattern Recognition, pages 20675–20685, 2024. 2, 6
2024
-
[28]
Zeyu Yang, Hongye Yang, Zijie Pan, Xiatian Zhu, and Li Zhang. Real-time photorealistic dynamic scene repre- sentation and rendering with 4d gaussian splatting.arXiv preprint arXiv:2310.10642, 2023. 2, 3, 6
-
[29]
4d gaussian splatting for real-time dynamic scene rendering
Guanjun Wu, Taoran Yi, Jiemin Fang, Lingxi Xie, Xiaopeng Zhang, Wei Wei, Wenyu Liu, Qi Tian, and Xinggang Wang. 4d gaussian splatting for real-time dynamic scene rendering. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20310–20320, 2024. 2, 3, 6, 7
2024
-
[30]
Spacetime gaus- sian feature splatting for real-time dynamic view synthesis
Zhan Li, Zhang Chen, Zhong Li, and Yi Xu. Spacetime gaus- sian feature splatting for real-time dynamic view synthesis. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8508–8520, 2024. 2, 3, 6
2024
-
[31]
Jiahao Wu, Rui Peng, Jianbo Jiao, Jiayu Yang, Luyang Tang, Kaiqiang Xiong, Jie Liang, Jinbo Yan, Runling Liu, and Rong Wang. Localdygs: Multi-view global dynamic scene modeling via adaptive local implicit feature decou- pling.ArXiv, abs/2507.02363, 2025. 2, 4, 6, 7
-
[32]
Nerf-ds: Neural ra- diance fields for dynamic specular objects
Zhiwen Yan, Chen Li, and Gim Hee Lee. Nerf-ds: Neural ra- diance fields for dynamic specular objects. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8285–8295, 2023. 2
2023
-
[33]
Pear: Pixel-aligned expressive human mesh recovery, 2026
Jiahao Wu, Yunfei Liu, Lijian Lin, Ye Zhu, Lei Zhu, Jingyi Li, and Yu Li. Pear: Pixel-aligned expressive human mesh recovery, 2026
2026
-
[34]
Deformable 3d gaussians for high- fidelity monocular dynamic scene reconstruction
Ziyi Yang, Xinyu Gao, Wen Zhou, Shaohui Jiao, Yuqing Zhang, and Xiaogang Jin. Deformable 3d gaussians for high- fidelity monocular dynamic scene reconstruction. InPro- ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20331–20341, 2024. 3
2024
-
[35]
Nerfies: Deformable neural radiance fields
Keunhong Park, Utkarsh Sinha, Jonathan T Barron, Sofien Bouaziz, Dan B Goldman, Steven M Seitz, and Ricardo Martin-Brualla. Nerfies: Deformable neural radiance fields. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 5865–5874, 2021. 3
2021
-
[36]
HyperReel: High-fidelity 6-DoF video with ray- conditioned sampling.arXiv preprint arXiv:2301.02238,
Benjamin Attal, Jia-Bin Huang, Christian Richardt, Michael Zollhoefer, Johannes Kopf, Matthew O’Toole, and Changil Kim. HyperReel: High-fidelity 6-DoF video with ray- conditioned sampling.arXiv preprint arXiv:2301.02238,
-
[37]
Nerf- player: A streamable dynamic scene representation with de- composed neural radiance fields.IEEE Transactions on Visu- alization and Computer Graphics, 29(5):2732–2742, 2023
Liangchen Song, Anpei Chen, Zhong Li, Zhang Chen, Lele Chen, Junsong Yuan, Yi Xu, and Andreas Geiger. Nerf- player: A streamable dynamic scene representation with de- composed neural radiance fields.IEEE Transactions on Visu- alization and Computer Graphics, 29(5):2732–2742, 2023. 6
2023
-
[38]
Mixed neural voxels for fast multi- view video synthesis
Feng Wang, Sinan Tan, Xinghang Li, Zeyue Tian, Yafei Song, and Huaping Liu. Mixed neural voxels for fast multi- view video synthesis. InProceedings of the IEEE/CVF In- ternational Conference on Computer Vision, pages 19706– 19716, 2023. 2, 6
2023
-
[39]
Swift4d:adaptive divide-and-conquer gaussian splatting for compact and efficient reconstruction of dynamic scene,
Jiahao Wu, Rui Peng, Zhiyan Wang, Lu Xiao, Luyang Tang, Jinbo Yan, Kaiqiang Xiong, and Ronggang Wang. Swift4d:adaptive divide-and-conquer gaussian splatting for compact and efficient reconstruction of dynamic scene,
-
[40]
Streaming radiance fields for 3d video synthe- sis.Advances in Neural Information Processing Systems, 35:13485–13498, 2022
Lingzhi Li, Zhen Shen, Zhongshu Wang, Li Shen, and Ping Tan. Streaming radiance fields for 3d video synthe- sis.Advances in Neural Information Processing Systems, 35:13485–13498, 2022. 2, 6
2022
-
[41]
4k4d: Real-time 4d view synthesis at 4k resolution
Zhen Xu, Sida Peng, Haotong Lin, Guangzhao He, Jiaming Sun, Yujun Shen, Hujun Bao, and Xiaowei Zhou. 4k4d: Real-time 4d view synthesis at 4k resolution. InProceed- ings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20029–20040, 2024
2024
-
[42]
Grid4D: 4D decomposed hash encoding for high-fidelity dynamic gaus- sian splatting.The Thirty-eighth Annual Conference on Neu- ral Information Processing Systems, 2024
Xu Jiawei, Fan Zexin, Yang Jian, and Xie Jin. Grid4D: 4D decomposed hash encoding for high-fidelity dynamic gaus- sian splatting.The Thirty-eighth Annual Conference on Neu- ral Information Processing Systems, 2024. 2, 7
2024
-
[43]
Plenoxels: Radiance fields without neural networks
Sara Fridovich-Keil, Alex Yu, Matthew Tancik, Qinhong Chen, Benjamin Recht, and Angjoo Kanazawa. Plenoxels: Radiance fields without neural networks. InCVPR, 2022. 2
2022
-
[44]
Compressing streamable free- viewpoint videos to 0.1 mb per frame
Luyang Tang, Jiayu Yang, Rui Peng, Yongqi Zhai, Shihe Shen, and Ronggang Wang. Compressing streamable free- viewpoint videos to 0.1 mb per frame. InProceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 7257–7265, 2025. 2, 6
2025
-
[45]
Hicom: Hierarchical coherent motion for dynamic streamable scenes with 3d gaussian splat- ting.Advances in Neural Information Processing Systems, 37:80609–80633, 2024
Qiankun Gao, Jiarui Meng, Chengxiang Wen, Jie Chen, and Jian Zhang. Hicom: Hierarchical coherent motion for dynamic streamable scenes with 3d gaussian splat- ting.Advances in Neural Information Processing Systems, 37:80609–80633, 2024. 2, 6
2024
-
[46]
K-planes: Explicit radiance fields in space, time, and appearance
Sara Fridovich-Keil, Giacomo Meanti, Frederik Rahbæk Warburg, Benjamin Recht, and Angjoo Kanazawa. K-planes: Explicit radiance fields in space, time, and appearance. In Proceedings of the IEEE/CVF Conference on Computer Vi- sion and Pattern Recognition, pages 12479–12488, 2023. 2, 6
2023
-
[47]
Hexplane: A fast representa- tion for dynamic scenes
Ang Cao and Justin Johnson. Hexplane: A fast representa- tion for dynamic scenes. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 130–141, 2023. 2, 6
2023
-
[48]
D-nerf: Neural radiance fields for dynamic scenes
Albert Pumarola, Enric Corona, Gerard Pons-Moll, and Francesc Moreno-Noguer. D-nerf: Neural radiance fields for dynamic scenes. InProceedings of the IEEE/CVF Con- ference on Computer Vision and Pattern Recognition, pages 10318–10327, 2021. 3
2021
-
[49]
Sc-gs: Sparse-controlled gaussian splatting for editable dynamic scenes
Yi-Hua Huang, Yang-Tian Sun, Ziyi Yang, Xiaoyang Lyu, Yan-Pei Cao, and Xiaojuan Qi. Sc-gs: Sparse-controlled gaussian splatting for editable dynamic scenes. InProceed- ings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4220–4230, 2024. 3
2024
-
[50]
Dynmf: Neural motion factorization for real-time dynamic view synthesis with 3d gaussian splatting,
Agelos Kratimenos, Jiahui Lei, and Kostas Daniilidis. Dynmf: Neural motion factorization for real-time dynamic view synthesis with 3d gaussian splatting.arXiv preprint arXiv:2312.00112, 2023. 3
-
[51]
Gaussian-flow: 4d reconstruction with dynamic 3d gaus- sian particle
Youtian Lin, Zuozhuo Dai, Siyu Zhu, and Yao Yao. Gaussian-flow: 4d reconstruction with dynamic 3d gaus- sian particle. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 21136– 21145, 2024. 3
2024
-
[52]
arXiv preprint arXiv:2403.12365 (2024)
Quankai Gao, Qiangeng Xu, Zhe Cao, Ben Mildenhall, Wen- chao Ma, Le Chen, Danhang Tang, and Ulrich Neumann. Gaussianflow: Splatting gaussian dynamics for 4d content creation.arXiv preprint arXiv:2403.12365, 2024. 3
-
[53]
Neural scene flow fields for space-time view synthesis of dy- namic scenes
Zhengqi Li, Simon Niklaus, Noah Snavely, and Oliver Wang. Neural scene flow fields for space-time view synthesis of dy- namic scenes. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021. 3
2021
-
[54]
3d gaussian splatting for real-time radiance field rendering.ACM Trans
Bernhard Kerbl, Georgios Kopanas, Thomas Leimk ¨uhler, and George Drettakis. 3d gaussian splatting for real-time radiance field rendering.ACM Trans. Graph., 42(4):139–1,
-
[55]
Ewa splatting.IEEE Transactions on Visual- ization and Computer Graphics, 8(3):223–238, 2002
Matthias Zwicker, Hanspeter Pfister, Jeroen Van Baar, and Markus Gross. Ewa splatting.IEEE Transactions on Visual- ization and Computer Graphics, 8(3):223–238, 2002. 4
2002
-
[56]
Scaffold-gs: Structured 3d gaussians for view-adaptive rendering
Tao Lu, Mulin Yu, Linning Xu, Yuanbo Xiangli, Limin Wang, Dahua Lin, and Bo Dai. Scaffold-gs: Structured 3d gaussians for view-adaptive rendering. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20654–20664, 2024. 4, 6
2024
-
[57]
Structure- from-motion revisited
Johannes L Schonberger and Jan-Michael Frahm. Structure- from-motion revisited. InProceedings of the IEEE con- ference on computer vision and pattern recognition, pages 4104–4113, 2016. 4
2016
-
[58]
High fidelity aggregated planar prior assisted patchmatch multi-view stereo
Jie Liang, Rongjie Wang, Rui Peng, Zhe Zhang, Kaiqiang Xiong, and Ronggang Wang. High fidelity aggregated planar prior assisted patchmatch multi-view stereo. InProceedings of the 32nd ACM International Conference on Multimedia, MM ’24, page 3141–3150, New York, NY , USA, 2024. As- sociation for Computing Machinery. 4
2024
-
[59]
Open3D: A Modern Library for 3D Data Processing
Qian-Yi Zhou, Jaesik Park, and Vladlen Koltun. Open3D: A modern library for 3D data processing.arXiv:1801.09847,
work page internal anchor Pith review arXiv
-
[60]
Mix- ture of volumetric primitives for efficient neural rendering
Stephen Lombardi, Tomas Simon, Gabriel Schwartz, Michael Zollhoefer, Yaser Sheikh, and Jason Saragih. Mix- ture of volumetric primitives for efficient neural rendering. ACM Transactions on Graphics (ToG), 40(4):1–13, 2021. 6
2021
-
[61]
Adam: A Method for Stochastic Optimization
Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization.arXiv preprint arXiv:1412.6980,
work page internal anchor Pith review Pith/arXiv arXiv
-
[62]
https://www.avs.org.cn/
A VS. https://www.avs.org.cn/. 2024. 6
2024
-
[63]
Neural scene flow fields for space-time view synthesis of dy- namic scenes
Zhengqi Li, Simon Niklaus, Noah Snavely, and Oliver Wang. Neural scene flow fields for space-time view synthesis of dy- namic scenes. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6498– 6508, 2021. 6
2021
-
[64]
4dgc: Rate-aware 4d gaussian compression for ef- ficient streamable free-viewpoint video
Qiang Hu, Zihan Zheng, Houqiang Zhong, Sihua Fu, Li Song, Xiaoyun Zhang, Guangtao Zhai, and Yanfeng Wang. 4dgc: Rate-aware 4d gaussian compression for ef- ficient streamable free-viewpoint video. InProceedings of the Computer Vision and Pattern Recognition Conference (CVPR), pages 875–885, June 2025. 6
2025
-
[65]
4k4d: Real-time 4d view synthesis at 4k resolution
Zhen Xu, Sida Peng, Haotong Lin, Guangzhao He, Jiaming Sun, Yujun Shen, Hujun Bao, and Xiaowei Zhou. 4k4d: Real-time 4d view synthesis at 4k resolution. InCVPR, 2024. 7
2024
-
[66]
Efficient neural radiance fields for interactive free-viewpoint video
Haotong Lin, Sida Peng, Zhen Xu, Yunzhi Yan, Qing Shuai, Hujun Bao, and Xiaowei Zhou. Efficient neural radiance fields for interactive free-viewpoint video. InSIGGRAPH Asia Conference Proceedings, 2022. 7
2022
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.