arxiv: 2605.14525 · v1 · submitted 2026-05-14 · 💻 cs.CV

Recognition: unknown

From Sparse to Dense: Spatio-Temporal Fusion for Multi-View 3D Human Pose Estimation with DenseWarper

Ling Li , Changjie Chen , Yuyan Wang , Jiaqing Lyu , Kenglun Chang , Yiyun Chen , Zhidong Deng

Authors on Pith no claims yet

Pith reviewed 2026-05-15 02:16 UTC · model grok-4.3

classification 💻 cs.CV

keywords poseinputmethodsparseestimationhumanimagesmulti-view

0 comments

The pith

Sparse interleaved multi-view inputs with DenseWarper outperform traditional dense simultaneous multi-view methods for 3D human pose estimation on Human3.6M and MPI-INF-3DHP datasets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Traditional multi-view 3D pose estimation uses photos taken at the exact same moment from several cameras to locate body joints in 3D space. This paper instead uses photos from different cameras taken at slightly different moments, such as one view now and another a fraction of a second later. The idea is that these staggered times add useful motion information while still providing spatial cues. A new model called DenseWarper moves heatmaps of possible joint locations between views using basic camera geometry rules. Experiments on two standard datasets show the sparse staggered inputs produce more accurate poses than using all views at once and reach top reported results. The approach also claims it can output poses at higher rates than the input camera speed and needs less data overall.

Core claim

Results demonstrate that our method, utilizing only sparse interleaved images as input, outperforms traditional dense multi-view input approaches and achieves state-of-the-art performance.

Load-bearing premise

That temporal offsets in the interleaved views can be reliably bridged by epipolar-geometry-based heatmap exchange without introducing motion-induced errors or losing spatial precision.

Figures

Figures reproduced from arXiv: 2605.14525 by Changjie Chen, Jiaqing Lyu, Kenglun Chang, Ling Li, Yiyun Chen, Yuyan Wang, Zhidong Deng.

**Figure 1.** Figure 1: Common approaches for 3D multi-view pose estimation. (a) Our proposed sparse interleaved input, where each view selects a single temporally interleaved image as input to leverage spatio-temporal information across views fully; (b) illustration of dense, full-frame multi-view input; (c) keypoint interpolation input, which enhances the output frame rate; and (d) illustration of single-view image input. We pr… view at source ↗

**Figure 2.** Figure 2: Overview of the DenseWarper architecture. A sliding window is used to sample sparse interleaved images, with a 2D pose estimation model generating initial heatmaps for each view. Missing information is filled to create uncorrected heatmaps. These are then spatially fused and corrected using an epipolar geometry-based method, yielding a spatially fused heatmap. Deformable convolutions are then applied for t… view at source ↗

**Figure 3.** Figure 3: Epipolar geometry-based spatial heatmap fusion architecture. (a) Geometric interpretation of the point-line relationship for keypoints across different views; (b) the pipeline for spatial heatmap fusion based on epipolar geometry. For an inaccurate heatmap point q, we use accurate points q ′ from other views to correct it. First, we compute the corresponding epipolar lines in the other two heatmaps. Then, … view at source ↗

**Figure 4.** Figure 4: The structure of the temporal fusion module (Warper). We perform temporal correction based on the initial corrected heatmaps obtained from multi-view spatial fusion. For each heatmap in a target time frame (i.e., non-diagonal heatmaps in the figure), we compute its difference with the corresponding accurate heatmap in the same view (the diagonal heatmap) and apply a temporal pose feature learning module to… view at source ↗

**Figure 5.** Figure 5: Results of spatiotemporal heatmap fusion and correction using different frame intervals on the Human3.6M dataset. The camera sampling interval in the Human3.6M dataset is 50ms. Panels (a), (b), and (c) represent the results of spatial heatmap fusion with frame intervals of 1 frame, 6 frames, and 12 frames, respectively. Ours Ground Truth Input [PITH_FULL_IMAGE:figures/full_fig_p021_5.png] view at source ↗

**Figure 6.** Figure 6: Visualization results showing the effects during continuous motion. Ours Ground Truth Input [PITH_FULL_IMAGE:figures/full_fig_p021_6.png] view at source ↗

**Figure 7.** Figure 7: Visualization results demonstrating the effects during complex motion. 21 [PITH_FULL_IMAGE:figures/full_fig_p021_7.png] view at source ↗

**Figure 8.** Figure 8: Illustration of frame rate enhancement through interleaved multi-view input. The camera frame rate f = 1/δt, where δt represents the camera sampling interval. With a fixed camera frame rate, the input frame rate can be effectively increased using interleaved multi-view inputs, reaching up to M × f, where M denotes the number of camera viewpoints. E SUPPLEMENTARY EXPERIMENTAL RESULTS In the experiments, we … view at source ↗

read the original abstract

In multi-view 3D human pose estimation, models typically rely on images captured simultaneously from different camera views to predict a pose at a specific moment. While providing accurate spatial information, this traditional approach often overlooks the rich temporal dependencies between adjacent frames. We propose a novel 3D human pose estimation input method: the sparse interleaved input to address this. This method leverages images captured from different camera views at various time points (e.g., View 1 at time $t$ and View 2 at time $t+\delta$), allowing our model to capture rich spatio-temporal information and effectively boost performance. More importantly, this approach offers two key advantages: First, it can theoretically increase the output pose frame rate by N times with N cameras, thereby breaking through single-view frame rate limitations and enhancing the temporal resolution of the production. Second, using a sparse subset of available frames, our method can reduce data redundancy and simultaneously achieve better performance. We introduce the DenseWarper model, which leverages epipolar geometry for efficient spatio-temporal heatmap exchange. We conducted extensive experiments on the Human3.6M and MPI-INF-3DHP datasets. Results demonstrate that our method, utilizing only sparse interleaved images as input, outperforms traditional dense multi-view input approaches and achieves state-of-the-art performance. The source code for this work is available at: https://github.com/lingli1724/DenseWarper-ICLR2026

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Sparse interleaved multi-view inputs with epipolar heatmap warping can raise effective frame rate and beat dense simultaneous baselines on standard benchmarks, but the motion compensation claim needs tighter checks.

read the letter

The main thing to know is that this paper swaps the usual simultaneous multi-view captures for sparse interleaved frames taken from different cameras at small time offsets, then fuses the heatmaps with a DenseWarper module that follows epipolar lines. They report that this beats the dense simultaneous baseline on Human3.6M and MPI-INF-3DHP while also allowing higher output frame rates from the same hardware. The code is released, which helps anyone who wants to test the implementation directly. That combination of input change and geometry-guided fusion is the clearest departure from prior multi-view pipelines. The experiments use standard benchmarks and show gains with less data redundancy, which is a practical upside if the numbers hold up under closer inspection. The epipolar warping itself is a straightforward application of existing geometry rather than a new invention, so the novelty sits mostly in how the sparse temporal sampling is paired with it. The soft spot is the lack of detail on how well the warping handles actual motion. The central claim depends on the network being able to correct for any misalignment caused by body movement during the time delta, yet the abstract gives no bound on acceptable offset size or ablations that separate slow versus fast actions. If the test subjects mostly move slowly, the reported outperformance could be optimistic for more dynamic cases where projected points land off the true 3D location. A reader working on multi-view pose for animation or robotics would find the input format worth trying, especially since the code is public. The work is coherent on its own terms and uses established geometry without obvious circularity, so it deserves peer review to pressure-test the motion assumption and the experimental controls.

Referee Report

2 major / 1 minor

Summary. The paper proposes DenseWarper, a spatio-temporal fusion model for multi-view 3D human pose estimation that accepts sparse interleaved inputs (e.g., View 1 at time t and View 2 at t+δ) instead of simultaneous dense multi-view frames. It uses epipolar geometry to exchange heatmaps across views and time, claiming this captures richer temporal information, outperforms traditional dense simultaneous inputs, achieves SOTA on Human3.6M and MPI-INF-3DHP, reduces data redundancy, and can theoretically increase output frame rate by a factor of N with N cameras.

Significance. If the central claim holds under rigorous validation, the work would enable higher temporal resolution in 3D pose estimation without requiring perfectly synchronized dense captures, which could benefit applications with bandwidth or synchronization constraints while also reducing input redundancy. The release of source code at the cited GitHub link is a positive factor for reproducibility.

major comments (2)

[Abstract, §4] Abstract and experimental section: the claim that sparse interleaved inputs outperform dense simultaneous multi-view baselines is load-bearing for the paper's contribution, yet no ablations isolate the effect of temporal offset δ, no quantitative bound on acceptable δ is given, and no error analysis versus motion speed is reported; this leaves the outperformance dependent on the untested assumption that epipolar exchange fully compensates for joint displacements without injecting spatial error.
[§3] Method description: the DenseWarper heatmap exchange step projects epipolar lines between time-offset views, but the manuscript provides no explicit comparison or metric quantifying motion-induced misalignment against a simultaneous dense baseline, which is required to substantiate that the fused features preserve or improve 3D accuracy.

minor comments (1)

[Abstract] The abstract states results on two benchmarks but does not specify the exact train/test splits, camera configurations, or evaluation protocol used; adding these details would improve clarity without altering the core claims.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the major comments point by point below, clarifying the support for our claims while committing to targeted revisions that strengthen the experimental validation without altering the core contributions.

read point-by-point responses

Referee: [Abstract, §4] Abstract and experimental section: the claim that sparse interleaved inputs outperform dense simultaneous multi-view baselines is load-bearing for the paper's contribution, yet no ablations isolate the effect of temporal offset δ, no quantitative bound on acceptable δ is given, and no error analysis versus motion speed is reported; this leaves the outperformance dependent on the untested assumption that epipolar exchange fully compensates for joint displacements without injecting spatial error.

Authors: We agree that isolating the effect of δ and providing motion-speed analysis would strengthen the evidence. The current results on Human3.6M and MPI-INF-3DHP demonstrate consistent outperformance of sparse interleaved inputs over dense simultaneous baselines, with the epipolar-line projection in DenseWarper designed to maintain 3D geometric consistency across small temporal offsets. To directly address the concern, we will add an ablation varying δ, report performance as a function of joint velocity derived from ground truth, and include a practical bound on acceptable δ in the revised manuscript. revision: yes
Referee: [§3] Method description: the DenseWarper heatmap exchange step projects epipolar lines between time-offset views, but the manuscript provides no explicit comparison or metric quantifying motion-induced misalignment against a simultaneous dense baseline, which is required to substantiate that the fused features preserve or improve 3D accuracy.

Authors: Section 3 describes the epipolar projection for heatmap exchange to enable spatio-temporal fusion. While an explicit misalignment metric is not currently reported, the superior 3D accuracy in the experimental tables indicates effective compensation. We will add a quantitative comparison in the revised method section, computing average joint displacement between time-offset frames using ground-truth poses and contrasting it against the simultaneous dense case to explicitly quantify any residual misalignment. revision: yes

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Report based on abstract only; no explicit free parameters, axioms, or invented entities can be extracted. The model presumably inherits standard assumptions from deep pose networks and projective geometry.

pith-pipeline@v0.9.0 · 5588 in / 1007 out tokens · 49730 ms · 2026-05-15T02:16:19.046838+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

211 extracted references · 211 canonical work pages

[1]

Scaling Learning Algorithms Towards

Bengio, Yoshua and LeCun, Yann , booktitle =. Scaling Learning Algorithms Towards

work page
[2]

and Osindero, Simon and Teh, Yee Whye , journal =

Hinton, Geoffrey E. and Osindero, Simon and Teh, Yee Whye , journal =. A Fast Learning Algorithm for Deep Belief Nets , volume =

work page
[3]

2016 , publisher=

Deep learning , author=. 2016 , publisher=

work page 2016
[4]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition , pages=

Rhodin, Helge and Sp. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition , pages=

work page
[5]

Baumgartner, Tobias and Klatt, Stefanie , booktitle=

work page
[6]

Jiang, Boyuan and Hu, Lei and Xia, Shihong , booktitle=

work page
[7]

Qiu, Haibo and Wang, Chunyu and Wang, Jingdong and Wang, Naiyan and Zeng, Wenjun , booktitle=

work page
[8]

Wu, Size and Jin, Sheng and Liu, Wentao and Bai, Lei and Qian, Chen and Liu, Dong and Ouyang, Wanli , booktitle=

work page
[9]

Chen, Long and Ai, Haizhou and Chen, Rui and Zhuang, Zijie and Liu, Shuang , booktitle=

work page
[10]

He, Yihui and Yan, Rui and Fragkiadaki, Katerina and Yu, Shoou-I , booktitle=

work page
[11]

Hyla, Pawel , journal=

work page
[12]

Ionescu, Catalin and Papava, Dragos and Olaru, Vlad and Sminchisescu, Cristian , journal=

work page
[13]

Li, Sijin and Chan, Antoni B , booktitle=

work page
[14]

Martinez, Julieta and Hossain, Rayat and Romero, Javier and Little, James J , booktitle=

work page
[15]

Zhao, Long and Peng, Xi and Tian, Yu and Kapadia, Mubbasir and Metaxas, Dimitris N , booktitle=

work page
[16]

Pavlakos, Georgios and Zhu, Luyang and Zhou, Xiaowei and Daniilidis, Kostas , booktitle=

work page
[17]

Hartley, Richard and Zisserman, Andrew , journal=

work page
[18]

Remelli, Edoardo and Han, Shangchen and Honari, Sina and Fua, Pascal and Wang, Robert , booktitle=

work page
[19]

Hossain, Mir Rayat Imtiaz and Little, James J , booktitle=

work page
[20]

Pavllo, Dario and Feichtenhofer, Christoph and Grangier, David and Auli, Michael , booktitle=

work page
[21]

Graham, Benjamin and Engelcke, Martin and van der Maaten, Laurens , booktitle=

work page
[22]

Zhu, Xizhou and Su, Weijie and Lu, Lewei and Li, Bin and Wang, Xiaogang and Dai, Jifeng , booktitle=

work page
[23]

Zhang, Kaihao and Luo, Wenhan and Zhong, Yiran and Ma, Lin and Liu, Wei and Li, Hongdong , journal=

work page
[24]

Dai, Jifeng and Qi, Haozhi and Xiong, Yuwen and Li, Yi and Zhang, Guodong and Hu, Han and Wei, Yichen , booktitle=

work page
[25]

Zheng, Ce and Zhu, Shiwei and Mendieta, Matias and Yang, Taojiannan and Chen, Chen and Ding, Zhengming , booktitle=

work page
[26]

Sun, Xiao and Xiao, Bin and Wei, Fang and Liang, Shuang and Wei, Yichen , booktitle=

work page
[27]

Zheng, Ce and Wu, Wentao and Yang, Tianyu and Zhu, Shiwei and Chen, Chen and Liu, Ronggang and Shen, Jianfei and Kehtarnavaz, Nasser and Shah, Mubarak , journal=

work page
[28]

Zou, Zhe and Huang, Yanyu and Lu, Yang and Liu, Feng , booktitle=

work page
[29]

Mehta, Dushyant and Rhodin, Helge and Casas, Dan and Fua, Pascal and Sotnychenko, Oleksandr and Xu, Weipeng and Theobalt, Christian , booktitle=

work page
[30]

Yu, Fisher and Koltun, Vladlen and Funkhouser, Thomas , booktitle=

work page
[31]

Chen, Ziyi and Sugimoto, Akihiro and Lai, Shang-Hong , booktitle=

work page
[32]

Su, Yukun and Lin, Guosheng and Wu, Qingyao , booktitle=

work page
[33]

Yu, Bruce XB and Zhang, Zhi and Liu, Yongxu and Zhong, Sheng-hua and Liu, Yan and Chen, Chang Wen , booktitle=

work page
[34]

Peng, Jihua and Zhou, Yanghong and Mok, PY , booktitle=

work page
[35]

Ma, Haoyu and Wang, Zhe and Chen, Yifei and Kong, Deying and Chen, Liangjian and Liu, Xingwei and Yan, Xiangyi and Tang, Hao and Xie, Xiaohui , booktitle=

work page
[36]

Chen, Yilun and Wang, Zhicheng and Peng, Yuxiang and Zhang, Zhiqiang and Yu, Gang and Sun, Jian , booktitle=

work page
[37]

Xiao, Bin and Wu, Haiping and Wei, Yichen , booktitle=

work page
[38]

Bridgeman, Lewis and Volino, Marco and Guillemaut, Jean-Yves and Hilton, Adrian , booktitle=

work page
[39]

Chu, Hau and Lee, Jia-Hong and Lee, Yao-Chih and Hsu, Ching-Hsien and Li, Jia-Da and Chen, Chu-Song , booktitle=

work page
[40]

Hindle, Benjamin R and Keogh, Justin WL and Lorimer, Anna V , journal=

work page
[41]

Menolotto, Matteo and Komaris, Dimitrios-Sokratis and Tedesco, Salvatore and O’Flynn, Brendan and Walsh, Michael , journal=

work page
[42]

Jiang, Wenjie and Yin, Yongkai and Jiao, Junpeng and Zhao, Xian and Sun, Baoqing , journal=

work page
[43]

Construction and Building Materials , volume=

Mei, Qipei and G. Construction and Building Materials , volume=

work page
[44]

Lee, Chunggi and Kim, Yeonjun and Jin, Seungmin and Kim, Dongmin and Maciejewski, Ross and Ebert, David and Ko, Sungahn , journal=

work page
[45]

Yang, Ming and Wang, Shige and Bakita, Joshua and Vu, Thanh and Smith, F Donelson and Anderson, James H and Frahm, Jan-Michael , booktitle=

work page
[46]

Zhou, Kangkang and Zhang, Lijun and Lu, Feng and Zhou, Xiang-Dong and Shi, Yu , booktitle=

work page
[47]

Sturm, Peter , booktitle=

work page
[48]

Moon, Gyeongsik and Chang, Ju Yong and Lee, Kyoung Mu , booktitle=

work page
[49]

Park, Sungheon and Hwang, Jihye and Kwak, Nojun , booktitle=

work page
[50]

Pavlakos, Georgios and Zhou, Xiaowei and Daniilidis, Kostas , booktitle=

work page
[51]

Tekin, Bugra and Rozantsev, Artem and Lepetit, Vincent and Fua, Pascal , booktitle=

work page
[52]

Wehrbein, Tom and Rudolph, Marco and Rosenhahn, Bodo and Wandt, Bastian , booktitle=

work page
[53]

Chen, Tianlang and Fang, Chen and Shen, Xiaohui and Zhu, Yiheng and Chen, Zhili and Luo, Jiebo , journal=

work page
[54]

Liu, Ruixu and Shen, Ju and Wang, He and Chen, Chen and Cheung, Sen-ching and Asari, Vijayan , booktitle=

work page
[55]

Proceedings of the IEEE/CVF International Conference on Computer Vision , year =

Zheng, Ce and Zhu, Sijie and Mendieta, Matias and Yang, Taojiannan and Chen, Chen and Ding, Zhengming , title =. Proceedings of the IEEE/CVF International Conference on Computer Vision , year =

work page
[56]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , year =

Zhao, Qitao and Zheng, Ce and Liu, Mengyuan and Wang, Pichao and Chen, Chen , title =. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , year =

work page
[57]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , year =

Li, Wenhao and Liu, Hong and Tang, Hao and Wang, Pichao and Van Gool, Luc , title =. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , year =

work page
[58]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , year =

Zhang, Jinlu and Tu, Zhigang and Yang, Jianyu and Chen, Yujin and Yuan, Junsong , title =. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , year =

work page
[59]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , year =

Tang, Zhenhua and Qiu, Zhaofan and Hao, Yanbin and Hong, Richang and Yao, Ting , title =. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , year =

work page
[60]

Proceedings of the IEEE/CVF International Conference on Computer Vision , year =

Shan, Wenkang and Liu, Zhenhua and Zhang, Xinfeng and Wang, Zhao and Han, Kai and Wang, Shanshe and Ma, Siwei and Gao, Wen , title =. Proceedings of the IEEE/CVF International Conference on Computer Vision , year =

work page
[61]

Proceedings of the IEEE/CVF International Conference on Computer Vision , year =

Cai, Yujun and Ge, Liuhao and Liu, Jun and Cai, Jianfei and Cham, Tat-Jen and Yuan, Junsong and Thalmann, Nadia Magnenat , title =. Proceedings of the IEEE/CVF International Conference on Computer Vision , year =

work page
[62]

Hu, Wenbo and Zhang, Changgong and Zhan, Fangneng and Zhang, Lei and Wong, Tien-Tsin , booktitle=

work page
[63]

Liu, Kenkun and Ding, Rongqi and Zou, Zhiming and Wang, Le and Tang, Wei , booktitle=

work page
[64]

Wang, Jingbo and Yan, Sijie and Xiong, Yuanjun and Lin, Dahua , booktitle=

work page
[65]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , year =

Xu, Tianhan and Takano, Wataru , title =. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , year =

work page
[66]

and Zhang, Zhi and Liu, Yongxu and Zhong, Sheng-hua and Liu, Yan and Chen, Chang Wen , title =

Yu, Bruce X.B. and Zhang, Zhi and Liu, Yongxu and Zhong, Sheng-hua and Liu, Yan and Chen, Chang Wen , title =. Proceedings of the IEEE/CVF International Conference on Computer Vision , year =

work page
[67]

, title =

Zhao, Long and Peng, Xi and Tian, Yu and Kapadia, Mubbasir and Metaxas, Dimitris N. , title =. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=

work page
[68]

Proceedings of the IEEE/CVF International Conference on Computer Vision , year =

Zou, Zhiming and Tang, Wei , title =. Proceedings of the IEEE/CVF International Conference on Computer Vision , year =

work page
[69]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , year =

Gong, Jia and Foo, Lin Geng and Fan, Zhipeng and Ke, Qiuhong and Rahmani, Hossein and Liu, Jun , title =. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , year =

work page
[70]

Li, Han and Shi, Bowen and Dai, Wenrui and Zheng, Hongwei and Wang, Botao and Sun, Yu and Guo, Min and Li, Chenglin and Zou, Junni and Xiong, Hongkai , booktitle=

work page
[71]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , year =

Zhao, Weixi and Wang, Weiqiang and Tian, Yunjie , title =. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , year =

work page
[72]

Zhu, Yiran and Xu, Xing and Shen, Fumin and Ji, Yanli and Gao, Lianli and Shen, Heng Tao , booktitle=

work page
[73]

and Picard, David and Tabia, Hedi , title =

Luvizon, Diogo C. and Picard, David and Tabia, Hedi , title =. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition , year =

work page
[74]

Luvizon, Diogo C and Tabia, Hedi and Picard, David , journal=

work page
[75]

Moon, Gyeongsik and Lee, Kyoung Mu , booktitle=

work page
[76]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition , year =

work page
[77]

2017 , pages=

Vaswani, A , booktitle=. 2017 , pages=

work page 2017
[78]

Han, Xiao and Ren, Yiming and Yao, Yichen and Sun, Yujing and Ma, Yuexin , booktitle=

work page
[79]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , year =

Li, Jialian and Zhang, Jingyi and Wang, Zhiyong and Shen, Siqi and Wen, Chenglu and Ma, Yuexin and Xu, Lan and Yu, Jingyi and Wang, Cheng , title =. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , year =

work page
[80]

Ren, Yiming and Zhao, Chengfeng and He, Yannan and Cong, Peishan and Liang, Han and Yu, Jingyi and Xu, Lan and Ma, Yuexin , journal=

work page

Showing first 80 references.