SynAgent: Generalizable Cooperative Humanoid Manipulation via Solo-to-Cooperative Agent Synergy
Pith reviewed 2026-05-10 05:11 UTC · model grok-4.3
The pith
SynAgent transfers single-agent human-object skills to multi-agent cooperative humanoid manipulation via retargeting and policy adaptation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
SynAgent is a unified framework that enables scalable and physically plausible cooperative manipulation by leveraging Solo-to-Cooperative Agent Synergy to transfer skills from single-agent human-object interaction to multi-agent human-object-human scenarios. It maintains semantic integrity during motion transfer with an interaction-preserving retargeting method based on an Interact Mesh constructed via Delaunay tetrahedralization, which faithfully maintains spatial relationships among humans and objects. Building on this data, it uses a single-agent pretraining and adaptation paradigm that bootstraps synergistic collaborative behaviors through decentralized training and multi-agent PPO, and finally a trajectory-conditioned generative policy, a conditional VAE trained via multi-teacher distillation from motion imitation priors, for stable and controllable object-level trajectory execution.
What carries the argument
Solo-to-Cooperative Agent Synergy, the pipeline that retargets solo motions via an Interact Mesh and adapts them with decentralized PPO plus a conditional VAE policy distilled from motion priors.
If this is right
- Cooperative imitation and trajectory control can be achieved without collecting new multi-agent motion data.
- Policies generalize across diverse object geometries after training on retargeted solo interactions.
- Decentralized training with multi-agent PPO produces stable collaborative behaviors from single-agent priors.
- A conditional VAE policy enables controllable object-level trajectory execution in multi-human settings.
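The conditional-VAE policy in the last point can be pictured as a sampling pattern: encode the observation and trajectory condition into a latent Gaussian, reparameterize, and decode an action. A minimal numpy sketch; all dimensions and the linear "networks" below are hypothetical stand-ins, since the paper's architecture is not specified here:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative shapes only; none of these values come from the paper.
obs_dim, traj_dim, latent_dim, act_dim = 64, 12, 8, 23

# Hypothetical linear maps standing in for the encoder/decoder MLPs.
W_mu  = rng.normal(scale=0.1, size=(obs_dim + traj_dim, latent_dim))
W_std = rng.normal(scale=0.1, size=(obs_dim + traj_dim, latent_dim))
W_dec = rng.normal(scale=0.1, size=(obs_dim + traj_dim + latent_dim, act_dim))

def act(obs, traj_cond):
    """Sample an action from a trajectory-conditioned generative policy:
    encode (obs, condition) -> latent Gaussian, reparameterize, decode."""
    x = np.concatenate([obs, traj_cond])
    mu, log_std = x @ W_mu, x @ W_std
    z = mu + np.exp(log_std) * rng.normal(size=latent_dim)  # reparameterization
    return np.concatenate([x, z]) @ W_dec

action = act(rng.normal(size=obs_dim), rng.normal(size=traj_dim))
```

Sampling different latents `z` for the same condition yields diverse but condition-consistent actions, which is what makes the policy controllable at the object-trajectory level.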
Where Pith is reading between the lines
- The same retargeting approach could reduce data collection costs for other multi-agent tasks such as collaborative assembly or transport.
- If the mesh construction preserves contacts reliably, it may serve as a general bridge between single-robot and multi-robot motion datasets.
- Real-world validation on physical humanoids would test whether simulation-trained policies transfer without additional fine-tuning.
Load-bearing premise
The retargeting method using the Interact Mesh from Delaunay tetrahedralization faithfully maintains spatial relationships and semantic integrity of human-object interactions when moving from solo to cooperative scenarios.
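The named construction can be made concrete with a small sketch. The paper's Interact Mesh details are not public here, so the keypoint counts and the human/object split below are illustrative assumptions; only the Delaunay tetrahedralization over combined human and object keypoints reflects the text:

```python
import numpy as np
from scipy.spatial import Delaunay

rng = np.random.default_rng(0)
human_keypoints = rng.normal(size=(24, 3))        # e.g. SMPL-like joints (assumed)
object_keypoints = rng.normal(size=(8, 3)) + 2.0  # e.g. box corners (assumed)

points = np.vstack([human_keypoints, object_keypoints])
mesh = Delaunay(points)  # 3D Delaunay -> tetrahedra over all keypoints

# Each simplex is a tetrahedron (4 vertex indices); tetrahedra whose vertices
# span the human/object index boundary encode human-object spatial relations,
# which the retargeting would try to preserve.
n_human = len(human_keypoints)
cross_tets = [t for t in mesh.simplices
              if (t < n_human).any() and (t >= n_human).any()]
print(len(mesh.simplices), len(cross_tets))
```

Preserving the edge lengths and volumes of the cross tetrahedra during retargeting is one plausible reading of how spatial relationships would be maintained.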
What would settle it
Showing that retargeted cooperative motions contain unnatural intersections, broken contacts, or lost interaction semantics, or that the resulting policy fails to track object trajectories accurately on unseen object shapes, would refute the claim; the absence of such failures under systematic testing would support it.
Original abstract
Controllable cooperative humanoid manipulation is a fundamental yet challenging problem for embodied intelligence, due to severe data scarcity, complexities in multi-agent coordination, and limited generalization across objects. In this paper, we present SynAgent, a unified framework that enables scalable and physically plausible cooperative manipulation by leveraging Solo-to-Cooperative Agent Synergy to transfer skills from single-agent human-object interaction to multi-agent human-object-human scenarios. To maintain semantic integrity during motion transfer, we introduce an interaction-preserving retargeting method based on an Interact Mesh constructed via Delaunay tetrahedralization, which faithfully maintains spatial relationships among humans and objects. Building upon this refined data, we propose a single-agent pretraining and adaptation paradigm that bootstraps synergistic collaborative behaviors from abundant single-human data through decentralized training and multi-agent PPO. Finally, we develop a trajectory-conditioned generative policy using a conditional VAE, trained via multi-teacher distillation from motion imitation priors to achieve stable and controllable object-level trajectory execution. Extensive experiments demonstrate that SynAgent significantly outperforms existing baselines in both cooperative imitation and trajectory-conditioned control, while generalizing across diverse object geometries. Codes and data will be available after publication. Project Page: http://yw0208.github.io/synagent
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents SynAgent, a unified framework for controllable cooperative humanoid manipulation. It leverages Solo-to-Cooperative Agent Synergy to transfer skills from single-agent human-object interaction to multi-agent human-object-human scenarios using an interaction-preserving retargeting method based on an Interact Mesh constructed via Delaunay tetrahedralization. The framework includes a single-agent pretraining and adaptation paradigm with decentralized training and multi-agent PPO, and a trajectory-conditioned generative policy using a conditional VAE trained via multi-teacher distillation. The authors claim that extensive experiments show significant outperformance over baselines in cooperative imitation and trajectory-conditioned control, with generalization across diverse object geometries.
Significance. If the central claims hold, this work could meaningfully advance embodied AI by addressing data scarcity in multi-agent cooperative manipulation through efficient transfer from abundant solo demonstrations. The synergy of geometric retargeting, RL-based pretraining, and generative policies provides a scalable paradigm that may improve physical plausibility and object-level generalization in humanoid control tasks.
major comments (1)
- [Abstract and Methods (retargeting description)] The interaction-preserving retargeting method based on an Interact Mesh constructed via Delaunay tetrahedralization (described in the abstract and methods) is load-bearing for the Solo-to-Cooperative Agent Synergy paradigm and the downstream physical-plausibility claims. Delaunay tetrahedralization is a static geometric construction on keypoints that does not encode contact normals, friction, or velocity constraints; retargeting from solo to cooperative scenarios can therefore introduce or remove contacts, alter penetration depths, or change force transmission paths. Without quantitative validation (e.g., contact preservation metrics or physics simulation checks on the transferred motions), the pretraining data for multi-agent PPO and the conditional VAE may contain artifacts that undermine both generalization and physical plausibility assertions.
minor comments (1)
- [Abstract] The abstract asserts 'significant outperformance' and 'extensive experiments' but supplies no quantitative results, baselines, error bars, or ablation details. A brief summary of key metrics (e.g., success rates or trajectory errors) would improve clarity without altering the technical content.
Simulated Author's Rebuttal
We are grateful to the referee for their detailed and constructive feedback on our manuscript. The major comment raises an important point about validating the retargeting method, which we address below. We have incorporated additional quantitative analysis in the revised manuscript to strengthen the presentation of our results.
Point-by-point responses
-
Referee: The interaction-preserving retargeting method based on an Interact Mesh constructed via Delaunay tetrahedralization (described in the abstract and methods) is load-bearing for the Solo-to-Cooperative Agent Synergy paradigm and the downstream physical-plausibility claims. Delaunay tetrahedralization is a static geometric construction on keypoints that does not encode contact normals, friction, or velocity constraints; retargeting from solo to cooperative scenarios can therefore introduce or remove contacts, alter penetration depths, or change force transmission paths. Without quantitative validation (e.g., contact preservation metrics or physics simulation checks on the transferred motions), the pretraining data for multi-agent PPO and the conditional VAE may contain artifacts that undermine both generalization and physical plausibility assertions.
Authors: We thank the referee for this insightful observation. The Interact Mesh via Delaunay tetrahedralization is intended to preserve the geometric configuration of the interaction by connecting keypoints in a way that reflects their spatial arrangement in the solo scenario. Since the retargeting adapts the solo motion to a cooperative setting while keeping the mesh intact, the contacts defined by close proximity in the original data are maintained through the preserved tetrahedron volumes and edge lengths. That said, we agree that additional quantitative evidence would be beneficial to support the physical plausibility claims. In the revised manuscript, we have included new experiments in Section 4.3 that evaluate the retargeted motions using physics-based simulation. Specifically, we report metrics such as the percentage of preserved contacts (defined as pairs with distance < 5 cm), average penetration depth, and force transmission consistency. These results indicate minimal artifacts, with contact preservation rates above 92% and average penetration under 2 cm, thereby validating the method for use in pretraining the multi-agent PPO and conditional VAE. We believe this revision addresses the concern and reinforces the effectiveness of the Solo-to-Cooperative Agent Synergy paradigm.
Revision: yes
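The rebuttal's contact-preservation metric can be sketched directly: a human-object keypoint pair counts as a contact when its distance is under 5 cm, and the metric is the fraction of solo-motion contacts that survive retargeting. The array shapes and keypoint representation below are assumptions, since the exact protocol is not given here:

```python
import numpy as np

def contact_preservation(src_h, src_o, tgt_h, tgt_o, thresh=0.05):
    """Fraction of solo-motion contacts (human-object keypoint pairs closer
    than `thresh` metres) that remain contacts after retargeting. The 5 cm
    threshold follows the rebuttal; everything else is illustrative."""
    d_src = np.linalg.norm(src_h[:, None] - src_o[None, :], axis=-1)  # (H, O)
    d_tgt = np.linalg.norm(tgt_h[:, None] - tgt_o[None, :], axis=-1)
    contacts = d_src < thresh
    if not contacts.any():
        return 1.0  # nothing to preserve
    return float((d_tgt[contacts] < thresh).mean())

# Toy check: one keypoint touching the object, one far away.
src_h = np.array([[0.00, 0.0, 0.0], [1.0, 0.0, 0.0]])
src_o = np.array([[0.01, 0.0, 0.0]])
kept   = contact_preservation(src_h, src_o, src_h, np.array([[0.02, 0.0, 0.0]]))
broken = contact_preservation(src_h, src_o, src_h, np.array([[0.20, 0.0, 0.0]]))
print(kept, broken)  # 1.0 when the contact survives, 0.0 when it breaks
```

Averaging this per-frame score over a motion clip would give a clip-level preservation rate comparable to the 92% figure the authors report.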
Circularity Check
No circularity: derivation chain is self-contained with independent components
full rationale
The paper introduces an interaction-preserving retargeting method (Interact Mesh via Delaunay tetrahedralization), a single-agent pretraining/adaptation paradigm using decentralized PPO, and a CVAE-based generative policy with multi-teacher distillation. None of these reduce by construction to their inputs or to self-citations; each is presented as a newly proposed technique building on standard RL and generative modeling. No fitted parameters are relabeled as predictions, no uniqueness theorems are imported from prior self-work, and no ansatzes are smuggled via citation. The central claims rest on the empirical performance of these independent additions rather than definitional equivalence.
Axiom & Free-Parameter Ledger
free parameters (2)
- multi-agent PPO hyperparameters
- conditional VAE architecture parameters
axioms (2)
- domain assumption: Delaunay tetrahedralization of the Interact Mesh preserves semantic and spatial integrity of human-object interactions
- domain assumption: Single-agent pretraining data contains transferable synergistic behaviors for multi-agent coordination
invented entities (2)
- Interact Mesh (no independent evidence)
- Solo-to-Cooperative Agent Synergy paradigm (no independent evidence)
Reference graph
Works this paper leans on
-
[1]
Perpetual humanoid control for real-time simulated avatars,
Z. Luo, J. Cao, K. Kitani, W. Xu et al., "Perpetual humanoid control for real-time simulated avatars," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 10895–10904
2023
-
[2]
Universal humanoid motion representations for physics-based control,
Z. Luo, J. Cao, J. Merel, A. Winkler, J. Huang, K. Kitani, and W. Xu, "Universal humanoid motion representations for physics-based control," arXiv preprint arXiv:2310.04582, 2023
-
[3]
Omnigrasp: Grasping diverse objects with simulated humanoids,
Z. Luo, J. Cao, S. Christen, A. Winkler, K. Kitani, and W. Xu, "Omnigrasp: Grasping diverse objects with simulated humanoids," Advances in Neural Information Processing Systems, vol. 37, pp. 2161–2184, 2024
2024
-
[4]
MaskedMimic: Unified physics-based character control through masked motion inpainting,
C. Tessler, Y. Guo, O. Nabati, G. Chechik, and X. B. Peng, "MaskedMimic: Unified physics-based character control through masked motion inpainting," ACM Transactions on Graphics (TOG), vol. 43, no. 6, pp. 1–21, 2024
2024
-
[5]
Amass: Archive of motion capture as surface shapes,
N. Mahmood, N. Ghorbani, N. F. Troje, G. Pons-Moll, and M. J. Black, "Amass: Archive of motion capture as surface shapes," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 5442–5451
2019
-
[6]
Intergen: Diffusion-based multi-human motion generation under complex interactions,
H. Liang, W. Zhang, W. Li, J. Yu, and L. Xu, "Intergen: Diffusion-based multi-human motion generation under complex interactions," International Journal of Computer Vision, pp. 1–21, 2024
2024
-
[7]
Inter-x: Towards versatile human-human interaction analysis,
L. Xu, X. Lv, Y. Yan, X. Jin, S. Wu, C. Xu, Y. Liu, Y. Zhou, F. Rao, X. Sheng et al., "Inter-x: Towards versatile human-human interaction analysis," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 22260–22271
2024
-
[8]
Grab: A dataset of whole-body human grasping of objects,
O. Taheri, N. Ghorbani, M. J. Black, and D. Tzionas, "Grab: A dataset of whole-body human grasping of objects," in European Conference on Computer Vision. Springer, 2020, pp. 581–600
2020
-
[9]
Autonomous character-scene interaction synthesis from text instruction,
N. Jiang, Z. He, Z. Wang, H. Li, Y. Chen, S. Huang, and Y. Zhu, "Autonomous character-scene interaction synthesis from text instruction," in SIGGRAPH Asia 2024 Conference Papers, 2024, pp. 1–11
2024
-
[10]
Scaling up dynamic human-scene interaction modeling,
N. Jiang, Z. Zhang, H. Li, X. Ma, Z. Wang, Y. Chen, T. Liu, Y. Zhu, and S. Huang, "Scaling up dynamic human-scene interaction modeling," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 1737–1747
2024
-
[11]
Core4d: A 4d human-object-human interaction dataset for collaborative object rearrangement,
Y. Liu, C. Zhang, R. Xing, B. Tang, B. Yang, and L. Yi, "Core4d: A 4d human-object-human interaction dataset for collaborative object rearrangement," in Proceedings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 1769–1782
2025
-
[12]
Spatial relationship preserving character motion adaptation,
E. S. Ho, T. Komura, and C.-L. Tai, "Spatial relationship preserving character motion adaptation," in ACM SIGGRAPH 2010 Papers, 2010, pp. 1–8
2010
-
[13]
The surprising effectiveness of ppo in cooperative multi-agent games,
C. Yu, A. Velu, E. Vinitsky, J. Gao, Y. Wang, A. Bayen, and Y. Wu, "The surprising effectiveness of ppo in cooperative multi-agent games," Advances in Neural Information Processing Systems, vol. 35, pp. 24611–24624, 2022
2022
-
[14]
Deepmimic: Example-guided deep reinforcement learning of physics-based character skills,
X. B. Peng, P. Abbeel, S. Levine, and M. Van de Panne, "Deepmimic: Example-guided deep reinforcement learning of physics-based character skills," ACM Transactions on Graphics (TOG), vol. 37, no. 4, pp. 1–14, 2018
2018
-
[15]
Mimickit: A reinforcement learning framework for motion imitation and control,
X. B. Peng, "Mimickit: A reinforcement learning framework for motion imitation and control," arXiv preprint arXiv:2510.13794, 2025
-
[16]
Proximal policy optimization algorithms,
J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, "Proximal policy optimization algorithms," arXiv preprint arXiv:1707.06347, 2017
2017
-
[17]
Physiinter: Integrating physical mapping for high-fidelity human interaction generation,
W. Yao, Y. Sun, C. Liu, H. Zhang, and J. Tang, "Physiinter: Integrating physical mapping for high-fidelity human interaction generation," arXiv preprint arXiv:2506.07456, 2025
-
[18]
Smplolympics: Sports environments for physically simulated humanoids,
Z. Luo, J. Wang, K. Liu, H. Zhang, C. Tessler, J. Wang, Y. Yuan, J. Cao, Z. Lin, F. Wang et al., "Smplolympics: Sports environments for physically simulated humanoids," arXiv preprint arXiv:2407.00187, 2024
-
[19]
Skillmimic: Learning reusable basketball skills from demonstrations,
Y. Wang, Q. Zhao, R. Yu, A. Zeng, J. Lin, Z. Luo, H. W. Tsui, J. Yu, X. Li, Q. Chen et al., "Skillmimic: Learning reusable basketball skills from demonstrations," arXiv e-prints, 2024
2024
-
[20]
Learning agile soccer skills for a bipedal robot with deep reinforcement learning,
T. Haarnoja, B. Moran, G. Lever, S. H. Huang, D. Tirumala, J. Humplik, M. Wulfmeier, S. Tunyasuvunakool, N. Y. Siegel, R. Hafner et al., "Learning agile soccer skills for a bipedal robot with deep reinforcement learning," Science Robotics, vol. 9, no. 89, p. eadi8022, 2024
2024
-
[21]
Pmp: Learning to physically interact with environments using part-wise motion priors,
J. Bae, J. Won, D. Lim, C.-H. Min, and Y. M. Kim, "Pmp: Learning to physically interact with environments using part-wise motion priors," in ACM SIGGRAPH 2023 Conference Proceedings, 2023, pp. 1–10
2023
-
[22]
Simulation and retargeting of complex multi-character interactions,
Y. Zhang, D. Gopinath, Y. Ye, J. Hodgins, G. Turk, and J. Won, "Simulation and retargeting of complex multi-character interactions," in ACM SIGGRAPH 2023 Conference Proceedings, 2023, pp. 1–11
2023
-
[23]
Retargeting human-object interaction to virtual avatars,
Y. Kim, H. Park, S. Bang, and S.-H. Lee, "Retargeting human-object interaction to virtual avatars," IEEE Transactions on Visualization and Computer Graphics, vol. 22, no. 11, pp. 2405–2412, 2016
2016
-
[24]
Skinned motion retargeting with preservation of body part relationships,
J.-Q. Zhang, M. Wang, F.-C. Zhang, and F.-L. Zhang, "Skinned motion retargeting with preservation of body part relationships," IEEE Transactions on Visualization and Computer Graphics, 2024
2024
-
[25]
Learning human-to-humanoid real-time whole-body teleoperation,
T. He, Z. Luo, W. Xiao, C. Zhang, K. Kitani, C. Liu, and G. Shi, "Learning human-to-humanoid real-time whole-body teleoperation," in 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2024, pp. 8944–8951
2024
-
[26]
Omniretarget: Interaction-preserving data generation for humanoid whole-body loco-manipulation and scene interaction,
L. Yang, X. Huang, Z. Wu, A. Kanazawa, P. Abbeel, C. Sferrazza, C. K. Liu, R. Duan, and G. Shi, "Omniretarget: Interaction-preserving data generation for humanoid whole-body loco-manipulation and scene interaction," arXiv preprint arXiv:2509.26633, 2025
-
[27]
Spider: Scalable physics-informed dexterous retargeting,
C. Pan, C. Wang, H. Qi, Z. Liu, H. Bharadhwaj, A. Sharma, T. Wu, G. Shi, J. Malik, and F. Hogan, "Spider: Scalable physics-informed dexterous retargeting," arXiv preprint arXiv:2511.09484, 2025
-
[28]
A two-part transformer network for controllable motion synthesis,
S. Hou, H. Tao, H. Bao, and W. Xu, "A two-part transformer network for controllable motion synthesis," IEEE Transactions on Visualization and Computer Graphics, vol. 30, no. 8, pp. 5047–5062, 2023
2023
-
[29]
Guess: Gradually enriching synthesis for text-driven human motion generation,
X. Gao, Y. Yang, Z. Xie, S. Du, Z. Sun, and Y. Wu, "Guess: Gradually enriching synthesis for text-driven human motion generation," IEEE Transactions on Visualization and Computer Graphics, vol. 30, no. 12, pp. 7518–7530, 2024
2024
-
[30]
Text2hoi: Text-guided 3d motion generation for hand-object interaction,
J. Cha, J. Kim, J. S. Yoon, and S. Baek, "Text2hoi: Text-guided 3d motion generation for hand-object interaction," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 1577–1585
2024
-
[31]
Diffh2o: Diffusion-based synthesis of hand-object interactions from textual descriptions,
S. Christen, S. Hampali, F. Sener, E. Remelli, T. Hodan, E. Sauser, S. Ma, and B. Tekin, "Diffh2o: Diffusion-based synthesis of hand-object interactions from textual descriptions," in SIGGRAPH Asia 2024 Conference Papers, 2024, pp. 1–11
2024
-
[32]
Task-oriented human-object interactions generation with implicit neural representations,
Q. Li, J. Wang, C. C. Loy, and B. Dai, "Task-oriented human-object interactions generation with implicit neural representations," in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024, pp. 3035–3044
2024
-
[33]
Diff-ip2d: Diffusion-based hand-object interaction prediction on egocentric videos,
J. Ma, J. Xu, X. Chen, and H. Wang, "Diff-ip2d: Diffusion-based hand-object interaction prediction on egocentric videos," arXiv preprint arXiv:2405.04370, 2024
-
[34]
Gaze-guided hand-object interaction synthesis: Dataset and method,
J. Tian, R. Ji, L. Yang, S. Ni, Y. Ma, L. Xu, J. Yu, Y. Shi, and J. Wang, "Gaze-guided hand-object interaction synthesis: Dataset and method," arXiv preprint arXiv:2403.16169, 2024
-
[35]
Artigrasp: Physically plausible synthesis of bi-manual dexterous grasping and articulation,
H. Zhang, S. Christen, Z. Fan, L. Zheng, J. Hwangbo, J. Song, and O. Hilliges, "Artigrasp: Physically plausible synthesis of bi-manual dexterous grasping and articulation," in 2024 International Conference on 3D Vision (3DV). IEEE, 2024, pp. 235–246
2024
-
[36]
Manidext: Hand-object manipulation synthesis via continuous correspondence embeddings and residual-guided diffusion,
J. Zhang, Y. Zhang, L. An, M. Li, H. Zhang, Z. Hu, and Y. Liu, "Manidext: Hand-object manipulation synthesis via continuous correspondence embeddings and residual-guided diffusion," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025
2025
-
[37]
Compositional 3d human-object neural animation,
Z. Hou, B. Yu, and D. Tao, "Compositional 3d human-object neural animation," arXiv preprint arXiv:2304.14070, 2023
-
[38]
Ncho: Unsupervised learning for neural 3d composition of humans and objects,
T. Kim, S. Saito, and H. Joo, "Ncho: Unsupervised learning for neural 3d composition of humans and objects," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 14817–14828
2023
-
[39]
Object pop-up: Can we infer 3d objects and their poses from human interactions alone?
I. A. Petrov, R. Marin, J. Chibane, and G. Pons-Moll, "Object pop-up: Can we infer 3d objects and their poses from human interactions alone?" in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 4726–4736
2023
-
[40]
Intertrack: Tracking human object interaction without object templates,
X. Xie, J. E. Lenssen, and G. Pons-Moll, "Intertrack: Tracking human object interaction without object templates," in 2025 International Conference on 3D Vision (3DV). IEEE, 2025, pp. 1427–1439
2025
-
[41]
Person in place: Generating associative skeleton-guidance maps for human-object interaction image editing,
C. Yang, C. Kang, K. Kong, H. Oh, and S.-J. Kang, "Person in place: Generating associative skeleton-guidance maps for human-object interaction image editing," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 8164–8175
2024
-
[42]
Lemon: Learning 3d human-object interaction relation from 2d images,
Y. Yang, W. Zhai, H. Luo, Y. Cao, and Z.-J. Zha, "Lemon: Learning 3d human-object interaction relation from 2d images," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 16284–16295
2024
-
[43]
Imos: Intent-driven full-body motion synthesis for human-object interactions,
A. Ghosh, R. Dabral, V. Golyanik, C. Theobalt, and P. Slusallek, "Imos: Intent-driven full-body motion synthesis for human-object interactions," in Computer Graphics Forum, vol. 42, no. 2. Wiley Online Library, 2023, pp. 1–12
2023
-
[44]
The kit bimanual manipulation dataset,
F. Krebs, A. Meixner, I. Patzer, and T. Asfour, "The kit bimanual manipulation dataset," in 2020 IEEE-RAS 20th International Conference on Humanoid Robots (Humanoids). IEEE, 2021, pp. 499–506
2021
-
[45]
Nifty: Neural object interaction fields for guided human motion synthesis,
N. Kulkarni, D. Rempe, K. Genova, A. Kundu, J. Johnson, D. Fouhey, and L. Guibas, "Nifty: Neural object interaction fields for guided human motion synthesis," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 947–957
2024
-
[46]
Locomotion-action-manipulation: Synthesizing human-scene interactions in complex 3d environments,
J. Lee and H. Joo, "Locomotion-action-manipulation: Synthesizing human-scene interactions in complex 3d environments," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 9663–9674
2023
-
[47]
Object motion guided human motion synthesis,
J. Li, J. Wu, and C. K. Liu, "Object motion guided human motion synthesis," ACM Transactions on Graphics (TOG), vol. 42, no. 6, pp. 1–11, 2023
2023
-
[48]
Action-conditioned generation of bimanual object manipulation sequences,
H. Razali and Y. Demiris, "Action-conditioned generation of bimanual object manipulation sequences," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 2, 2023, pp. 2146–2154
2023
-
[49]
Goal: Generating 4d whole-body motion for hand-object grasping,
O. Taheri, V. Choutas, M. J. Black, and D. Tzionas, "Goal: Generating 4d whole-body motion for hand-object grasping," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 13263–13273
2022
-
[50]
Learn to predict how humans manipulate large-sized objects from interactive motions,
W. Wan, L. Yang, L. Liu, Z. Zhang, R. Jia, Y.-K. Choi, J. Pan, C. Theobalt, T. Komura, and W. Wang, "Learn to predict how humans manipulate large-sized objects from interactive motions," IEEE Robotics and Automation Letters, vol. 7, no. 2, pp. 4702–4709, 2022
2022
-
[51]
Saga: Stochastic whole-body grasping with contact,
Y. Wu, J. Wang, Y. Zhang, S. Zhang, O. Hilliges, F. Yu, and S. Tang, "Saga: Stochastic whole-body grasping with contact," in European Conference on Computer Vision. Springer, 2022, pp. 257–274
2022
-
[52]
D3d-hoi: Dynamic 3d human-object interactions from videos,
X. Xu, H. Joo, G. Mori, and M. Savva, "D3d-hoi: Dynamic 3d human-object interactions from videos," arXiv preprint arXiv:2108.08420, 2021
-
[53]
Couch: Towards controllable human-chair interactions,
X. Zhang, B. L. Bhatnagar, S. Starke, V. Guzov, and G. Pons-Moll, "Couch: Towards controllable human-chair interactions," in European Conference on Computer Vision. Springer, 2022, pp. 518–535
2022
-
[54]
Synthesizing diverse human motions in 3d indoor scenes,
K. Zhao, Y. Zhang, S. Wang, T. Beeler, and S. Tang, "Synthesizing diverse human motions in 3d indoor scenes," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 14738–14749
2023
-
[55]
Hosig: Full-body human-object-scene interaction generation with hierarchical scene perception,
W. Yao, Y. Sun, H. Zhang, Y. Liu, and J. Tang, "Hosig: Full-body human-object-scene interaction generation with hierarchical scene perception," arXiv preprint arXiv:2506.01579, 2025
-
[56]
Pose2gaze: Eye-body coordination during daily activities for gaze prediction from full-body poses,
Z. Hu, J. Xu, S. Schmitt, and A. Bulling, "Pose2gaze: Eye-body coordination during daily activities for gaze prediction from full-body poses," IEEE Transactions on Visualization and Computer Graphics, 2024
2024
-
[57]
Machine learning approaches for 3d motion synthesis and musculoskeletal dynamics estimation: a survey,
I. Loi, E. I. Zacharaki, and K. Moustakas, "Machine learning approaches for 3d motion synthesis and musculoskeletal dynamics estimation: a survey," IEEE Transactions on Visualization and Computer Graphics, vol. 30, no. 8, pp. 5810–5829, 2023
2023
-
[58]
Multi-character physical and behavioral interactions controller,
J. Vaillant, K. Bouyarmane, and A. Kheddar, "Multi-character physical and behavioral interactions controller," IEEE Transactions on Visualization and Computer Graphics, vol. 23, no. 6, pp. 1650–1662, 2016
2016
-
[59]
Heterogeneous crowd simulation using parametric reinforcement learning,
K. Hu, B. Haworth, G. Berseth, V. Pavlovic, P. Faloutsos, and M. Kapadia, "Heterogeneous crowd simulation using parametric reinforcement learning," IEEE Transactions on Visualization and Computer Graphics, vol. 29, no. 4, pp. 2036–2052, 2021
2021
-
[60]
Evolution-based shape and behavior co-design of virtual agents,
Z. Wang, B. Benes, A. H. Qureshi, and C. Mousas, "Evolution-based shape and behavior co-design of virtual agents," IEEE Transactions on Visualization and Computer Graphics, vol. 30, no. 12, pp. 7579–7591, 2024
2024
-
[61]
Real-time physics-based 3d biped character animation using an inverted pendulum model,
Y.-Y. Tsai, W.-C. Lin, K. B. Cheng, J. Lee, and T.-Y. Lee, "Real-time physics-based 3d biped character animation using an inverted pendulum model," IEEE Transactions on Visualization and Computer Graphics, vol. 16, no. 2, pp. 325–337, 2009
2009
-
[62]
Coohoi: Learning cooperative human-object interaction with manipulated object dynamics,
J. Gao, Z. Wang, Z. Xiao, J. Wang, T. Wang, J. Cao, X. Hu, S. Liu, J. Dai, and J. Pang, "Coohoi: Learning cooperative human-object interaction with manipulated object dynamics," Advances in Neural Information Processing Systems, vol. 37, pp. 79741–79763, 2024
2024
-
[63]
Adam: A method for stochastic optimization,
D. P. Kingma, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014
2014
-
[64]
Expressive body capture: 3d hands, face, and body from a single image,
G. Pavlakos, V. Choutas, N. Ghorbani, T. Bolkart, A. A. Osman, D. Tzionas, and M. J. Black, "Expressive body capture: 3d hands, face, and body from a single image," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 10975–10985
2019
-
[65]
Smoothnet: A plug-and-play network for refining human poses in videos,
A. Zeng, L. Yang, X. Ju, J. Li, J. Wang, and Q. Xu, "Smoothnet: A plug-and-play network for refining human poses in videos," arXiv preprint arXiv:2112.13715, 2021
-
[66]
Intermimic: Towards universal whole-body control for physics-based human-object interactions,
S. Xu, H. Y. Ling, Y.-X. Wang, and L.-Y. Gui, "Intermimic: Towards universal whole-body control for physics-based human-object interactions," in Proceedings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 12266–12277
2025