DAMA: Disentangled Body-Anchored Gaussians for Controllable Multi-Layered Avatars
Pith reviewed 2026-05-21 05:57 UTC · model grok-4.3
The pith
DAMA reconstructs 3D avatars as layered Gaussians bound to body model faces, delivering non-penetrating garments and user-controlled stacking order from ordinary multi-view photos.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
DAMA is the first Gaussian avatar reconstruction method from multi-view images to achieve physically plausible layering, clean garment separation, and explicit stacking control by binding Gaussians to SMPL-X faces via barycentric in-plane coordinates and a positive normal offset, then lifting 2D segmentations, applying topology-guided correction, and jointly optimizing geometry and appearance.
What carries the argument
Body-anchored Gaussians parameterized by barycentric in-plane coordinates on SMPL-X faces plus a positive normal offset, which enforces layer ordering and non-penetration during reconstruction and editing.
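A minimal sketch of this parameterization, assuming a single triangle with counter-clockwise vertex order so that the cross-product normal points away from the body. The function name `anchored_position` and its arguments are hypothetical and are not taken from the paper's code.

```python
import numpy as np

def anchored_position(tri, bary, offset):
    """Place a Gaussian center relative to one body-mesh triangle.

    tri:    (3, 3) array of triangle vertices (assumed counter-clockwise).
    bary:   (3,) non-negative barycentric weights that sum to 1.
    offset: strictly positive distance along the face normal.
    """
    tri = np.asarray(tri, dtype=float)
    bary = np.asarray(bary, dtype=float)
    if offset <= 0:
        raise ValueError("offset must be strictly positive")
    if np.any(bary < 0) or not np.isclose(bary.sum(), 1.0):
        raise ValueError("bary must be non-negative and sum to 1")
    # Unit face normal from the two edge vectors.
    normal = np.cross(tri[1] - tri[0], tri[2] - tri[0])
    normal /= np.linalg.norm(normal)
    # In-plane point on the face, then lift it off the surface.
    foot = bary @ tri
    return foot + offset * normal

tri = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
p = anchored_position(tri, [0.2, 0.3, 0.5], 0.01)
```

Because the offset is constrained to be positive, the returned point always lies on the outward side of the face plane, whatever the barycentric weights are.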
If this is right
- Achieves state-of-the-art geometry reconstruction, garment separation, penetration rate, and penetration depth on the full 4D-DRESS dataset (82 scans).
- Supports user-defined reordering of garment layers on the reconstructed avatar.
- Converts body-conforming garments from the layered Gaussian representation into simulation-ready meshes quickly.
- Maintains high visual fidelity while preserving explicit layer boundaries that prior single-surface or entangled methods lose.
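The reordering consequence in the list above can be illustrated with a toy model. This is a sketch under strong assumptions, not the paper's procedure: each layer is reduced to one normal offset per body face (NaN where the garment does not cover the face), and reordering simply redistributes the existing per-face offsets so they increase along the requested order. The name `reorder_layers` is hypothetical.

```python
import numpy as np

def reorder_layers(offsets, new_order):
    """Redistribute per-face offsets so layers follow new_order (inner to outer).

    offsets:   (L, F) array, row k holds layer k's offset on each face,
               NaN where layer k does not cover that face.
    new_order: sequence of layer indices, innermost first.
    """
    offsets = np.asarray(offsets, dtype=float)
    out = np.full_like(offsets, np.nan)
    for f in range(offsets.shape[1]):
        # Layers present on this face, listed in the requested order.
        present = [k for k in new_order if not np.isnan(offsets[k, f])]
        if not present:
            continue
        # Reuse the heights already on this face, smallest for the innermost layer.
        out[present, f] = np.sort(offsets[present, f])
    return out

# Layer 0 (e.g. a shirt) covers both faces; layer 1 (e.g. a jacket) covers face 0 only.
offsets = np.array([[0.01, 0.01],
                    [0.03, np.nan]])
swapped = reorder_layers(offsets, new_order=[1, 0])
```

In this toy model a swap never changes which faces a garment covers. It only changes which garment sits closer to the body where both are present.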
Where Pith is reading between the lines
- The same anchoring could supply clean initial geometry for physics-based cloth simulators that currently struggle with self-intersections.
- Extending the offset and barycentric binding to time-varying sequences might produce layered 4D avatars ready for animation without post-processing fixes.
- Because layers remain editable after reconstruction, the approach may reduce the manual cleanup now required for virtual try-on or character customization pipelines.
Load-bearing premise
Anchoring Gaussians to body faces with barycentric coordinates and a positive normal offset will by itself keep garment layers from intersecting and maintain correct topological order without any separate collision handling.
What would settle it
Reconstruct an avatar from multi-view images of a person wearing overlapping garments, reorder the layers in software, and check whether any rendered frame shows visible intersections or incorrect depth ordering between the layers.
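As a numerical companion to the visual check, the sketch below computes a simple proxy for penetration rate and depth from per-face layer offsets. It assumes the same toy data layout as a per-face offset table (NaN where a layer is absent). The benchmark's actual metric definitions are not given in this text, so `penetration_stats` is a hypothetical stand-in rather than the 4D-DRESS evaluation protocol.

```python
import itertools
import numpy as np

def penetration_stats(offsets, order, tol=0.0):
    """Return (rate, mean_depth) of ordering violations between layers.

    offsets: (L, F) per-face normal offsets, NaN where a layer is absent.
    order:   layer indices from innermost to outermost.
    A violation occurs where an inner layer sits farther out than an outer one.
    """
    offsets = np.asarray(offsets, dtype=float)
    violations = []
    # combinations() preserves the inner-to-outer order of each pair.
    for inner, outer in itertools.combinations(order, 2):
        both = ~np.isnan(offsets[inner]) & ~np.isnan(offsets[outer])
        diff = offsets[inner, both] - offsets[outer, both]
        violations.append(np.maximum(diff, 0.0))
    v = np.concatenate(violations) if violations else np.zeros(0)
    if v.size == 0:
        return 0.0, 0.0
    bad = v > tol
    rate = float(bad.mean())
    depth = float(v[bad].mean()) if bad.any() else 0.0
    return rate, depth

offsets = np.array([[0.01, 0.020, 0.01, 0.01],
                    [0.03, 0.015, 0.02, np.nan]])
rate, depth = penetration_stats(offsets, order=[0, 1])
```

Running the same function before and after a user reorder gives a quick signal of whether the edit introduced any ordering violations.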
Original abstract
Existing 3D clothed avatar reconstruction methods achieve high visual fidelity but ignore geometric structure and physical plausibility. They either model clothed humans as a single deformable surface or attempt garment disentanglement without enforcing geometric constraints, resulting in ambiguous garment boundaries and no control over stacking or layer ordering. To address these limitations, we introduce DAMA (Disentangled body-Anchored Gaussians for Controllable Multi-layered Avatars), a 3D avatar reconstruction method that produces physically plausible clothed avatars through a dedicated representation and reconstruction method. At the representation level, we bind Gaussians to SMPL-X faces using barycentric in-plane coordinates and a positive normal offset. Based on this parameterization, the reconstruction method lifts 2D segmentations to body-anchored Gaussians, refines layers using topology-guided correction, and jointly optimizes geometry and appearance. DAMA is the first Gaussian avatar reconstruction method from multi-view images to achieve physically plausible layering, clean garment separation, and explicit stacking control. On the full 4D-DRESS dataset (82 scans), it achieves state-of-the-art performance in geometry reconstruction, garment separation, penetration rate, and penetration depth. The representation further supports user-defined garment reordering and fast conversion of body-conforming garments to simulation-ready meshes. Project Page: https://danieleskandar.github.io/dama/
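The abstract does not say how 2D segmentations are lifted to the Gaussians. One common way to do this kind of lifting is a per-primitive majority vote over the views in which the primitive is visible. The sketch below illustrates that generic idea only; it assumes the per-view labels have already been sampled at each Gaussian's projected location, and `lift_labels` is a hypothetical name.

```python
import numpy as np

def lift_labels(view_labels, num_classes):
    """Assign each Gaussian the label it receives most often across views.

    view_labels: (V, N) integer array; entry [v, i] is the 2D segmentation
                 label seen for Gaussian i in view v, or -1 if not visible.
    Returns an (N,) array of labels, with -1 for Gaussians seen in no view.
    Ties are resolved in favor of the lowest class index (argmax behavior).
    """
    view_labels = np.asarray(view_labels)
    votes = np.zeros((view_labels.shape[1], num_classes), dtype=int)
    for labels in view_labels:
        visible = labels >= 0
        # Add one vote per visible Gaussian for the class seen in this view.
        np.add.at(votes, (np.nonzero(visible)[0], labels[visible]), 1)
    lifted = votes.argmax(axis=1)
    lifted[votes.sum(axis=1) == 0] = -1
    return lifted

view_labels = np.array([[0, 1, -1],
                        [0, 2, -1],
                        [1, 2, -1]])
lifted = lift_labels(view_labels, num_classes=3)
```

A vote of this kind can still mislabel Gaussians near garment boundaries, which is the sort of error a later correction step (such as the topology-guided correction the abstract mentions) would need to address.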
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces DAMA, a 3D avatar reconstruction method from multi-view images that produces controllable multi-layered clothed avatars. It defines a body-anchored Gaussian representation that binds each Gaussian to SMPL-X faces via barycentric in-plane coordinates plus a positive normal offset. The pipeline lifts 2D segmentations to these Gaussians, applies topology-guided correction for layer refinement, and performs joint optimization of geometry and appearance. The work claims to be the first Gaussian-based method to deliver physically plausible layering, clean garment separation, and explicit stacking control, reporting SOTA results on the full 4D-DRESS dataset (82 scans) for geometry reconstruction, garment separation, penetration rate, and penetration depth, plus downstream support for user-defined reordering and conversion to simulation-ready meshes.
Significance. If the binding parameterization and refinement steps reliably enforce non-penetrating, topologically ordered layers, the contribution would be significant for Gaussian avatar modeling. It moves beyond single-surface or unconstrained disentanglement approaches by embedding geometric structure and layer ordering directly into the representation, enabling practical applications such as garment reordering and mesh export for simulation.
major comments (2)
- [Abstract] Abstract (representation level): The central claim that the body-anchored Gaussian parameterization achieves physically plausible layering 'by construction' depends on binding via barycentric in-plane coordinates and positive normal offset from SMPL-X. No explicit collision, repulsion, or signed-distance loss term is referenced in the optimization; the topology-guided correction is described only as a refinement step. This leaves open whether inter-layer penetrations are prevented during joint optimization or only mitigated afterward, particularly for loose clothing or high-curvature regions.
- [Abstract] Abstract (results): The SOTA claims on geometry, separation, penetration rate, and depth are reported on the full 4D-DRESS dataset, yet the abstract provides no quantitative values, baseline comparisons, or ablation references. Without these details or a table citation, it is difficult to assess whether the reported low penetration metrics stem from the representation itself or from dataset-specific properties.
minor comments (1)
- [Abstract] The abstract states that the method 'supports user-defined garment reordering and fast conversion of body-conforming garments to simulation-ready meshes,' but does not indicate the computational cost or quality of the mesh conversion step; a brief complexity or timing reference would clarify the practical utility.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each major comment below in detail and indicate where revisions will be made to improve clarity without misrepresenting the work.
- Referee: [Abstract] Abstract (representation level): The central claim that the body-anchored Gaussian parameterization achieves physically plausible layering 'by construction' depends on binding via barycentric in-plane coordinates and positive normal offset from SMPL-X. No explicit collision, repulsion, or signed-distance loss term is referenced in the optimization; the topology-guided correction is described only as a refinement step. This leaves open whether inter-layer penetrations are prevented during joint optimization or only mitigated afterward, particularly for loose clothing or high-curvature regions.
  Authors: The body-anchored representation initializes and constrains each Gaussian via barycentric in-plane coordinates on SMPL-X faces together with a strictly positive normal offset. This design places all Gaussians outside the body surface by construction and prevents body penetration throughout optimization. For garment layers, the topology-guided correction is applied after lifting 2D segmentations and explicitly reorders Gaussians according to the underlying SMPL-X topology before and during joint optimization; this ordering is preserved because subsequent gradient updates operate on the already-corrected layer assignments. While no explicit repulsion or signed-distance loss is added to the objective, the combination of the parameterization and the topology-guided step is what produces the observed low penetration rates. We will revise the abstract to more precisely distinguish the representation-level constraints from the refinement procedure.
  Revision: partial
- Referee: [Abstract] Abstract (results): The SOTA claims on geometry, separation, penetration rate, and depth are reported on the full 4D-DRESS dataset, yet the abstract provides no quantitative values, baseline comparisons, or ablation references. Without these details or a table citation, it is difficult to assess whether the reported low penetration metrics stem from the representation itself or from dataset-specific properties.
  Authors: We agree that the abstract would benefit from explicit numerical support for the SOTA claims. In the revised manuscript we will insert the key quantitative results (e.g., penetration rate and depth on the full 82-scan 4D-DRESS set) together with a direct citation to the corresponding table that compares against baselines. This addition will allow readers to immediately evaluate the magnitude of the improvements.
  Revision: yes
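The first exchange above turns on what "by construction" can guarantee. For illustration only, the sketch below shows one standard way to make positivity and layer ordering hold for any value of the optimized variables: pass unconstrained parameters through a softplus and accumulate the results across layers. This reparameterization is an assumption introduced here; the rebuttal does not say the authors use it, and it describes ordering as being maintained through corrected layer assignments instead.

```python
import numpy as np

def softplus(x):
    """Numerically stable log(1 + exp(x)), always strictly positive."""
    return np.logaddexp(0.0, x)

def layer_offsets(raw, min_gap=1e-4):
    """Map unconstrained parameters to ordered, positive normal offsets.

    raw:     (L, F) unconstrained values, one per layer and body face.
    min_gap: hypothetical minimum spacing between consecutive layers.
    Returns (L, F) offsets that are > 0 and strictly increasing along axis 0.
    """
    increments = softplus(np.asarray(raw, dtype=float)) + min_gap
    # Layer k's offset is the sum of the increments of layers 0..k.
    return np.cumsum(increments, axis=0)

raw = np.array([[-5.0, 0.0],
                [3.0, -2.0],
                [0.0, 0.0]])
off = layer_offsets(raw)
```

With a mapping like this, no gradient step on `raw` can produce a non-positive offset or swap two layers on a face, which is the property the referee asks about. It does not address penetration between Gaussians anchored to different faces in high-curvature regions.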
Circularity Check
No circularity: DAMA parameterization is an explicit design choice evaluated on external data.
The paper defines its core representation directly as binding Gaussians to SMPL-X faces via barycentric in-plane coordinates plus positive normal offset, then lifts 2D segmentations and performs joint optimization with topology-guided correction. No equations reduce the claimed physically plausible layering or garment separation to a fitted parameter optimized on the target result, nor to a self-citation chain or imported uniqueness theorem. Performance metrics on the full 4D-DRESS dataset (82 scans) are reported independently, making the derivation self-contained against external benchmarks rather than tautological.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: SMPL-X provides a topologically consistent mesh suitable for barycentric anchoring of Gaussians.
invented entities (1)
- Body-anchored Gaussians with barycentric in-plane coordinates and positive normal offset
independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · tag: unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "bind Gaussians to SMPL-X faces using barycentric in-plane coordinates and a positive normal offset... prevents interpenetration with the body and lower layers"
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "DAMA is the first Gaussian avatar reconstruction method... physically plausible layering, clean garment separation"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Layered-garment net: Generating multiple implicit garment layers from a single image
Alakh Aggarwal, Jikai Wang, Steven Hogue, Saifeng Ni, Madhukar Budagavi, and Xiaohu Guo. Layered-garment net: Generating multiple implicit garment layers from a single image. InProceedings of the Asian Conference on Computer Vision (ACCV), 2022. 3
work page 2022
-
[2]
Video based reconstruction of 3d people models
Thiemo Alldieck, Marcus Magnor, Weipeng Xu, Christian Theobalt, and Gerard Pons-Moll. Video based reconstruction of 3d people models. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018. 2
work page 2018
-
[3]
imghum: Implicit generative models of 3d human shape and articulated pose
Thiemo Alldieck, Hongyi Xu, and Cristian Sminchisescu. imghum: Implicit generative models of 3d human shape and articulated pose. InProceedings of the IEEE/CVF Interna- tional Conference on Computer Vision (ICCV), pages 5461– 5470, 2021. 2
work page 2021
-
[4]
Close: A 3d clothing segmentation dataset and model
Dimitrije Anti ´c, Garvita Tiwari, Batuhan Ozcomlekci, Ric- cardo Marin, and Gerard Pons-Moll. Close: A 3d clothing segmentation dataset and model. In2024 international con- ference on 3D vision (3DV), pages 591–601. IEEE, 2024. 2
work page 2024
-
[5]
Multi-garment net: Learning to dress 3d people from images
Bharat Lal Bhatnagar, Garvita Tiwari, Christian Theobalt, and Gerard Pons-Moll. Multi-garment net: Learning to dress 3d people from images. InProceedings of the IEEE/CVF international conference on computer vision, pages 5420– 5430, 2019. 2, 3
work page 2019
-
[6]
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets
Andreas Blattmann, Tim Dockhorn, Sumith Kulal, Daniel Mendelevitch, Maciej Kilian, Dominik Lorenz, Yam Levi, Zion English, Vikram V oleti, Adam Letts, et al. Stable video diffusion: Scaling latent video diffusion models to large datasets.arXiv preprint arXiv:2311.15127, 2023. 6
work page internal anchor Pith review Pith/arXiv arXiv 2023
-
[7]
Haodong Chen, Yongle Huang, Haojian Huang, Xiangsheng Ge, and Dian Shao. Gaussianvton: 3d human virtual try- on via multi-stage gaussian splatting editing with image prompting.arXiv preprint arXiv:2405.07472, 2024. 3
-
[8]
Gaussian wardrobe: Composi- tional 3d gaussian avatars for free-form virtual try-on
Zhiyi Chen, Hsuan-I Ho, Tianjian Jiang, Jie Song, Manuel Kaufmann, and Chen Guo. Gaussian wardrobe: Composi- tional 3d gaussian avatars for free-form virtual try-on. In Proceedings of the International Conference on 3D Vision (3DV), 2026. 2, 3, 5
work page 2026
-
[9]
CLO Virtual Fashion, Seoul, South Korea, 2026
CLO Virtual Fashion.CLO3D (Version 2025.2.368). CLO Virtual Fashion, Seoul, South Korea, 2026. Updated March 19, 2026. 8, 16
work page 2025
-
[10]
Smplicit: Topology-aware generative model for clothed people
Enric Corona, Albert Pumarola, Guillem Alenya, Ger- ard Pons-Moll, and Francesc Moreno-Noguer. Smplicit: Topology-aware generative model for clothed people. In Proceedings of the IEEE/CVF Conference on Computer Vi- sion and Pattern Recognition (CVPR), pages 11875–11885,
-
[11]
Drapenet: Garment generation and self-supervised draping
Luca De Luigi, Ren Li, Beno ˆıt Guillard, Mathieu Salz- mann, and Pascal Fua. Drapenet: Garment generation and self-supervised draping. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 1451–1460, 2023. 3
work page 2023
-
[12]
Tela: Text to layer-wise 3d clothed human generation
Junting Dong, Qi Fang, Zehuan Huang, Xudong Xu, Jingbo Wang, Sida Peng, and Bo Dai. Tela: Text to layer-wise 3d clothed human generation. InComputer Vision – ECCV 2024, pages 19–36, Cham, 2025. Springer Nature Switzer- land. 3
work page 2024
-
[13]
Zijian Dong, Longteng Duan, Jie Song, Michael J. Black, and Andreas Geiger. Moga: 3d generative avatar prior for monocular gaussian avatar reconstruction. InInternational Conference on Computer Vision (ICCV), 2025. 3
work page 2025
-
[14]
Capturing and animation of body and clothing from monocular video
Yao Feng, Jinlong Yang, Marc Pollefeys, Michael J Black, and Timo Bolkart. Capturing and animation of body and clothing from monocular video. InSIGGRAPH Asia 2022 Conference Papers, pages 1–9, 2022. 1, 2
work page 2022
-
[15]
Learning disentangled avatars with hybrid 3d representations.arXiv preprint arXiv:2309.06441, 2023
Yao Feng, Weiyang Liu, Timo Bolkart, Jinlong Yang, Marc Pollefeys, and Michael J Black. Learning disentangled avatars with hybrid 3d representations.arXiv preprint arXiv:2309.06441, 2023. 1, 2
-
[16]
Hood: Hierarchical graphs for generalized modelling of clothing dynamics
Artur Grigorev, Michael J Black, and Otmar Hilliges. Hood: Hierarchical graphs for generalized modelling of clothing dynamics. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16965– 16974, 2023. 3
work page 2023
-
[17]
ContourCraft: Learning to resolve intersections in neural multi-garment simulations
Artur Grigorev, Giorgio Becherini, Michael Black, Ot- mar Hilliges, and Bernhard Thomaszewski. ContourCraft: Learning to resolve intersections in neural multi-garment simulations. InACM SIGGRAPH 2024 Conference Papers, pages 1–10, 2024. 3
work page 2024
-
[18]
Vid2avatar: 3d avatar reconstruction from videos in the wild via self-supervised scene decomposition
Chen Guo, Tianjian Jiang, Xu Chen, Jie Song, and Otmar Hilliges. Vid2avatar: 3d avatar reconstruction from videos in the wild via self-supervised scene decomposition. InPro- ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12858–12868, 2023. 2
work page 2023
-
[19]
Reloo: Reconstructing humans dressed in loose garments from monocular video in the wild
Chen Guo, Tianjian Jiang, Manuel Kaufmann, Chengwei Zheng, Julien Valentin, Jie Song, and Otmar Hilliges. Reloo: Reconstructing humans dressed in loose garments from monocular video in the wild. InEuropean conference on computer vision, pages 21–38. Springer, 2024. 1, 2, 3
work page 2024
-
[20]
Vid2avatar-pro: Authentic avatar from videos in the wild via universal prior
Chen Guo, Junxuan Li, Yash Kant, Yaser Sheikh, Shunsuke Saito, and Chen Cao. Vid2avatar-pro: Authentic avatar from videos in the wild via universal prior. InProceedings of the Computer Vision and Pattern Recognition Conference, pages 5559–5570, 2025. 2
work page 2025
-
[21]
Pgc: Physics-based gaussian cloth from a single pose
Michelle Guo, Matt Jen-Yuan Chiang, Igor Santesteban, Nikolaos Sarafianos, Hsiao-yu Chen, Oshri Halimi, Alja ˇz 9 Boˇziˇc, Shunsuke Saito, Jiajun Wu, C Karen Liu, et al. Pgc: Physics-based gaussian cloth from a single pose. InProceed- ings of the Computer Vision and Pattern Recognition Confer- ence, pages 21215–21225, 2025. 3
work page 2025
-
[22]
Livecap: Real-time human performance capture from monocular video.ACM Trans
Marc Habermann, Weipeng Xu, Michael Zollh ¨ofer, Gerard Pons-Moll, and Christian Theobalt. Livecap: Real-time human performance capture from monocular video.ACM Trans. Graph., 38(2), 2019. 2
work page 2019
-
[23]
Deepcap: Monocular human performance capture using weak supervision
Marc Habermann, Weipeng Xu, Michael Zollhofer, Gerard Pons-Moll, and Christian Theobalt. Deepcap: Monocular human performance capture using weak supervision. InPro- ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020. 2
work page 2020
-
[24]
Xintong Han, Zuxuan Wu, Zhe Wu, Ruichi Yu, and Larry S. Davis. Viton: An image-based virtual try-on network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018. 3
work page 2018
-
[25]
Vton 360: High-fidelity virtual try-on from any viewing direction
Zijian He, Yuwei Ning, Yipeng Qin, Guangrun Wang, Sibei Yang, Liang Lin, and Guanbin Li. Vton 360: High-fidelity virtual try-on from any viewing direction. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 26388–26398, 2025. 1
work page 2025
-
[26]
Learn- ing locally editable virtual humans
Hsuan-I Ho, Lixin Xue, Jie Song, and Otmar Hilliges. Learn- ing locally editable virtual humans. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 21024–21035, 2023. 3
work page 2023
-
[27]
Neural-abc: Neural parametric models for articulated body with clothes
Chen Honghu, Yao Yuxin, and Juyong Zhang. Neural-abc: Neural parametric models for articulated body with clothes. IEEE Transactions on Visualization and Computer Graphics,
-
[28]
Liangxiao Hu, Hongwen Zhang, Yuxiang Zhang, Boyao Zhou, Boning Liu, Shengping Zhang, and Liqiang Nie. Gaussianavatar: Towards realistic human avatar model- ing from a single video via animatable 3d gaussians. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. 1, 2
work page 2024
-
[29]
Gauhuman: Articu- lated gaussian splatting from monocular human videos
Shoukang Hu, Tao Hu, and Ziwei Liu. Gauhuman: Articu- lated gaussian splatting from monocular human videos. In Proceedings of the IEEE/CVF conference on computer vi- sion and pattern recognition, pages 20418–20431, 2024. 1, 2
work page 2024
-
[30]
Humanliff: Layer-wise 3d human diffusion model: Humanliff: Layer- wise 3d human diffusion model.Int
Shoukang Hu, Fangzhou Hong, Tao Hu, Liang Pan, Haiyi Mei, Weiye Xiao, Lei Yang, and Ziwei Liu. Humanliff: Layer-wise 3d human diffusion model: Humanliff: Layer- wise 3d human diffusion model.Int. J. Comput. Vision, 133 (9):5938–5957, 2025. 3
work page 2025
-
[31]
2d gaussian splatting for geometrically accu- rate radiance fields
Binbin Huang, Zehao Yu, Anpei Chen, Andreas Geiger, and Shenghua Gao. 2d gaussian splatting for geometrically accu- rate radiance fields. InSIGGRAPH 2024 Conference Papers. Association for Computing Machinery, 2024. 1, 2, 3, 5, 6, 14
work page 2024
-
[32]
Sith: Single- view textured human reconstruction with image-conditioned diffusion
Hsuan I Ho, Jie Song, and Otmar Hilliges. Sith: Single- view textured human reconstruction with image-conditioned diffusion. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 538–549, 2024. 3
work page 2024
-
[33]
Bcnet: Learning body and cloth shape from a single image
Boyi Jiang, Juyong Zhang, Yang Hong, Jinhao Luo, Ligang Liu, and Hujun Bao. Bcnet: Learning body and cloth shape from a single image. InEuropean Conference on Computer Vision, pages 18–35. Springer, 2020. 2, 3
work page 2020
-
[34]
In- stantavatar: Learning avatars from monocular video in 60 seconds
Tianjian Jiang, Xu Chen, Jie Song, and Otmar Hilliges. In- stantavatar: Learning avatars from monocular video in 60 seconds. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16922– 16932, 2023. 2, 3
work page 2023
-
[35]
Prioravatar: Efficient and robust avatar creation from monocular video using learned priors
Tianjian Jiang, Hsuan-I Ho, Manuel Kaufmann, and Jie Song. Prioravatar: Efficient and robust avatar creation from monocular video using learned priors. InProceedings of the SIGGRAPH Asia 2025 Conference Papers, pages 1–10,
work page 2025
-
[36]
Neuman: Neural human radiance field from a single video
Wei Jiang, Kwang Moo Yi, Golnoosh Samei, Oncel Tuzel, and Anurag Ranjan. Neuman: Neural human radiance field from a single video. InComputer Vision – ECCV 2022, pages 402–418, Cham, 2022. Springer Nature Switzerland. 2, 3
work page 2022
-
[37]
Total cap- ture: A 3d deformation model for tracking faces, hands, and bodies
Hanbyul Joo, Tomas Simon, and Yaser Sheikh. Total cap- ture: A 3d deformation model for tracking faces, hands, and bodies. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018. 2
work page 2018
-
[38]
Physhead: Simulation- ready gaussian head avatars, 2026
Berna Kabadayi, Vanessa Sklyarova, Wojciech Zielonka, Justus Thies, and Gerard Pons-Moll. Physhead: Simulation- ready gaussian head avatars, 2026. 3
work page 2026
-
[39]
Bernhard Kerbl, Georgios Kopanas, Thomas Leimk ¨uhler, and George Drettakis. 3d gaussian splatting for real-time radiance field rendering.ACM Transactions on Graphics, 42 (4), 2023. 1, 2, 3
work page 2023
-
[40]
Gala: Generating animatable layered assets from a sin- gle scan
Taeksoo Kim, Byungjun Kim, Shunsuke Saito, and Hanbyul Joo. Gala: Generating animatable layered assets from a sin- gle scan. InCVPR, 2024. 2, 3, 5, 6, 7
work page 2024
-
[41]
Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer White- head, Alexander C Berg, Wan-Yen Lo, et al. Segment any- thing. InProceedings of the IEEE/CVF international confer- ence on computer vision, pages 4015–4026, 2023. 6
work page 2023
-
[42]
Muhammed Kocabas, Jen-Hao Rick Chang, James Gabriel, Oncel Tuzel, and Anurag Ranjan. Hugs: Human gaussian splats. InProceedings of the IEEE/CVF conference on com- puter vision and pattern recognition, pages 505–515, 2024. 1, 2
work page 2024
-
[43]
Gart: Gaussian articulated template mod- els
Jiahui Lei, Yufu Wang, Georgios Pavlakos, Lingjie Liu, and Kostas Daniilidis. Gart: Gaussian articulated template mod- els. InProceedings of the IEEE/CVF conference on com- puter vision and pattern recognition, pages 19876–19887,
-
[44]
Dig: Draping implicit garment over the human body
Ren Li, Benoit Guillard, Edoardo Remelli, and Pascal Fua. Dig: Draping implicit garment over the human body. In Proceedings of the Asian Conference on Computer Vision (ACCV), pages 2780–2795, 2022. 3
work page 2022
-
[45]
Tianye Li, Timo Bolkart, Michael. J. Black, Hao Li, and Javier Romero. Learning a model of facial shape and ex- pression from 4D scans.ACM Transactions on Graphics, (Proc. SIGGRAPH Asia), 36(6):194:1–194:17, 2017. 3
work page 2017
-
[46]
Diffavatar: Simulation-ready garment optimization with differentiable simulation
Yifei Li, Hsiao-yu Chen, Egor Larionov, Nikolaos Sarafi- anos, Wojciech Matusik, and Tuur Stuyck. Diffavatar: Simulation-ready garment optimization with differentiable simulation. InProceedings of the IEEE/CVF Conference 10 on Computer Vision and Pattern Recognition (CVPR), pages 4368–4378, 2024. 3
work page 2024
-
[47]
Zhe Li, Zerong Zheng, Lizhen Wang, and Yebin Liu. Ani- matable gaussians: Learning pose-dependent gaussian maps for high-fidelity human avatar modeling. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 19711–19722, 2024. 1, 2, 3
work page 2024
-
[48]
Tingting Liao, Hongwei Yi, Yuliang Xiu, Jiaxiang Tang, Yangyi Huang, Justus Thies, and Michael J. Black. TADA! Text to Animatable Digital Avatars. InInternational Confer- ence on 3D Vision (3DV), 2024. 3
work page 2024
-
[49]
Layga: Layered gaussian avatars for animatable clothing transfer
Siyou Lin, Zhe Li, Zhaoqi Su, Zerong Zheng, Hongwen Zhang, and Yebin Liu. Layga: Layered gaussian avatars for animatable clothing transfer. InSIGGRAPH Conference Pa- pers, 2024. 2, 3, 5
work page 2024
-
[50]
Gas: Generative avatar syn- thesis from a single image
Yixing Lu, Junting Dong, Youngjoong Kwon, Qin Zhao, Bo Dai, and Fernando De la Torre. Gas: Generative avatar syn- thesis from a single image. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 12883– 12893, 2025. 3
work page 2025
-
[51]
Troje, Ger- ard Pons-Moll, and Michael J
Naureen Mahmood, Nima Ghorbani, Nikolaus F. Troje, Ger- ard Pons-Moll, and Michael J. Black. AMASS: Archive of motion capture as surface shapes. InInternational Confer- ence on Computer Vision, pages 5442–5451, 2019. 8, 16
work page 2019
-
[52]
Occupancy networks: Learning 3d reconstruction in function space
Lars Mescheder, Michael Oechsle, Michael Niemeyer, Se- bastian Nowozin, and Andreas Geiger. Occupancy networks: Learning 3d reconstruction in function space. InProceedings of the IEEE/CVF Conference on Computer Vision and Pat- tern Recognition (CVPR), 2019. 2
work page 2019
-
[53]
Srinivasan, Matthew Tancik, Jonathan T
Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: Representing scenes as neural radiance fields for view syn- thesis. InECCV, 2020. 1, 2
work page 2020
-
[54]
3d clothed human reconstruction in the wild
Gyeongsik Moon, Hyeongjin Nam, Takaaki Shiratori, and Kyoung Mu Lee. 3d clothed human reconstruction in the wild. InEuropean conference on computer vision, pages 184–200. Springer, 2022. 3
work page 2022
-
[55]
Expressive whole-body 3d gaussian avatar
Gyeongsik Moon, Takaaki Shiratori, and Shunsuke Saito. Expressive whole-body 3d gaussian avatar. InEuropean Conference on Computer Vision, pages 19–35. Springer,
-
[56]
Thomas M ¨uller, Alex Evans, Christoph Schied, and Alexan- der Keller. Instant neural graphics primitives with a mul- tiresolution hash encoding.ACM transactions on graphics (TOG), 41(4):1–15, 2022. 1, 2
work page 2022
-
[57]
Disco4d: Disentangled 4d human generation and animation from a single image
Hui En Pang, Shuai Liu, Zhongang Cai, Lei Yang, Tianwei Zhang, and Ziwei Liu. Disco4d: Disentangled 4d human generation and animation from a single image. InCVPR,
-
[58]
Deepsdf: Learning con- tinuous signed distance functions for shape representation
Jeong Joon Park, Peter Florence, Julian Straub, Richard Newcombe, and Steven Lovegrove. Deepsdf: Learning con- tinuous signed distance functions for shape representation. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019. 2
work page 2019
-
[59]
Georgios Pavlakos, Vasileios Choutas, Nima Ghorbani, Timo Bolkart, Ahmed A. A. Osman, Dimitrios Tzionas, and Michael J. Black. Expressive body capture: 3D hands, face, and body from a single image. InProceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages 10975–10985, 2019. 1, 3
work page 2019
-
[60]
Pica: Physics-integrated clothed avatar.arXiv preprint arXiv:2407.05324, 2024
Bo Peng, Yunfan Tao, Haoyu Zhan, Yudong Guo, and Juy- ong Zhang. Pica: Physics-integrated clothed avatar.arXiv preprint arXiv:2407.05324, 2024. 3
-
[61]
Ani- matable neural radiance fields for modeling dynamic human bodies
Sida Peng, Junting Dong, Qianqian Wang, Shangzhan Zhang, Qing Shuai, Xiaowei Zhou, and Hujun Bao. Ani- matable neural radiance fields for modeling dynamic human bodies. InProceedings of the IEEE/CVF international con- ference on computer vision, pages 14314–14323, 2021. 2, 3
work page 2021
-
[62]
Sida Peng, Yuanqing Zhang, Yinghao Xu, Qianqian Wang, Qing Shuai, Hujun Bao, and Xiaowei Zhou. Neural body: Implicit neural representations with structured latent codes for novel view synthesis of dynamic humans. InCVPR,
-
[63]
Sida Peng, Chen Geng, Yuanqing Zhang, Yinghao Xu, Qian- qian Wang, Qing Shuai, Xiaowei Zhou, and Hujun Bao. Im- plicit neural representations with structured latent codes for human body modeling.IEEE Transactions on Pattern Anal- ysis and Machine Intelligence, 2023. 1, 2, 3
work page 2023
-
[64]
Gerard Pons-Moll, Sergi Pujades, Sonny Hu, and Michael J Black. Clothcap: Seamless 4d clothing capture and retar- geting.ACM Transactions on Graphics (ToG), 36(4):1–15,
-
[65]
Gaus- sianavatars: Photorealistic head avatars with rigged 3d gaus- sians
Shenhan Qian, Tobias Kirschstein, Liam Schoneveld, Davide Davoli, Simon Giebenhain, and Matthias Nießner. Gaus- sianavatars: Photorealistic head avatars with rigged 3d gaus- sians. InProceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition, pages 20299–20309,
-
[66]
3dgs-avatar: Animatable avatars via deformable 3d gaussian splatting
Zhiyin Qian, Shaofei Wang, Marko Mihajlovic, Andreas Geiger, and Siyu Tang. 3dgs-avatar: Animatable avatars via deformable 3d gaussian splatting. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5020–5030, 2024. 1, 2
work page 2024
-
[67]
Boxiang Rong, Artur Grigorev, Wenbo Wang, Michael J. Black, Bernhard Thomaszewski, Christina Tsalicoglou, and Otmar Hilliges. Gaussian Garments: Reconstructing simulation-ready clothing with photorealistic appearance from multi-view video. In International Conference on 3D Vision 2025, 2025. 3
-
[68]
Shunsuke Saito, Zeng Huang, Ryota Natsume, Shigeo Morishima, Angjoo Kanazawa, and Hao Li. Pifu: Pixel-aligned implicit function for high-resolution clothed human digitization. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019. 3
-
[69]
Shunsuke Saito, Gabriel Schwartz, Tomas Simon, Junxuan Li, and Giljoo Nam. Relightable gaussian codec avatars. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 130–141, 2024. 3
-
[70]
Igor Santesteban, Miguel A. Otaduy, and Dan Casas. Snug: Self-supervised neural dynamic garments. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 8140–8150, 2022. 3
-
[71]
Akash Sengupta, Thiemo Alldieck, Nikos Kolotouros, Enric Corona, Andrei Zanfir, and Cristian Sminchisescu. DiffHuman: Probabilistic Photorealistic 3D Reconstruction of Humans. In CVPR, 2024. 3
-
[72]
Zhijing Shao, Zhaolong Wang, Zhuang Li, Duotun Wang, Xiangru Lin, Yu Zhang, Mingming Fan, and Zeyu Wang. SplattingAvatar: Realistic Real-Time Human Avatars with Mesh-Embedded Gaussian Splatting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. 2, 3
-
[73]
Kaiyue Shen, Chen Guo, Manuel Kaufmann, Juan Jose Zarate, Julien Valentin, Jie Song, and Otmar Hilliges. X-avatar: Expressive human avatars. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 16911–16921, 2023. 3
-
[74]
Zhaoqi Su, Liangxiao Hu, Siyou Lin, Hongwen Zhang, Shengping Zhang, Justus Thies, and Yebin Liu. Caphy: Capturing physical properties for animatable human avatars. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 14150–14160, 2023. 3
-
[75]
Ke Sun, Jian Cao, Qi Wang, Linrui Tian, Xindi Zhang, Lian Zhuo, Bang Zhang, Liefeng Bo, Wenbo Zhou, Weiming Zhang, and Daiheng Gao. Outfitanyone: Ultra-high quality virtual try-on for any clothing and any person. arXiv preprint arXiv:2407.16224, 2024. 3
-
[76]
Keito Suzuki, Bang Du, Girish Krishnan, Kunyao Chen, Runfa Blark Li, and Truong Nguyen. Open-vocabulary semantic part segmentation of 3d human. In 2025 International Conference on 3D Vision (3DV), pages 1572–1582. IEEE,
-
[77]
Jeff Tan, Donglai Xiang, Shubham Tulsiani, Deva Ramanan, and Gengshan Yang. Dressrecon: Freeform 4d human reconstruction from monocular video. In 2025 International Conference on 3D Vision (3DV), pages 250–260. IEEE, 2025. 2
-
[78]
Garvita Tiwari, Bharat Lal Bhatnagar, Tony Tung, and Gerard Pons-Moll. Sizer: A dataset and model for parsing 3d clothing and learning size sensitive 3d clothing. In European Conference on Computer Vision, pages 1–18. Springer, 2020. 2, 3
-
[79]
Onat Vuran and Hsuan-I Ho. Remu: Reconstructing multi-layer 3d clothed human from images. In British Machine Vision Conference (BMVC), 2025. 3
-
[80]
Jionghao Wang, Yuan Liu, Zhiyang Dou, Zhengming Yu, Yongqing Liang, Cheng Lin, Rong Xie, Li Song, Xin Li, and Wenping Wang. Disentangled clothed avatar generation from text descriptions. In European Conference on Computer Vision, pages 381–401. Springer, 2024. 3