AnyAct: Towards Human Reenactment of Character Motion From Video
Pith reviewed 2026-05-20 20:04 UTC · model grok-4.3
The pith
Sparse local 2D motion cues from character videos can generate plausible human reenactments without 3D source models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
AnyAct formulates character-video-driven human reenactment as conditional human motion generation from transferable sparse local 2D articulated motion. It achieves this through human-motion-only supervision via augmented 3D-to-2D projection, progressive 3D-to-2D training to reduce conditioning ambiguity, and global-local motion decoupling for reliable local control, producing high-fidelity reenactments on a new benchmark of diverse non-human character videos.
What carries the argument
Sparse local 2D articulated motion cues extracted from the character video, used as the conditioning signal for generating human motion sequences.
If this is right
- Initial human reenactments become possible directly from monocular character videos without requiring 3D source data or known topologies.
- The generated motions preserve essential dynamics of the reference character while remaining editable for further animation work.
- A benchmark of diverse non-human videos can be used to measure how well local cues survive large structural differences.
- Global-local decoupling allows separate control over overall pose and fine-grained limb actions in the output human sequence.
Where Pith is reading between the lines
- If local cues alone suffice, then many existing motion retargeting pipelines that rely on full skeletal correspondence may be over-specified for the initial transfer step.
- Progressive training from coarse to fine 2D projections could be tested on other video-to-motion domains such as sign language or dance to see whether ambiguity reduction generalizes.
- The separation of global and local signals suggests a natural next step of feeding the same local cues into physics-based simulators for contact-rich actions.
Load-bearing premise
Training solely on human motion data with projected 2D cues and staged learning will transfer to arbitrary non-human character shapes without losing essential timing or adding unnatural artifacts.
What would settle it
Generate human motions from a video of an octopus or highly asymmetric creature and check whether the output human sequence still matches the original action timings and energy profile or instead collapses into generic human walking.
Figures
read the original abstract
We study the problem of directly deriving an initial human reenactment from a monocular video of a non-human character. Our goal is not to reconstruct the source character itself but to reinterpret its motion as a plausible and editable human performance for downstream animation authoring. This task is challenging because existing video-based motion capture methods are largely restricted to human-centric structural spaces, while motion retargeting methods typically require structured 3D source motions and known source topologies. Our key insight is that sparse local articulated motion cues can preserve essential dynamics across large structural differences, providing a stable bridge from character video to human reenactment. Based on this observation, we propose AnyAct, which formulates character-video-driven human reenactment as conditional human motion generation from transferable sparse local 2D articulated motion. To make this practical, we introduce three key designs: human-motion-only supervision via augmented 3D-to-2D projection, progressive 3D-to-2D training to alleviate conditioning ambiguity, and global-local motion decoupling for reliable local motion control. We further construct a benchmark primarily covering diverse non-human character videos. Experiments on the benchmark show that AnyAct produces high-fidelity initial human reenactments that preserve the essential dynamics of the characters in reference videos, and further ablation studies validate the effectiveness of its core designs.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that sparse local articulated 2D motion cues extracted from monocular non-human character videos can serve as a topology-agnostic bridge to generate plausible, editable human reenactments. AnyAct achieves this via human-motion-only supervision through augmented 3D-to-2D projection, progressive training to reduce conditioning ambiguity, and global-local motion decoupling; experiments on a new benchmark of diverse character videos reportedly yield high-fidelity results that preserve essential dynamics.
Significance. If the central generalization claim holds, the work would offer a practical advance for animation authoring pipelines by removing the need for 3D source reconstructions or known topologies, potentially enabling direct video-to-human-motion transfer for arbitrary characters. The emphasis on local 2D cues and progressive training is a concrete technical contribution worth testing.
major comments (2)
- [Abstract / §3] Abstract and §3 (method overview): the central claim that 'sparse local articulated motion cues can preserve essential dynamics across large structural differences' rests on human-motion-only supervision via 3D-to-2D projection, yet no explicit mechanism is described for resolving ambiguous limb correspondences (e.g., mapping a quadruped leg to a human arm or leg) or for penalizing loss of 3D depth/self-occlusion information discarded by the 2D projection; this directly affects whether the approach generalizes beyond human-like topologies.
- [§4] §4 (experiments) and benchmark description: positive benchmark results and ablation studies are reported, but without quantitative tables, error bars, or per-character topology breakdowns (e.g., results on quadrupeds vs. bipeds with non-rigid parts), it is impossible to verify whether the progressive schedule truly prevents plausible-but-incorrect human motions for out-of-distribution characters or merely interpolates within the human distribution.
minor comments (2)
- [§3.1] Notation for 'transferable sparse local 2D articulated motion' is introduced without a clear formal definition or diagram showing how 2D keypoints are extracted and conditioned on the human motion generator.
- [§4.1] The benchmark construction paragraph should include explicit statistics on character diversity (number of topologies, video lengths, motion complexity) to allow readers to assess coverage of the claimed 'large structural differences'.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each major comment point by point below, providing clarifications and committing to revisions that strengthen the presentation of our method and results without altering the core claims.
read point-by-point responses
-
Referee: [Abstract / §3] Abstract and §3 (method overview): the central claim that 'sparse local articulated motion cues can preserve essential dynamics across large structural differences' rests on human-motion-only supervision via 3D-to-2D projection, yet no explicit mechanism is described for resolving ambiguous limb correspondences (e.g., mapping a quadruped leg to a human arm or leg) or for penalizing loss of 3D depth/self-occlusion information discarded by the 2D projection; this directly affects whether the approach generalizes beyond human-like topologies.
Authors: We agree that an explicit limb correspondence mechanism is not described because our design intentionally avoids it: the sparse local 2D articulated cues are topology-agnostic by construction, encoding only local joint velocities and angles that transfer across structures without requiring global alignment or predefined mappings. Human-motion-only supervision via augmented 3D-to-2D projection trains the model to produce plausible outputs conditioned on these cues, while progressive training gradually increases conditioning complexity to reduce ambiguity. Global-local decoupling further isolates local dynamics from global pose, allowing the model to focus on transferable motion patterns. We acknowledge that the manuscript could better articulate how depth and self-occlusion information loss is mitigated implicitly through learned human priors rather than explicit penalties. We will revise §3 to include a dedicated paragraph explaining these aspects and their relation to generalization. revision: yes
-
Referee: [§4] §4 (experiments) and benchmark description: positive benchmark results and ablation studies are reported, but without quantitative tables, error bars, or per-character topology breakdowns (e.g., results on quadrupeds vs. bipeds with non-rigid parts), it is impossible to verify whether the progressive schedule truly prevents plausible-but-incorrect human motions for out-of-distribution characters or merely interpolates within the human distribution.
Authors: We thank the referee for highlighting this presentation issue. The manuscript reports quantitative metrics and ablation studies in §4, but we recognize that the current tables lack error bars and explicit per-topology breakdowns. To address this, we will expand the experimental section with a new table providing mean and standard deviation across multiple runs, plus a breakdown of results grouped by character topology (quadrupeds, bipeds with non-rigid appendages, etc.). This will allow readers to assess whether the progressive schedule supports generalization beyond interpolation within the human motion distribution. revision: yes
Circularity Check
No significant circularity; derivation relies on external human data and standard projections
full rationale
The paper's central claim rests on the observation that sparse local articulated motion cues preserve dynamics across structural differences, then implements AnyAct via human-motion-only supervision with augmented 3D-to-2D projection and progressive training. These components draw from external human motion datasets and conventional projection methods rather than reducing outputs to the model's own fitted parameters or self-citations by construction. No equations equate predictions to inputs tautologically, and the approach is benchmarked against diverse non-human videos without evident self-referential fitting. This yields a low circularity score consistent with self-contained use of independent external supervision.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
sparse local articulated motion cues can preserve essential dynamics across large structural differences... human-motion-only supervision via augmented 3D-to-2D projection, progressive 3D-to-2D training
-
IndisputableMonolith/Foundation/DimensionForcing.leanalexander_duality_circle_linking unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
global-local motion decoupling for reliable local motion control
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
ACM Transactions on Graphics (ToG) , volume=
Deepphase: Periodic autoencoders for learning motion phase manifolds , author=. ACM Transactions on Graphics (ToG) , volume=. 2022 , publisher=
work page 2022
-
[2]
ACM Transactions On Graphics (TOG) , volume=
Unpaired motion style transfer from video to animation , author=. ACM Transactions On Graphics (TOG) , volume=. 2020 , publisher=
work page 2020
-
[3]
Proceedings of the 16th ACM SIGGRAPH Conference on Motion, Interaction and Games , pages=
Video-Based Motion Retargeting Framework between Characters with Various Skeleton Structure , author=. Proceedings of the 16th ACM SIGGRAPH Conference on Motion, Interaction and Games , pages=
-
[4]
2019 International conference on 3D vision (3DV) , pages=
Language2pose: Natural language grounded pose forecasting , author=. 2019 International conference on 3D vision (3DV) , pages=. 2019 , organization=
work page 2019
-
[5]
Proceedings of the 5th ACM International Conference on Multimedia in Asia , pages=
Cross-modal retrieval for motion and text via droptriple loss , author=. Proceedings of the 5th ACM International Conference on Multimedia in Asia , pages=
-
[6]
Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=
Generating diverse and natural 3d human motions from text , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=
-
[7]
European Conference on Computer Vision , pages=
Motionclip: Exposing human motion generation to clip space , author=. European Conference on Computer Vision , pages=. 2022 , organization=
work page 2022
-
[8]
European conference on computer vision , pages=
Temos: Generating diverse human motions from textual descriptions , author=. European conference on computer vision , pages=. 2022 , organization=
work page 2022
-
[9]
The Eleventh International Conference on Learning Representations , year=
Human Motion Diffusion Model , author=. The Eleventh International Conference on Learning Representations , year=
-
[10]
Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=
Executing your commands via motion diffusion in latent space , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=
-
[11]
IEEE transactions on pattern analysis and machine intelligence , volume=
Motiondiffuse: Text-driven human motion generation with diffusion model , author=. IEEE transactions on pattern analysis and machine intelligence , volume=. 2024 , publisher=
work page 2024
-
[12]
Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=
Remodiffuse: Retrieval-augmented motion diffusion model , author=. Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=
-
[13]
Proceedings of the Computer Vision and Pattern Recognition Conference , pages=
Rethinking diffusion for text-driven human motion generation: Redundant representations, evaluation, and masked autoregression , author=. Proceedings of the Computer Vision and Pattern Recognition Conference , pages=
-
[14]
arXiv preprint arXiv:2510.26794 , year=
The quest for generalizable motion generation: Data, model, and evaluation , author=. arXiv preprint arXiv:2510.26794 , year=
-
[15]
arXiv preprint arXiv:2512.23464 , year=
HY-Motion 1.0: Scaling Flow Matching Models for Text-To-Motion Generation , author=. arXiv preprint arXiv:2512.23464 , year=
-
[16]
arXiv preprint arXiv:2603.15546 , year=
Kimodo: Scaling Controllable Human Motion Generation , author=. arXiv preprint arXiv:2603.15546 , year=
-
[17]
Advances in Neural Information Processing Systems , volume=
Motiongpt: Human motion as a foreign language , author=. Advances in Neural Information Processing Systems , volume=
-
[18]
Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=
Generating human motion from textual descriptions with discrete representations , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=
-
[19]
Proceedings of the IEEE/CVF international conference on computer vision , pages=
Attt2m: Text-driven human motion generation with multi-perspective attention mechanism , author=. Proceedings of the IEEE/CVF international conference on computer vision , pages=
-
[20]
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision , pages=
Motiongpt: Human motion synthesis with improved diversity and realism via gpt-3 prompting , author=. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision , pages=
-
[21]
Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=
Go to zero: Towards zero-shot motion generation with million-scale data , author=. Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=
-
[22]
European Conference on Computer Vision , pages=
Bamm: bidirectional autoregressive motion model , author=. European Conference on Computer Vision , pages=. 2024 , organization=
work page 2024
-
[23]
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=
Momask: Generative masked modeling of 3d human motions , author=. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=
-
[24]
The Thirty-ninth Annual Conference on Neural Information Processing Systems , year=
Snapmogen: Human motion generation from expressive texts , author=. The Thirty-ninth Annual Conference on Neural Information Processing Systems , year=
-
[25]
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=
Mmm: Generative masked motion model , author=. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=
-
[26]
ACM Transactions on Graphics (TOG) , volume=
Motion puzzle: Arbitrary motion style transfer by body part , author=. ACM Transactions on Graphics (TOG) , volume=. 2022 , publisher=
work page 2022
-
[27]
Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=
StyleMotif: Multi-Modal Motion Stylization using Style-Content Cross Fusion , author=. Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=
-
[28]
SIGGRAPH Asia 2024 Conference Papers , pages=
Monkey see, monkey do: Harnessing self-attention in motion diffusion for zero-shot motion transfer , author=. SIGGRAPH Asia 2024 Conference Papers , pages=
work page 2024
-
[29]
European Conference on Computer Vision , pages=
SMooDi: Stylized Motion Diffusion Model , author=. European Conference on Computer Vision , pages=
-
[30]
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=
Arbitrary motion style transfer with multi-condition motion latent diffusion model , author=. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=
-
[31]
arXiv preprint arXiv:2509.04058 , year=
Smoogpt: Stylized motion generation using large language models , author=. arXiv preprint arXiv:2509.04058 , year=
-
[32]
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=
Avatars grow legs: Generating smooth human motion from sparse tracking inputs with diffusion model , author=. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=
-
[33]
ACM Transactions on Graphics (TOG) , volume=
Listen, denoise, action! audio-driven motion synthesis with diffusion models , author=. ACM Transactions on Graphics (TOG) , volume=. 2023 , publisher=
work page 2023
-
[34]
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=
Diffusion-based generation, optimization, and planning in 3d scenes , author=. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=
-
[35]
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=
Move as you say interact as you can: Language-guided human motion generation with scene affordance , author=. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=
-
[36]
arXiv preprint arXiv:2506.00173 , year=
MotionPersona: Characteristics-aware Locomotion Control , author=. arXiv preprint arXiv:2506.00173 , year=
-
[37]
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=
Personabooth: Personalized text-to-motion generation , author=. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=
-
[38]
Proceedings of the AAAI Conference on Artificial Intelligence , volume=
Auto-regressive diffusion for generating 3d human-object interactions , author=. Proceedings of the AAAI Conference on Artificial Intelligence , volume=
-
[39]
Omnicontrol: Control any joint at any time for human motion generation,
Omnicontrol: Control any joint at any time for human motion generation , author=. arXiv preprint arXiv:2310.08580 , year=
-
[40]
ACM SIGGRAPH 2024 conference papers , pages=
Flexible motion in-betweening with diffusion models , author=. ACM SIGGRAPH 2024 conference papers , pages=
work page 2024
-
[41]
Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=
Maskcontrol: Spatio-temporal control for masked motion synthesis , author=. Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=
-
[42]
Proceedings of the Computer Vision and Pattern Recognition Conference , pages=
Pomp: Physics-consistent motion generative model through phase manifolds , author=. Proceedings of the Computer Vision and Pattern Recognition Conference , pages=
-
[43]
ACM Transactions on Graphics (TOG) , year=
Adaptnet: Policy adaptation for physics-based character control , author=. ACM Transactions on Graphics (TOG) , year=
-
[44]
ACM Transactions on Graphics (TOG) , year=
Sketch2anim: Towards transferring sketch storyboards into 3d animation , author=. ACM Transactions on Graphics (TOG) , year=
-
[45]
ACM Transactions on Graphics (TOG) , volume=
Physics-based character controllers using conditional vaes , author=. ACM Transactions on Graphics (TOG) , volume=. 2022 , publisher=
work page 2022
-
[46]
ACM Transactions on Graphics (ToG) , volume=
Amp: Adversarial motion priors for stylized physics-based character control , author=. ACM Transactions on Graphics (ToG) , volume=. 2021 , publisher=
work page 2021
-
[47]
ACM Transactions On Graphics (TOG) , volume=
Maskedmimic: Unified physics-based character control through masked motion inpainting , author=. ACM Transactions On Graphics (TOG) , volume=. 2024 , publisher=
work page 2024
-
[48]
ACM SIGGRAPH 2024 Conference Papers , pages=
Strategy and skill learning for physics-based table tennis animation , author=. ACM SIGGRAPH 2024 Conference Papers , pages=
work page 2024
-
[49]
Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=
Motion-2-to-3: Leveraging 2D Motion Data for 3D Motion Generations , author=. Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=
-
[50]
Proceedings of the Computer Vision and Pattern Recognition Conference , pages=
Humandreamer: Generating controllable human-motion videos via decoupled generation , author=. Proceedings of the Computer Vision and Pattern Recognition Conference , pages=
-
[51]
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=
Mas: Multi-view ancestral sampling for 3d motion generation using 2d diffusion , author=. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=
-
[52]
SIGGRAPH Asia 2024 Conference Papers , pages=
Motionfix: Text-driven 3d human motion editing , author=. SIGGRAPH Asia 2024 Conference Papers , pages=
work page 2024
-
[53]
ACM SIGGRAPH 2024 Conference Papers , pages=
Iterative motion editing with natural language , author=. ACM SIGGRAPH 2024 Conference Papers , pages=
work page 2024
-
[54]
Advances in Neural Information Processing Systems , volume=
Finemogen: Fine-grained spatio-temporal motion generation and editing , author=. Advances in Neural Information Processing Systems , volume=
-
[55]
arXiv preprint arXiv:2512.24200 , year=
PartMotionEdit: Fine-Grained Text-Driven 3D Human Motion Editing via Part-Level Modulation , author=. arXiv preprint arXiv:2512.24200 , year=
-
[56]
Proceedings of the AAAI conference on artificial intelligence , volume=
Flame: Free-form language-based motion synthesis & editing , author=. Proceedings of the AAAI conference on artificial intelligence , volume=
-
[57]
Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=
Motionlab: Unified human motion generation and editing via the motion-condition-motion paradigm , author=. Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=
-
[58]
Motionclr: Motion generation and training-free editing via understanding attention mechanisms , author=. openreview , year=
-
[59]
arXiv preprint arXiv:2512.19159 , year=
OmniMoGen: Unifying Human Motion Generation via Learning from Interleaved Text-Motion Instructions , author=. arXiv preprint arXiv:2512.19159 , year=
-
[60]
Proceedings of the IEEE/CVF international conference on computer vision , pages=
AMASS: Archive of motion capture as surface shapes , author=. Proceedings of the IEEE/CVF international conference on computer vision , pages=
-
[61]
Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=
BABEL: Bodies, action and behavior with english labels , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=
-
[62]
Advances in Neural Information Processing Systems , volume=
Motion-x: A large-scale 3d expressive whole-body human motion dataset , author=. Advances in Neural Information Processing Systems , volume=
-
[63]
CoMoVi: Co-Generation of 3D Human Motions and Realistic Videos
CoMoVi: Co-Generation of 3D Human Motions and Realistic Videos , author=. arXiv preprint arXiv:2601.10632 , year=
work page internal anchor Pith review Pith/arXiv arXiv
-
[64]
arXiv preprint arXiv:2512.18814 , year=
EchoMotion: Unified Human Video and Motion Generation via Dual-Modality Diffusion Transformer , author=. arXiv preprint arXiv:2512.18814 , year=
-
[65]
arXiv preprint arXiv:2510.03909 , year=
Generating Human Motion Videos using a Cascaded Text-to-Video Framework , author=. arXiv preprint arXiv:2510.03909 , year=
-
[66]
Anytop: Character animation diffusion with any topology , author=. Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Papers , pages=
-
[67]
arXiv preprint arXiv:2508.05162 , year=
X-MoGen: Unified Motion Generation across Humans and Animals , author=. arXiv preprint arXiv:2508.05162 , year=
-
[68]
ACM Transactions on Graphics (ToG) , volume=
Physcap: Physically plausible monocular 3d motion capture in real time , author=. ACM Transactions on Graphics (ToG) , volume=. 2020 , publisher=
work page 2020
-
[69]
Acm transactions on graphics (tog) , volume=
Motionet: 3d human motion reconstruction from monocular video with skeleton consistency , author=. Acm transactions on graphics (tog) , volume=. 2020 , publisher=
work page 2020
-
[70]
SIGGRAPH Asia 2023 conference papers , pages=
A locality-based neural solver for optical motion capture , author=. SIGGRAPH Asia 2023 conference papers , pages=
work page 2023
-
[71]
SIGGRAPH Asia 2024 Conference Papers , pages=
RoMo: A Robust Solver for Full-body Unlabeled Optical Motion Capture , author=. SIGGRAPH Asia 2024 Conference Papers , pages=
work page 2024
-
[72]
Proceedings of the SIGGRAPH Asia 2025 Conference Papers , pages=
StableMotion: Training Motion Cleanup Models with Unpaired Corrupted Data , author=. Proceedings of the SIGGRAPH Asia 2025 Conference Papers , pages=
work page 2025
-
[73]
arXiv preprint arXiv:2603.17704 , year=
DancingBox: A Lightweight MoCap System for Character Animation from Physical Proxies , author=. arXiv preprint arXiv:2603.17704 , year=
-
[74]
Thirteenth International Conference on 3D Vision , year=
EgoMDM: Diffusion-based Human Motion Synthesis from Sparse Egocentric Sensors , author=. Thirteenth International Conference on 3D Vision , year=
-
[75]
EasyMoCap - Make human motion capture easier. , howpublished =. 2021 , url =
work page 2021
-
[76]
SIGGRAPH Conference Proceedings , year=
Novel View Synthesis of Human Interactions from Sparse Multi-view Videos , author=. SIGGRAPH Conference Proceedings , year=
-
[77]
SIGGRAPH Asia Conference Proceedings , year=
Efficient Neural Radiance Fields for Interactive Free-viewpoint Video , author=. SIGGRAPH Asia Conference Proceedings , year=
-
[78]
Proceedings of the IEEE conference on computer vision and pattern recognition , pages=
Realtime multi-person 2d pose estimation using part affinity fields , author=. Proceedings of the IEEE conference on computer vision and pattern recognition , pages=
-
[79]
Proceedings of the IEEE conference on Computer Vision and Pattern Recognition , pages=
Hand keypoint detection in single images using multiview bootstrapping , author=. Proceedings of the IEEE conference on Computer Vision and Pattern Recognition , pages=
-
[80]
IEEE transactions on pattern analysis and machine intelligence , volume=
Openpose: Realtime multi-person 2d pose estimation using part affinity fields , author=. IEEE transactions on pattern analysis and machine intelligence , volume=. 2019 , publisher=
work page 2019
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.