{"paper":{"title":"Controllable Egocentric Video Generation via Occlusion-Aware Sparse 3D Hand Joints","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"Sparse 3D hand joints with occlusion-aware weighting generate controllable egocentric videos from one reference frame.","cross_cats":[],"primary_cat":"cs.CV","authors_text":"Alexandros Delitzas, Boqi Chen, Botao Ye, Chenyangguang Zhang, Fangjinhua Wang, Marc Pollefeys, Xi Wang","submitted_at":"2026-03-12T10:02:23Z","abstract_excerpt":"Controllable video generation for complex hand-object interactions is a critical step toward building visual world models. However, existing methods often struggle to achieve fine-grained, 3D-consistent hand articulation in generated videos. By relying on dense 2D trajectories or implicit pose representations, they collapse crucial geometric structures into spatially ambiguous signals, leading to severe motion inconsistencies and hallucinated artifacts under egocentric occlusions. To address this, we propose leveraging sparse 3D hand joints as explicit control signals with three key advantages"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"our approach significantly outperforms state-of-the-art baselines, generating high-fidelity egocentric videos with realistic interactions and exhibiting exceptional cross-embodiment generalization to robotic hands.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"Sparse 3D hand joints plus the occlusion-aware weighting mechanism are assumed to supply enough geometric and semantic information to prevent motion inconsistencies and hallucinations under severe egocentric occlusions without additional human-centric priors.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"A new occlusion-aware control module generates high-fidelity egocentric videos from sparse 3D hand joints, supported by a million-clip dataset and cross-embodiment benchmark.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Sparse 3D hand joints with occlusion-aware weighting generate controllable egocentric videos from one reference frame.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"b41c1fc7674260aad19af0f342ad7409d7932868424bd5eae69f511de4c2e306"},"source":{"id":"2603.11755","kind":"arxiv","version":2},"verdict":{"id":"6da77c41-c9cd-4997-bbfb-85b615f42614","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-15T12:15:52.540726Z","strongest_claim":"our approach significantly outperforms state-of-the-art baselines, generating high-fidelity egocentric videos with realistic interactions and exhibiting exceptional cross-embodiment generalization to robotic hands.","one_line_summary":"A new occlusion-aware control module generates high-fidelity egocentric videos from sparse 3D hand joints, supported by a million-clip dataset and cross-embodiment benchmark.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"Sparse 3D hand joints plus the occlusion-aware weighting mechanism are assumed to supply enough geometric and semantic information to prevent motion inconsistencies and hallucinations under severe egocentric occlusions without additional human-centric priors.","pith_extraction_headline":"Sparse 3D hand joints with occlusion-aware weighting generate controllable egocentric videos from one reference frame."},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2603.11755/integrity.json","findings":[],"available":true,"detectors_run":[],"snapshot_sha256":"c28c3603d3b5d939e8dc4c7e95fa8dfce3d595e45f758748cecf8e644a296938"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":2,"snapshot_sha256":"6bc8472d81a2f10cce16defba6aa0a0c76b4b9cf5c0d18928bbbab448cf81dc7"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}