{"paper":{"title":"Face Anything: 4D Face Reconstruction from Any Image Sequence","license":"http://creativecommons.org/licenses/by/4.0/","headline":"Canonical facial point prediction unifies depth estimation, dense 3D geometry, and point tracking for 4D face reconstruction from single-view sequences.","cross_cats":[],"primary_cat":"cs.CV","authors_text":"Matthias Nießner, Richard Shaw, Simon Giebenhain, Umut Kocasari","submitted_at":"2026-04-21T17:22:39Z","abstract_excerpt":"Accurate reconstruction and tracking of dynamic human faces from image sequences is challenging because non-rigid deformations, expression changes, and viewpoint variations occur simultaneously, creating significant ambiguity in geometry and correspondence estimation. We present a unified method for high-fidelity 4D facial reconstruction based on canonical facial point prediction, a representation that assigns each pixel a normalized facial coordinate in a shared canonical space. This formulation transforms dense tracking and dynamic reconstruction into a canonical reconstruction problem, enab"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"By jointly predicting depth and canonical coordinates, our method enables accurate depth estimation, temporally stable reconstruction, dense 3D geometry, and robust facial point tracking within a single architecture.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That multi-view geometry data can be reliably non-rigidly warped into a shared canonical space to train a model that then generalizes to arbitrary single-view image sequences without additional constraints or post-processing.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"A single transformer model jointly predicts depth and normalized canonical coordinates to deliver 
state-of-the-art 4D facial geometry and tracking with 3x lower correspondence error and 16% better depth accuracy.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Canonical facial point prediction unifies depth estimation, dense 3D geometry, and point tracking for 4D face reconstruction from single-view sequences.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"0c973eedaf73717e87a7c0c4ec4e891fbd622e54a8db687148c229a89559509a"},"source":{"id":"2604.19702","kind":"arxiv","version":2},"verdict":{"id":"b3bba5ef-5e97-4d41-a539-ac790fad971d","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-10T02:35:12.216586Z","strongest_claim":"By jointly predicting depth and canonical coordinates, our method enables accurate depth estimation, temporally stable reconstruction, dense 3D geometry, and robust facial point tracking within a single architecture.","one_line_summary":"A single transformer model jointly predicts depth and normalized canonical coordinates to deliver state-of-the-art 4D facial geometry and tracking with 3x lower correspondence error and 16% better depth accuracy.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That multi-view geometry data can be reliably non-rigidly warped into a shared canonical space to train a model that then generalizes to arbitrary single-view image sequences without additional constraints or post-processing.","pith_extraction_headline":"Canonical facial point prediction unifies depth estimation, dense 3D geometry, and point tracking for 4D face reconstruction from single-view 
sequences."},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2604.19702/integrity.json","findings":[],"available":true,"detectors_run":[{"name":"ai_meta_artifact","ran_at":"2026-05-21T16:33:35.140812Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"doi_compliance","ran_at":"2026-05-20T02:38:43.976560Z","status":"completed","version":"1.0.0","findings_count":0}],"snapshot_sha256":"1ba399b336e0736b6e79eade72f0e67c10684a7e920254469fa34e01b24e72c8"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}