Pith Number

pith:LBNIFEJU

pith:2026:LBNIFEJUZMTBQLSBNHPI6MOZKD
not attested · not anchored · not stored · refs resolved

GTA: Advancing Image-to-3D World Generation via Geometry Then Appearance Video Diffusion

Cong Wang, Hanxin Zhu, Jiayi Luo, Peiyan Tu, Tianyu He, Xin Jin, Zhibo Chen

GTA generates 3D worlds from single images by first generating coarse geometry and then synthesizing appearance, using a separate video diffusion model for each stage.

arxiv:2605.12957 v1 · 2026-05-13 · cs.CV

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{LBNIFEJUZMTBQLSBNHPI6MOZKD}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
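
To illustrate what a deterministic merge means here, the following is a minimal sketch and not Pith's actual algorithm. The event shape (`at`, `field`, `value`), the content-derived event ID, and the "latest event wins" rule are all assumptions made for illustration. The point it shows is that two mirrors receiving the same events in different orders, or with duplicates, arrive at the same state.

```python
import hashlib
import json


def event_id(event: dict) -> str:
    """Hypothetical content-derived ID: SHA-256 of the event's canonical JSON."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def merge(record: dict, events: list) -> dict:
    """Fold events into a state; the result does not depend on arrival order."""
    state = {"record": record, "fields": {}}
    # Deduplicate by content ID, then apply in a fixed total order:
    # timestamp first, event ID as the tie-breaker.
    unique = {event_id(e): e for e in events}
    for eid in sorted(unique, key=lambda i: (unique[i]["at"], i)):
        event = unique[eid]
        state["fields"][event["field"]] = event["value"]  # assumed rule: latest wins
    return state


# Hypothetical events; field names and values are illustrative only.
events = [
    {"at": "2026-05-18T03:09:09Z", "field": "citations", "value": "open"},
    {"at": "2026-05-19T10:00:00Z", "field": "author_claim", "value": "open"},
    {"at": "2026-05-20T08:30:00Z", "field": "citations", "value": "closed"},
]
state_a = merge({"id": "2605.12957"}, events)
state_b = merge({"id": "2605.12957"}, list(reversed(events)) + events[:1])
print(state_a == state_b)
```

Sorting on a total order before folding is what removes the dependence on arrival order; signature checking of events, which the bundle description implies, is omitted from this sketch.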

Claims

C1 · strongest claim

GTA adopts a two-stage framework with two dedicated video diffusion models, which first generate coarse geometric structure from novel viewpoints and then synthesize fine-grained appearance conditioned on the predicted geometry.

C2 · weakest assumption

The approach assumes that separating geometry generation from appearance synthesis in a coarse-to-fine video diffusion pipeline reliably improves structural fidelity and cross-view consistency without introducing new inconsistencies.

C3 · one line summary

GTA generates 3D worlds from single images via a two-stage video diffusion process that prioritizes geometry before appearance to improve structural consistency.

References

87 extracted · 87 resolved · 10 Pith anchors


Formal links

2 machine-checked theorem links

Receipt and verification
First computed 2026-05-18T03:09:09.273688Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

585a829134cb26182e4169de8f31d950d0e84fd767f478809a839a4d2e1efe7b

Aliases

arxiv: 2605.12957 · arxiv_version: 2605.12957v1 · doi: 10.48550/arxiv.2605.12957 · pith_short_12: LBNIFEJUZMTB · pith_short_16: LBNIFEJUZMTBQLSB · pith_short_8: LBNIFEJU
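
The three `pith_short_*` aliases are prefixes of the full Pith Number. For this record, the full identifier also matches the first 26 characters of the standard base32 encoding of the canonical hash. That relationship is an observation from the values on this page, not a documented rule, so the sketch below treats it as an assumption. The function name `pith_from_hash` is ours.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "585a829134cb26182e4169de8f31d950d0e84fd767f478809a839a4d2e1efe7b"
PITH_NUMBER = "LBNIFEJUZMTBQLSBNHPI6MOZKD"


def pith_from_hash(hex_digest: str, length: int = 26) -> str:
    """Assumed derivation: base32 (RFC 4648) of the digest bytes, truncated."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


derived = pith_from_hash(CANONICAL_HASH)
# Short aliases as prefixes of the full identifier.
aliases = {f"pith_short_{n}": derived[:n] for n in (8, 12, 16)}

print(derived == PITH_NUMBER)
print(aliases)
```

If the observation holds in general, a client could recompute every alias from the canonical hash alone; confirm against the Pith schema before relying on it.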
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/LBNIFEJUZMTBQLSBNHPI6MOZKD \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 585a829134cb26182e4169de8f31d950d0e84fd767f478809a839a4d2e1efe7b
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "9199662d2c2859de0f801a7e8b85791b55289e115a41ebbd7eea1f17dfb21783",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2026-05-13T03:43:02Z",
    "title_canon_sha256": "921cfe4edaa25919cb9d3d57457337900aa02a066643659230508a414daa01fb"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.12957",
    "kind": "arxiv",
    "version": 1
  }
}
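
The shell pipeline above re-serializes the record with sorted keys and compact separators before hashing. The same step can be sketched offline in Python, using the record exactly as printed here. The function name `canonical_sha256` is ours, and the sketch assumes the builder uses the same serialization rules as the one-liner. The resulting digest is not asserted here; compare it with the Canonical hash listed on this page.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """SHA-256 over JSON with sorted keys, compact separators, UTF-8 bytes."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Canonical record as shown on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "9199662d2c2859de0f801a7e8b85791b55289e115a41ebbd7eea1f17dfb21783",
        "cross_cats_sorted": [],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2026-05-13T03:43:02Z",
        "title_canon_sha256": "921cfe4edaa25919cb9d3d57457337900aa02a066643659230508a414daa01fb",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.12957", "kind": "arxiv", "version": 1},
}

digest = canonical_sha256(record)
print(digest)
```

Because keys are sorted before hashing, the digest does not depend on the key order in which a mirror happens to store the record.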