Pith Number

pith:LO5BPYEG

pith:2026:LO5BPYEGQGSJHTVVY4WAFAQSFL
not attested · not anchored · not stored · refs resolved

EVA01: Unified Native 3D Understanding and Generation via Mixture-of-Transformers

Baolin Liu, Bocheng Li, Chenzhuo Fan, Mingjing Yi, Shimu Wang, Wanli Ma, Yingde Song, Yongping Xiong, Yuke Lou, Zhengdong Guo, Zongyuan Yang

EVA01 integrates 3D meshes as a native modality inside multimodal language models using a mixture-of-transformers split.

arxiv:2605.16745 v1 · 2026-05-16 · cs.CV

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{LO5BPYEGQGSJHTVVY4WAFAQSFL}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

EVA01 achieves state-of-the-art native text-to-3D generation fidelity and unlocks robust long-context multi-turn geometric editing with identity preservation, a capability fundamentally inaccessible to stateless reconstruction pipelines.

C2 weakest assumption

The paper assumes that decoupling the model into a pre-trained Understanding Expert and a structurally mirrored Generation Expert, coupled through shared global self-attention with hard modality routing, will align the semantic latent space of the MLLM backbone with the geometric manifold without performance loss or the need for intermediate 2D representations.

C3 one-line summary

EVA01 introduces a Mixture-of-Transformers model that natively adds 3D mesh understanding, generation, and multi-turn editing to MLLMs by decoupling understanding and generation experts with shared global self-attention.

References

77 extracted · 77 resolved · 13 Pith anchors

[1] Qwen3-VL Technical Report 2025 · arXiv:2511.21631
[2] METEOR: An automatic metric for MT evaluation with improved correlation with human judgments 2005
[3] Instant3DiT: Multiview inpainting for fast editing of 3D objects 2025
[4] TIPSv2: Advancing Vision-Language Pretraining with Enhanced Patch-Text Alignment 2026 · arXiv:2604.12012
[5] ShapeNet: An Information-Rich 3D Model Repository 2015 · arXiv:1512.03012

Formal links

2 machine-checked theorem links

Receipt and verification
First computed 2026-05-20T00:02:39.527485Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

5bba17e08681a493ceb5c72c0282122ae8dc119b204e9c2dc78ae592a76f2ad5

Aliases

arxiv: 2605.16745 · arxiv_version: 2605.16745v1 · doi: 10.48550/arxiv.2605.16745 · pith_short_12: LO5BPYEGQGSJ · pith_short_16: LO5BPYEGQGSJHTVV · pith_short_8: LO5BPYEG
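
The identifiers on this page appear to be related: the 26-character Pith Number matches the first 26 characters of the Base32 encoding of the canonical hash, and each pith_short_N alias is the first N characters of the Pith Number. The sketch below checks that relationship in Python. It is inferred from the values shown here, not from a documented Pith specification, and the helper names `pith_from_hash` and `short_alias` are hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "5bba17e08681a493ceb5c72c0282122ae8dc119b204e9c2dc78ae592a76f2ad5"
PITH_NUMBER = "LO5BPYEGQGSJHTVVY4WAFAQSFL"


def pith_from_hash(hex_digest: str, length: int = 26) -> str:
    """Hypothetical helper: Base32-encode the digest bytes and keep a prefix.

    Assumption: the 26-character length is taken from the identifier on this
    page, not from a published rule.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_alias(pith_number: str, length: int) -> str:
    """Hypothetical helper: a short alias is a prefix of the full number."""
    return pith_number[:length]


derived = pith_from_hash(CANONICAL_HASH)
print("matches Pith Number:", derived == PITH_NUMBER)
for n in (8, 12, 16):
    print(f"pith_short_{n}: {short_alias(derived, n)}")
```

Because the aliases are plain prefixes, a shorter alias can in principle collide with another record; the full Pith Number or the canonical hash is the safer key when exact identity matters.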
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/LO5BPYEGQGSJHTVVY4WAFAQSFL \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 5bba17e08681a493ceb5c72c0282122ae8dc119b204e9c2dc78ae592a76f2ad5
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "499a72a158fbe86ee84d8621bbbb168bf7da06e44a56bf595a23c08d3c99c033",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by-nc-nd/4.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2026-05-16T01:55:03Z",
    "title_canon_sha256": "e81542ff15bc1f6f5509b3fe4842b5b48c686b3d1b03fed83f8524096c3ede09"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.16745",
    "kind": "arxiv",
    "version": 1
  }
}
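
The canonicalization step of the verification command can also be run offline against the record printed above. The sketch below repeats the same serialization settings (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 bytes) and shows that the digest does not depend on key order. The function name `canonical_sha256` is hypothetical, and the digest is expected to equal the canonical hash only if the JSON above is exactly what the API returns under `canonical_record`; that match is not asserted here.

```python
import hashlib
import json

# Record copied from the "Canonical record JSON" section above.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "499a72a158fbe86ee84d8621bbbb168bf7da06e44a56bf595a23c08d3c99c033",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by-nc-nd/4.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2026-05-16T01:55:03Z",
        "title_canon_sha256": "e81542ff15bc1f6f5509b3fe4842b5b48c686b3d1b03fed83f8524096c3ede09",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.16745", "kind": "arxiv", "version": 1},
}


def canonical_sha256(record: dict) -> str:
    """Hypothetical helper: hash the record using the page's serialization settings."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Same content with the top-level keys in a different order.
shuffled = {
    "source": RECORD["source"],
    "schema_version": RECORD["schema_version"],
    "metadata": RECORD["metadata"],
}

digest = canonical_sha256(RECORD)
print(digest)
print("order-independent:", digest == canonical_sha256(shuffled))
```

Sorting keys and fixing the separators is what makes the hash reproducible across mirrors: two parties holding the same record content produce the same bytes, and therefore the same digest, regardless of how their JSON library orders or spaces the output.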