Pith Number

pith:WDTFOOVO

pith:2026:WDTFOOVOALFSILYMCIRFTLJRUZ
not attested · not anchored · not stored · refs pending

MultiSynt/MT: Trillion-Token Multi-Parallel Pre-Training Data Translated Across 36 Languages

André F. T. Martins, Andrey Kutuzov, Anna Lokrantz, Birger Moell, David Salinas, Fedor Vitiugin, Filip Ginter, Gema Ramírez-Sánchez, Jan Hajič, Jenia Jitsev, Jenna Kanerva, Jonas Lindh, Jörg Tiedemann, Matthias Lindemann, Maximilian Idahl, Sampo Pyysalo, Shenbin Qian, Stephan Oepen, Tim Isbister, Tomasz Galica, Tudor Nicolae Mateiu, Zihao Li

arxiv:2607.00890 v1 · 2026-07-01 · cs.CL

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{WDTFOOVOALFSILYMCIRFTLJRUZ}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
Receipt and verification
First computed: 2026-07-02T01:18:22.563779Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05) · public key
Schema: pith-number/v1.0

Canonical hash

b0e6573aae02cb242f0c122259ad31a6786f71079f88f2c3d37a6584610579e4

Aliases

arxiv: 2607.00890 · arxiv_version: 2607.00890v1 · doi: 10.48550/arxiv.2607.00890 · pith_short_12: WDTFOOVOALFS · pith_short_16: WDTFOOVOALFSILYM · pith_short_8: WDTFOOVO
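
The short aliases above are plain prefixes of the full 26-character identifier. For this particular record, the identifier also coincides with the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash's raw bytes. That relationship is an observation drawn from the values on this page, not something the page documents as the pith-number/v1.0 scheme, so the sketch below is illustrative only. The helper names `short_alias` and `base32_prefix` are hypothetical.

```python
import base64

# Values copied from this record.
PITH_ID = "WDTFOOVOALFSILYMCIRFTLJRUZ"
CANONICAL_HASH = "b0e6573aae02cb242f0c122259ad31a6786f71079f88f2c3d37a6584610579e4"


def short_alias(pith_id: str, length: int) -> str:
    """Return the first `length` characters of the identifier (hypothetical helper)."""
    return pith_id[:length]


def base32_prefix(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes and keep the leading characters.

    Assumption: the identifier is derived this way. The values on this
    record are consistent with it, but the page does not state the rule.
    """
    raw = bytes.fromhex(hex_digest)
    return base64.b32encode(raw).decode("ascii")[:length]


# Reproduce the pith_short_8 / pith_short_12 / pith_short_16 aliases.
for n in (8, 12, 16):
    print(f"pith_short_{n}: {short_alias(PITH_ID, n)}")

# Check whether the identifier matches the Base32 prefix of the hash.
print("matches canonical hash:", base32_prefix(CANONICAL_HASH) == PITH_ID)
```

If the prefix rule holds for other records, a client could sanity-check an identifier against its canonical hash offline, before fetching anything from the service.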
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/WDTFOOVOALFSILYMCIRFTLJRUZ \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: b0e6573aae02cb242f0c122259ad31a6786f71079f88f2c3d37a6584610579e4
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "716f98cd5110c31f2950d6cade0f64eaaf2e19aa14103a7f07a88da978d1abd2",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-07-01T12:55:58Z",
    "title_canon_sha256": "7fc4a9269c82ce0904289ce201196611215306f4d9aadeb2978486def878d17b"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2607.00890",
    "kind": "arxiv",
    "version": 1
  }
}
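
The verification one-liner above serialises the canonical record with sorted keys and compact separators, then hashes the UTF-8 bytes with SHA-256. A minimal offline sketch of the same step follows, using the record exactly as printed on this page instead of fetching it. The function name `canonical_sha256` is hypothetical, and the final comparison is left for the reader to run; no result is asserted here.

```python
import hashlib
import json

# Canonical hash as listed on this record.
EXPECTED = "b0e6573aae02cb242f0c122259ad31a6786f71079f88f2c3d37a6584610579e4"

# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "716f98cd5110c31f2950d6cade0f64eaaf2e19aa14103a7f07a88da978d1abd2",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2026-07-01T12:55:58Z",
        "title_canon_sha256": "7fc4a9269c82ce0904289ce201196611215306f4d9aadeb2978486def878d17b",
    },
    "schema_version": "1.0",
    "source": {"id": "2607.00890", "kind": "arxiv", "version": 1},
}


def canonical_sha256(obj: dict) -> str:
    """Hash a record using the same serialisation as the shell one-liner:
    sorted keys, no whitespace between tokens, non-ASCII kept as-is."""
    payload = json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


digest = canonical_sha256(record)
print(digest)
print("matches listed hash:", digest == EXPECTED)
```

Because keys are sorted before hashing, the digest does not depend on the order in which a mirror stores or transmits the fields, which is what makes independent recomputation possible.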