Pith Number

pith:C6BFU545

pith:2025:C6BFU545DD7RW6PMM2VTJX7B2C
not attested · not anchored · not stored · refs pending

GDPval: Evaluating AI Model Performance on Real-World Economically Valuable Tasks

Alexandra Barr, Amelia Glaese, David Li, Elizabeth Proehl, Gildas Chabot, Grace Kim, Jerry Tworek, Laurance Fauconnet, Marwan Aljubeh, Michael Sharman, Michele Wang, Natalie S. Kim, Olivia Watkins, Patrick Chao, Phoebe Thacker, Rachel Dias, Samuel Miserendino, Simón Posada Fishman, Tejal Patwardhan

Frontier AI models approach industry experts in quality on real-world economically valuable tasks.

arxiv:2510.04374 v1 · 2025-10-05 · cs.LG · cs.AI · cs.CY

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{C6BFU545DD7RW6PMM2VTJX7B2C}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

Frontier model performance on GDPval is improving roughly linearly over time, and the current best frontier models are approaching industry experts in deliverable quality.

C2 weakest assumption

The selected tasks and expert ratings accurately represent the full range of economically valuable work, and automated grading reliably matches human expert judgment on deliverable quality.

C3 one line summary

The GDPval benchmark finds frontier AI models approaching industry experts on economically valuable tasks from high-GDP sectors, with roughly linear performance gains over time.

Formal links

3 machine-checked theorem links

Cited by

24 papers in Pith

Receipt and verification
First computed 2026-05-17T23:38:48.066776Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

17825a779d18ff1b79ec66ab34dfe1d08698c6913c9055352154a735876fb74c

Aliases

arxiv: 2510.04374 · arxiv_version: 2510.04374v1 · doi: 10.48550/arxiv.2510.04374 · pith_short_12: C6BFU545DD7R · pith_short_16: C6BFU545DD7RW6PM · pith_short_8: C6BFU545
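The identifiers on this page appear to be related in a simple way: the values shown are consistent with the Pith Number being the first 26 characters of the base32 (RFC 4648) encoding of the canonical hash, and with each pith_short_N alias being an N-character prefix of that number. This is an observation from the values listed here, not documented behavior, so the sketch below should be read as a consistency check under that assumption. The helper name base32_prefix is hypothetical.

```python
import base64

# Values copied from this page.
CANONICAL_HASH = "17825a779d18ff1b79ec66ab34dfe1d08698c6913c9055352154a735876fb74c"
PITH_NUMBER = "C6BFU545DD7RW6PMM2VTJX7B2C"


def base32_prefix(hex_digest: str, length: int) -> str:
    """Base32-encode the raw digest bytes (RFC 4648 alphabet) and keep a prefix.

    Assumption: this is how the identifier is derived; the page does not say so.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


# Compare the derived prefix with the Pith Number shown on the page.
derived = base32_prefix(CANONICAL_HASH, len(PITH_NUMBER))
print("derived number matches page:", derived == PITH_NUMBER)

# Short aliases as plain prefixes of the full number.
aliases = {f"pith_short_{n}": PITH_NUMBER[:n] for n in (8, 12, 16)}
for name, value in aliases.items():
    print(name, value)
```

If the relationship holds, a client could recompute the short aliases locally instead of trusting the listed values, but the published schema should be checked before relying on this.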
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/C6BFU545DD7RW6PMM2VTJX7B2C \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 17825a779d18ff1b79ec66ab34dfe1d08698c6913c9055352154a735876fb74c
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "3e907bdaf254be3ce564273e978e1bc5f5a1f12490ec69fc10eedd5df91fda4c",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.CY"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2025-10-05T21:36:43Z",
    "title_canon_sha256": "98ad289b766a236ffb8e8459c62ec63bee942b19ec85473290763a0fb9ed4d41"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2510.04374",
    "kind": "arxiv",
    "version": 1
  }
}
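The verification one-liner can also be unpacked into a standalone script. The sketch below applies the same canonicalization (sorted keys, compact separators, no ASCII escaping, UTF-8 bytes) to the record as displayed on this page. Whether the displayed record is byte-for-byte equivalent to what the API returns is an assumption, so the script reports the comparison instead of asserting it. The function name canonical_sha256 is hypothetical.

```python
import hashlib
import json

# Canonical record copied from this page (assumed identical to the API's
# `.canonical_record` field).
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "3e907bdaf254be3ce564273e978e1bc5f5a1f12490ec69fc10eedd5df91fda4c",
    "cross_cats_sorted": ["cs.AI", "cs.CY"],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2025-10-05T21:36:43Z",
    "title_canon_sha256": "98ad289b766a236ffb8e8459c62ec63bee942b19ec85473290763a0fb9ed4d41"
  },
  "schema_version": "1.0",
  "source": {"id": "2510.04374", "kind": "arxiv", "version": 1}
}
"""

# Hash listed under "Canonical hash" on this page.
EXPECTED = "17825a779d18ff1b79ec66ab34dfe1d08698c6913c9055352154a735876fb74c"


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then SHA-256 the UTF-8 bytes."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


digest = canonical_sha256(json.loads(RECORD_JSON))
print("computed:", digest)
print("matches page:", digest == EXPECTED)
```

Because keys are sorted before hashing, the digest does not depend on the order in which a mirror stores the record's fields, which is what lets independent copies arrive at the same value.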