Pith Number

pith:2QIRDDAW

pith:2026:2QIRDDAW2NVGA6M5KAVF4K6AH5
not attested · not anchored · not stored · refs resolved

Collider-Bench: Benchmarking AI Agents with Particle Physics Analysis Reproduction

Darius A. Faroughy, David Shih, Ian Pang, Siddharth Mishra-Sharma, Sofia Palacios Schweitzer

No AI agent reliably beats a physicist when reproducing LHC analyses from public papers alone.

arxiv:2605.13950 v1 · 2026-05-13 · cs.LG · cs.AI · hep-ex · hep-ph

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{2QIRDDAW2NVGA6M5KAVF4K6AH5}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

Our results show that on average no agent reliably beats the physicist-in-the-loop solution.

C2 · weakest assumption

That published papers and public software contain enough information for agents to fill gaps via physical reasoning and trial-and-error without access to internal experimental details.

C3 · one line summary

Collider-Bench is a new benchmark showing that current LLM agents cannot reliably reproduce LHC analyses at the level of a physicist-in-the-loop.

References

44 extracted · 44 resolved · 16 Pith anchors

[1] Plehn, Tilman and Schiller, Daniel and Schmal, Nikita. MadAgents. 2026 · arXiv:2601.21015
[2] An End-to-end Architecture for Collider Physics and Beyond 2026
[3] The FERMIACC: Agents for Particle Theory 2026
[4] A comprehensive guide to the physics and usage of PYTHIA 8.3 2022 · doi:10.21468/scipostphyscodeb.8
[5] DELPHES 3, A modular framework for fast simulation of a generic collider experiment 2014 · doi:10.1007/jhep02(2014)057

Formal links

2 machine-checked theorem links

Receipt and verification
First computed: 2026-05-17T23:39:13.741609Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

d411118c16d36a60799d502a5e2bc03f45206ba2daa02629559326cece12ecb6

Aliases

arxiv: 2605.13950 · arxiv_version: 2605.13950v1 · doi: 10.48550/arxiv.2605.13950 · pith_short_12: 2QIRDDAW2NVG · pith_short_16: 2QIRDDAW2NVGA6M5 · pith_short_8: 2QIRDDAW
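For this record, the Pith Number and its short aliases appear to follow directly from the canonical hash: Base32-encoding the raw 32-byte digest (standard RFC 4648 alphabet) and keeping the first 26 characters reproduces the full identifier, and the short forms are its 8-, 12-, and 16-character prefixes. The sketch below checks that relationship. It is an observation about this one record, not a documented derivation rule, and the helper name `pith_from_hash` is hypothetical.

```python
import base64

# Canonical hash as listed on this record.
CANONICAL_HASH = "d411118c16d36a60799d502a5e2bc03f45206ba2daa02629559326cece12ecb6"


def pith_from_hash(hex_digest: str, length: int = 26) -> str:
    """Assumed derivation: Base32 of the raw digest, truncated to `length` characters."""
    raw = bytes.fromhex(hex_digest)
    return base64.b32encode(raw).decode("ascii")[:length]


full = pith_from_hash(CANONICAL_HASH)

# Short aliases are taken as plain prefixes of the full identifier.
aliases = {f"pith_short_{n}": full[:n] for n in (8, 12, 16)}

print(full)
print(aliases)
```

If the assumption holds for other records as well, a client could recompute any alias locally from the canonical hash instead of trusting the listed values.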
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/2QIRDDAW2NVGA6M5KAVF4K6AH5 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: d411118c16d36a60799d502a5e2bc03f45206ba2daa02629559326cece12ecb6
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "fb3d3d4aaff17f36d28ed6f9572514bbbfe98c1c5a10ee67fd357136b1e62840",
    "cross_cats_sorted": [
      "cs.AI",
      "hep-ex",
      "hep-ph"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-13T18:00:00Z",
    "title_canon_sha256": "a0dc5dbd64b36286d8835d454857d9099bde67bd2aacd9b160669b966736bfb1"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.13950",
    "kind": "arxiv",
    "version": 1
  }
}
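The shell one-liner in the Agent API section can also be run offline against the record printed above. The following is a minimal sketch that applies the same canonicalization (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 bytes) and hashes the result with SHA-256. It assumes the JSON shown here is the complete `canonical_record` returned by the endpoint; that has not been checked against the live service, so the digest should be compared with the canonical hash listed on this record rather than taken on trust. The function name `canonical_sha256` is illustrative.

```python
import hashlib
import json

# Canonical record copied from this page (assumed complete).
record = {
    "metadata": {
        "abstract_canon_sha256": "fb3d3d4aaff17f36d28ed6f9572514bbbfe98c1c5a10ee67fd357136b1e62840",
        "cross_cats_sorted": ["cs.AI", "hep-ex", "hep-ph"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2026-05-13T18:00:00Z",
        "title_canon_sha256": "a0dc5dbd64b36286d8835d454857d9099bde67bd2aacd9b160669b966736bfb1",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.13950", "kind": "arxiv", "version": 1},
}


def canonical_sha256(obj) -> str:
    """Serialize with sorted keys and no whitespace, then hash the UTF-8 bytes."""
    payload = json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


digest = canonical_sha256(record)
print(digest)  # compare with the canonical hash listed on this record
```

Because keys are sorted before hashing, the digest does not depend on the order in which a mirror happens to store the fields, which is what lets independent copies recompute the same value.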