Pith Number
pith:2QIRDDAW
pith:2026:2QIRDDAW2NVGA6M5KAVF4K6AH5
Status: not attested · not anchored · not stored · refs resolved
Collider-Bench: Benchmarking AI Agents with Particle Physics Analysis Reproduction
No AI agent reliably beats a physicist when reproducing LHC analyses from public papers alone.
arxiv:2605.13950 v1 · 2026-05-13 · cs.LG · cs.AI · hep-ex · hep-ph
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{2QIRDDAW2NVGA6M5KAVF4K6AH5}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Record completeness
1. Bitcoin timestamp
2. Internet Archive
3. Author claim
4. Citations
5. Replications
✓ Portable graph bundle live · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same
current state with the deterministic merge algorithm.
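The page does not spell out the merge algorithm, so the following is only a generic sketch of how a mirror could fold a set of events into the same state regardless of arrival order. The event fields (`at`, `kind`, `value`), the `merge_state` function, and the shape of the resulting state are all hypothetical, and signature checking is deliberately left out.

```python
import hashlib
import json


def event_id(event: dict) -> str:
    """Content-address an event so identical copies collapse to one ID."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(payload).hexdigest()


def merge_state(canonical_record: dict, *event_logs: list) -> dict:
    """Fold events into a state that does not depend on arrival order.

    Hypothetical event shape: {"at": ISO timestamp, "kind": str, "value": ...}.
    """
    # Deduplicate across mirrors by content hash.
    unique = {event_id(e): e for log in event_logs for e in log}
    # Impose a total order: timestamp first, content hash as tie-breaker.
    ordered = sorted(unique.items(), key=lambda kv: (kv[1]["at"], kv[0]))
    state = {"record": canonical_record, "flags": {}}
    for _, event in ordered:
        state["flags"][event["kind"]] = event["value"]  # last writer wins
    return state


# Two mirrors that saw overlapping events in different orders.
record = {"source": {"id": "2605.13950", "kind": "arxiv", "version": 1}}
mirror_a = [
    {"at": "2026-05-17T23:39:13Z", "kind": "refs", "value": "resolved"},
    {"at": "2026-05-18T08:00:00Z", "kind": "stored", "value": False},
]
mirror_b = [
    {"at": "2026-05-18T08:00:00Z", "kind": "stored", "value": False},
    {"at": "2026-05-17T23:39:13Z", "kind": "refs", "value": "resolved"},
]
state_ab = merge_state(record, mirror_a, mirror_b)
state_ba = merge_state(record, mirror_b, mirror_a)
print(state_ab == state_ba)
```

The key design point is that ordering comes from the events themselves (timestamp plus content hash), never from the order in which a mirror happened to receive them.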
Claims
C1 · strongest claim
Our results show that on average no agent reliably beats the physicist-in-the-loop solution.
C2 · weakest assumption
That published papers and public software contain enough information for agents to fill gaps via physical reasoning and trial-and-error without access to internal experimental details.
C3 · one line summary
Collider-Bench is a new benchmark showing that current LLM agents cannot reliably reproduce LHC analyses at the level of a physicist-in-the-loop.
References
[1] T. Plehn, D. Schiller, N. Schmal. MadAgents. arXiv:2601.21015 (2026)
[2] An End-to-end Architecture for Collider Physics and Beyond
[3] The FERMIACC: Agents for Particle Theory
[4] A comprehensive guide to the physics and usage of PYTHIA 8.3
[5] DELPHES 3, A modular framework for fast simulation of a generic collider experiment
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-17T23:39:13.741609Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) |
| Schema | pith-number/v1.0 |
Canonical hash
d411118c16d36a60799d502a5e2bc03f45206ba2daa02629559326cece12ecb6
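The identifier at the top of this record appears to be derived from this hash: Base32-encoding the SHA-256 digest (RFC 4648 alphabet) and keeping the first 26 characters reproduces the Pith Number, and the first 8 characters give the short form. This relationship is inferred from the two values shown on this page, not from any documented specification, and the function name `pith_number` is illustrative.

```python
import base64


def pith_number(canonical_hash_hex: str, length: int = 26) -> str:
    """Base32-encode the raw digest and keep the leading characters.

    Assumption: the truncation lengths (26 and 8) are read off this page.
    """
    digest = bytes.fromhex(canonical_hash_hex)
    return base64.b32encode(digest).decode("ascii")[:length]


CANONICAL_HASH = "d411118c16d36a60799d502a5e2bc03f45206ba2daa02629559326cece12ecb6"

print(pith_number(CANONICAL_HASH))     # 2QIRDDAW2NVGA6M5KAVF4K6AH5
print(pith_number(CANONICAL_HASH, 8))  # 2QIRDDAW
```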
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/2QIRDDAW2NVGA6M5KAVF4K6AH5 \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: d411118c16d36a60799d502a5e2bc03f45206ba2daa02629559326cece12ecb6
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "fb3d3d4aaff17f36d28ed6f9572514bbbfe98c1c5a10ee67fd357136b1e62840",
    "cross_cats_sorted": [
      "cs.AI",
      "hep-ex",
      "hep-ph"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-13T18:00:00Z",
    "title_canon_sha256": "a0dc5dbd64b36286d8835d454857d9099bde67bd2aacd9b160669b966736bfb1"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.13950",
    "kind": "arxiv",
    "version": 1
  }
}
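The same check can be run offline against the record printed above. This is a minimal sketch that mirrors the serialization used in the shell command (sorted keys, compact separators, no ASCII escaping, UTF-8 bytes). It assumes the JSON shown on this page is exactly what the API returns under `canonical_record`; the names `canonical_hash`, `RECORD`, and `EXPECTED` are illustrative.

```python
import hashlib
import json


def canonical_hash(record: dict) -> str:
    """SHA-256 over the compact, key-sorted JSON serialization."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# Record copied from this page (assumed identical to the API response).
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "fb3d3d4aaff17f36d28ed6f9572514bbbfe98c1c5a10ee67fd357136b1e62840",
        "cross_cats_sorted": ["cs.AI", "hep-ex", "hep-ph"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2026-05-13T18:00:00Z",
        "title_canon_sha256": "a0dc5dbd64b36286d8835d454857d9099bde67bd2aacd9b160669b966736bfb1",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.13950", "kind": "arxiv", "version": 1},
}
EXPECTED = "d411118c16d36a60799d502a5e2bc03f45206ba2daa02629559326cece12ecb6"

digest = canonical_hash(RECORD)
print(digest)
print("matches expected hash:", digest == EXPECTED)
```

Because keys are sorted before hashing, the result does not depend on the order in which fields appear in the source JSON; only the values and the key names matter.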