pith:TWV2C5X6
Evaluating Relational Reasoning in LLMs with REL
Frontier LLMs show steady performance drops on relational tasks as the number of entities that must bind together increases, even with fixed total entities and extra compute.
arxiv:2604.12176 v2 · 2026-04-14 · cs.AI
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{TWV2C5X66W4PZLZE2BGYUL7XZ3}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Across frontier LLMs, performance degrades consistently and monotonically as relational complexity (RC) increases, even when the total number of entities is held fixed. This failure mode persists with increased test-time compute and in-context learning, suggesting a limitation tied to the arity of the required relational binding rather than to insufficient inference steps or lack of exposure to examples.
This conclusion assumes that the generative tasks in REL truly isolate relational complexity (the arity of binding) without introducing uncontrolled confounders in input structure, vocabulary, or task framing that could instead explain the performance drop.
LLMs show consistent performance degradation on higher-arity relational reasoning tasks in REL, a new benchmark that isolates relational complexity across scientific domains.
Receipt and verification
| First computed | 2026-06-03T01:05:13.536512Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
9daba176fef5b8fcaf24d04d8a2ff7cef68d180e20b89c23839d79d1829eb18e
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/TWV2C5X66W4PZLZE2BGYUL7XZ3 \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 9daba176fef5b8fcaf24d04d8a2ff7cef68d180e20b89c23839d79d1829eb18e
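The same canonicalization can be sketched as a standalone Python function, which may be easier to reuse than the one-liner. This is a minimal sketch that mirrors the settings visible in the command above (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 encoding); the function name `canonical_sha256` and the sample dictionaries are illustrative assumptions, not part of any Pith tooling.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record the way the shell one-liner does (hypothetical helper)."""
    # Sorted keys and compact separators make the serialization independent
    # of key order and whitespace in the JSON that was received.
    canonical = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two illustrative fragments with the same content but different key order.
a = {"schema_version": "1.0",
     "source": {"id": "2604.12176", "kind": "arxiv", "version": 2}}
b = {"source": {"version": 2, "kind": "arxiv", "id": "2604.12176"},
     "schema_version": "1.0"}

digest_a = canonical_sha256(a)
digest_b = canonical_sha256(b)
print(digest_a == digest_b)  # key order does not affect the digest
```

To check a real record, pass the parsed `canonical_record` object to the function and compare the result with the canonical hash listed on this page.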
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "55d280e5ff2c9ff860bb26d96a76c6a5e20046c66f90111525e574df15e7e675",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2026-04-14T01:07:15Z",
    "title_canon_sha256": "a851621559ef56051cf7edbbffbc879b3072759f2dd251310457fead204c9472"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2604.12176",
    "kind": "arxiv",
    "version": 2
  }
}