Pith Number

pith:VS7FSX64

pith:2026:VS7FSX64WHQIURZFLXCWM6ZMSF
not attested · not anchored · not stored · refs resolved

Neural Activation Patterns Across Language Model Architectures: A Comprehensive Analysis of Cognitive Task Performance

Faezeh Ghaderi, Mahdi Naser-Moghadasi

Mathematical reasoning produces the highest attention entropy across language model architectures.

arxiv:2605.15436 v1 · 2026-05-14 · cs.CL · cs.LG

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{VS7FSX64WHQIURZFLXCWM6ZMSF}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

Our analysis of 144 task-model combinations demonstrates that mathematical reasoning consistently produces the highest attention entropy across all architectures, while decoder models exhibit significantly higher sparsity patterns compared to encoder models.

C2 · weakest assumption

The twelve cognitive task categories and the chosen measurement definitions (final activation values, attention entropy, sparsity) are assumed to capture meaningful and comparable computational differences without substantial confounding from task formulation or model-specific tokenization effects.
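The two distribution-level measurements named above can be illustrated with a minimal sketch. It assumes attention entropy means the Shannon entropy (in nats) of a single attention distribution and sparsity means the fraction of activations whose magnitude falls at or below a small threshold. Both definitions, the function names, and the `threshold` default are assumptions for illustration; the record does not state the paper's exact formulas.

```python
import math


def attention_entropy(weights):
    """Shannon entropy (nats) of one attention distribution.

    Assumed definition: weights are non-negative and are normalized
    to sum to 1 before the entropy is taken.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("attention weights must have a positive sum")
    probs = [w / total for w in weights]
    # Zero-probability entries contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probs if p > 0)


def mean_attention_entropy(rows):
    """Average entropy over several attention rows (e.g. one per query token)."""
    return sum(attention_entropy(row) for row in rows) / len(rows)


def sparsity(activations, threshold=1e-6):
    """Fraction of activations with magnitude at or below the threshold.

    The threshold is a hypothetical choice, not a value from the paper.
    """
    near_zero = sum(1 for a in activations if abs(a) <= threshold)
    return near_zero / len(activations)
```

Under these assumed definitions, attention spread uniformly over n tokens reaches the maximum entropy ln(n), while attention concentrated on a single token gives 0. This is why the weakest-assumption note matters: sequence length and tokenization change n, and therefore the attainable entropy range.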

C3one line summary

Analysis of 144 task-model pairs finds mathematical reasoning produces the highest attention entropy in all architectures while decoder models show significantly higher sparsity than encoders.

References

50 extracted · 50 resolved · 10 Pith anchors

[1] Llama 2: Open Foundation and Fine-Tuned Chat Models, 2023 · arXiv:2307.09288
[2] A. Q. Jiang et al., "Mistral 7B," arXiv preprint, 2023 · arXiv:2310.06825
[3] J. Devlin, M. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," in Proceedings of NAACL-HLT, 2019, pp. 4171-4186
[4] Radford et al., "Language models are unsupervised multitask learners," OpenAI blog, 2019
[5] Qwen Technical Report, 2023 · arXiv:2309.16609
Receipt and verification
First computed 2026-05-20T00:00:58.510920Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

acbe595fdcb1e08a47255dc5667b2c914dacfcd6cb65d6dd1705b2e6f51d185a

Aliases

arxiv: 2605.15436 · arxiv_version: 2605.15436v1 · doi: 10.48550/arxiv.2605.15436 · pith_short_12: VS7FSX64WHQI · pith_short_16: VS7FSX64WHQIURZF · pith_short_8: VS7FSX64
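The values on this page suggest how the identifiers relate: the 26-character Pith Number matches the leading characters of the RFC 4648 Base32 encoding of the canonical hash, and each `pith_short_N` alias is its first N characters. The sketch below reproduces that relationship. It is an observation from this record only, not a documented derivation, and the function name and `length` parameter are hypothetical.

```python
import base64

# Canonical hash as printed on this page.
CANONICAL_HASH = "acbe595fdcb1e08a47255dc5667b2c914dacfcd6cb65d6dd1705b2e6f51d185a"


def pith_number_from_hash(hex_digest, length=26):
    """Base32-encode the raw digest bytes and keep the first `length` characters.

    Assumption: inferred from the values on this page, not from a specification.
    """
    raw = bytes.fromhex(hex_digest)
    return base64.b32encode(raw).decode("ascii")[:length]


full = pith_number_from_hash(CANONICAL_HASH)
# Short aliases appear to be plain prefixes of the full number.
aliases = {f"pith_short_{n}": full[:n] for n in (8, 12, 16)}

print(full)
print(aliases)
```

If the relationship holds in general, a client could check a short alias against a full Pith Number with a simple prefix comparison, without a network call.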
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/VS7FSX64WHQIURZFLXCWM6ZMSF \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: acbe595fdcb1e08a47255dc5667b2c914dacfcd6cb65d6dd1705b2e6f51d185a
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "323ee04d9b2ad9d88a7635b939cf925e81804fdeac22020879fc7707e7109867",
    "cross_cats_sorted": [
      "cs.LG"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-05-14T21:31:19Z",
    "title_canon_sha256": "45db951a6c79865d598c4d8bd1179df722501ed8c78f2d4a3dec3ec011210a1c"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.15436",
    "kind": "arxiv",
    "version": 1
  }
}
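The verification one-liner can also be unpacked into an offline sketch that canonicalizes the record shown above and hashes it. It uses the same `json.dumps` settings as the one-liner (sorted keys, compact separators, no ASCII escaping). The function names are hypothetical, and the sketch does not assert that the printed digest equals the canonical hash, since that depends on the served record matching this JSON byte for byte.

```python
import hashlib
import json


def canonical_bytes(record):
    """Serialize with sorted keys and no extra whitespace, then encode as UTF-8."""
    text = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


def canonical_sha256(record):
    """Hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


# The canonical record as displayed on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "323ee04d9b2ad9d88a7635b939cf925e81804fdeac22020879fc7707e7109867",
        "cross_cats_sorted": ["cs.LG"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2026-05-14T21:31:19Z",
        "title_canon_sha256": "45db951a6c79865d598c4d8bd1179df722501ed8c78f2d4a3dec3ec011210a1c",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.15436", "kind": "arxiv", "version": 1},
}

# Compare this value by hand against the canonical hash listed on the page.
print(canonical_sha256(record))
```

Sorting keys is what makes the digest independent of the order in which a server or mirror happens to emit the fields.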