Pith Number

pith:G4F5OO4Y

pith:2026:G4F5OO4YXFUCNXVAQRTIGUDQAC
not attested · not anchored · not stored · refs pending

Modularized Reinforcement Learning on LLMs: From MDP Creation to Exploration and Learning

Annie Wong, Aske Plaat, Chao Gao, Filip Ilievski, Hengyuan Zhang, Jacob E. Kooi, Jiayang Shi, Kevin Qiu, Lincen Yang, Mark Hoogendoorn, Ngai Wong, Qi Huang, Shiping Yang, Shujian Yu, Ting-Chih Chen, Vincent François-Lavet, Xinrui Zu, Yuxuan Jiang, Zhaochun Ren, Zhao Yang, Zhong Li

arxiv:2606.21943 v1 · 2026-06-20 · cs.LG · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{G4F5OO4YXFUCNXVAQRTIGUDQAC}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
Receipt and verification
First computed: 2026-06-23T02:13:03.789242Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05) · public key
Schema: pith-number/v1.0

Canonical hash

370bd73b98b96826dea0846683507000bc7f2b6f936ecb3b0903cddde45723cf

Aliases

arxiv: 2606.21943 · arxiv_version: 2606.21943v1 · doi: 10.48550/arxiv.2606.21943 · pith_short_12: G4F5OO4YXFUC · pith_short_16: G4F5OO4YXFUCNXVA · pith_short_8: G4F5OO4Y
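The short aliases are plain prefixes (8, 12 and 16 characters) of the full 26-character identifier. The identifier itself also appears to follow from the canonical hash: base32-encoding the 32 hash bytes with the RFC 4648 alphabet and keeping the first 26 characters reproduces it for this record. The page does not document that derivation, so the sketch below is an observation about this one record, not the builder's specification; the function names `pith_from_hash` and `short_aliases` are hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "370bd73b98b96826dea0846683507000bc7f2b6f936ecb3b0903cddde45723cf"
PITH_NUMBER = "G4F5OO4YXFUCNXVAQRTIGUDQAC"


def pith_from_hash(hex_digest: str, length: int = 26) -> str:
    """Assumed derivation: base32 of the raw digest bytes, truncated to `length` characters."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_aliases(pith_number: str) -> dict:
    """Build the pith_short_N aliases as simple prefixes of the full identifier."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


if __name__ == "__main__":
    print(pith_from_hash(CANONICAL_HASH))  # G4F5OO4YXFUCNXVAQRTIGUDQAC
    print(short_aliases(PITH_NUMBER))
```

If the assumption holds for other records, a mirror could check that an identifier and its hash belong together without contacting the service; that remains untested beyond this record.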
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/G4F5OO4YXFUCNXVAQRTIGUDQAC \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 370bd73b98b96826dea0846683507000bc7f2b6f936ecb3b0903cddde45723cf
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "d94d6465d99fa3adb141f952e9a72fcf97f44cee182aa975873790cc219214f2",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-06-20T08:20:41Z",
    "title_canon_sha256": "6c73c6288d7418e1629c69ea6e0ab32845916f08a5dc94c6bfbc9d1ffe94bad6"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2606.21943",
    "kind": "arxiv",
    "version": 1
  }
}
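
The same check can be run offline against a saved copy of the record. This is a minimal Python sketch, assuming the canonical form is exactly what the verification one-liner produces (sorted keys, compact separators, no ASCII escaping, UTF-8 bytes). The helper name `canonical_sha256` is hypothetical, and the digest for this record has not been recomputed here, so the script reports a match rather than asserting one.

```python
import hashlib
import json

# Canonical record as shown on this page.
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "d94d6465d99fa3adb141f952e9a72fcf97f44cee182aa975873790cc219214f2",
    "cross_cats_sorted": ["cs.AI"],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-06-20T08:20:41Z",
    "title_canon_sha256": "6c73c6288d7418e1629c69ea6e0ab32845916f08a5dc94c6bfbc9d1ffe94bad6"
  },
  "schema_version": "1.0",
  "source": {"id": "2606.21943", "kind": "arxiv", "version": 1}
}
"""

# Hash stated on this page.
EXPECTED = "370bd73b98b96826dea0846683507000bc7f2b6f936ecb3b0903cddde45723cf"


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then SHA-256 the UTF-8 bytes."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


if __name__ == "__main__":
    digest = canonical_sha256(json.loads(RECORD_JSON))
    print(digest)
    print("matches expected" if digest == EXPECTED else "does not match expected")
```

Because keys are sorted before hashing, the digest does not depend on the key order or whitespace of the saved file.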