Pith Number

pith:WOO7ROXY

pith:2026:WOO7ROXYHKUZGUAO4TMTAY3T5S

Status: not attested · not anchored · not stored · refs resolved

CO-MAP: A Reinforcement Learning Approach to the Qubit Allocation Problem

Ankit Kulshrestha, Xiaoyuan Liu

A reinforcement learning policy trained on a combinatorial formulation cuts SWAP overhead by 65-85 percent on standard quantum circuit benchmarks.

arxiv:2605.13638 v1 · 2026-05-13 · quant-ph · cs.LG

Add to your LaTeX paper

\usepackage{pith}
\pithnumber{WOO7ROXYHKUZGUAO4TMTAY3T5S}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1. Bitcoin timestamp
2. Internet Archive
3. Author claim: open
4. Citations: open
5. Replications: open

✓ Portable graph bundle: live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
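What a deterministic merge gives a mirror can be illustrated with a small sketch. The event fields (`event_id`, `timestamp`, `field`, `value`), the sample events, and the last-writer-wins rule are all hypothetical, chosen only for illustration; this page does not describe the actual Pith event schema or merge algorithm. The idea shown is that deduplicating events and putting them in a total order before folding makes the result independent of arrival order.

```python
from typing import Iterable

# Hypothetical event shape; the real bundle's event schema is not given on this page.
Event = dict  # keys: "event_id", "timestamp", "field", "value"


def merge_events(events: Iterable[Event]) -> dict:
    """Fold events into a state dict, independent of input order."""
    # Deduplicate by event_id (assumes events sharing an id are identical).
    unique = {e["event_id"]: e for e in events}
    # Total order: timestamp first, event_id as tie-breaker.
    # ISO 8601 strings in one fixed format sort chronologically.
    ordered = sorted(unique.values(), key=lambda e: (e["timestamp"], e["event_id"]))
    state: dict = {}
    for event in ordered:
        state[event["field"]] = event["value"]  # last writer wins (assumption)
    return state


# Made-up events for illustration only.
events = [
    {"event_id": "e1", "timestamp": "2026-05-18T02:44:17Z", "field": "label", "value": "draft"},
    {"event_id": "e2", "timestamp": "2026-05-19T00:00:00Z", "field": "label", "value": "final"},
    {"event_id": "e3", "timestamp": "2026-05-18T03:00:00Z", "field": "note", "value": "mirrored"},
]

# Any arrival order, with or without replayed events, yields the same state.
shuffled_with_duplicate = list(reversed(events + events[:1]))
assert merge_events(events) == merge_events(shuffled_with_duplicate)
```

A real implementation would also verify each event's signature before folding it in; that step is omitted here.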

Claims

C1 · strongest claim

Our trained policy achieves a 65-85% reduction in SWAP overhead compared to existing quantum compilers on real-world benchmark sets such as MQTBench and Queko circuits.

C2 · weakest assumption

That the RL policy, trained on the reported datasets, generalizes to unseen circuits without overfitting and that the measured SWAP reductions are not artifacts of benchmark selection or baseline implementation details.

C3 · one-line summary

Reinforcement learning policy for qubit mapping reduces SWAP overhead by 65-85% versus standard quantum compilers on MQTBench and Queko benchmark circuits.

References

42 extracted · 42 resolved · 11 Pith anchors

[1] Layer Normalization 2016 · arXiv:1607.06450

[2] Neural Combinatorial Optimization with Reinforcement Learning 2016 · arXiv:1611.09940

[3] Machine learning for combinatorial optimization: a methodological tour d’horizon 2021 · European Journal of Operational Research, 290(2):405–421

[4] RL4CO: an Extensive Reinforcement Learning for Combinatorial Optimization Benchmark 2025

[5] Quantum Compiler Optimizations 2012 · arXiv:1206.3348

Receipt and verification

First computed	2026-05-18T02:44:17.632912Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

b39df8baf83aa993500ee4d9306373eca49ea86a5546635aabb06674a4b63f73

Aliases

arxiv: 2605.13638 · arxiv_version: 2605.13638v1 · doi: 10.48550/arxiv.2605.13638 · pith_short_12: WOO7ROXYHKUZ · pith_short_16: WOO7ROXYHKUZGUAO · pith_short_8: WOO7ROXY
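The identifiers on this page are consistent with a simple relationship: the full Pith Number matches the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash, and each `pith_short_N` alias is its first N characters. Whether the builder actually derives them this way is an assumption; the sketch below only checks that the published values line up. The function names are illustrative.

```python
import base64

CANONICAL_HASH = "b39df8baf83aa993500ee4d9306373eca49ea86a5546635aabb06674a4b63f73"
PITH_NUMBER = "WOO7ROXYHKUZGUAO4TMTAY3T5S"


def pith_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode a SHA-256 hex digest and keep the first `length` characters."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_aliases(pith_number: str) -> dict:
    """Short forms are plain prefixes of the full identifier."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


print(pith_from_hash(CANONICAL_HASH))  # WOO7ROXYHKUZGUAO4TMTAY3T5S
print(short_aliases(PITH_NUMBER))
```

Because the short forms are prefixes, a shorter alias is more likely to collide with another record than a longer one.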

Agent API

Resolver JSON · Graph JSON · Events JSON · Schema · Signing key

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/WOO7ROXYHKUZGUAO4TMTAY3T5S \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: b39df8baf83aa993500ee4d9306373eca49ea86a5546635aabb06674a4b63f73
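The canonicalization step inside the one-liner above can be pulled out into a small function for offline use. This is a sketch that mirrors the same `json.dumps` arguments (sorted keys, compact separators, no ASCII escaping) followed by SHA-256 over the UTF-8 bytes; the function name `canonical_sha256` is illustrative. The two inputs below are fragments of a record, so their digest is not the canonical hash of this Pith Number.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialize a record deterministically, then hash the UTF-8 bytes."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode()  # str.encode() defaults to UTF-8
    return hashlib.sha256(canonical).hexdigest()


# Key order and whitespace in the input do not matter, because keys are
# sorted and separators are fixed on output.
a = {"schema_version": "1.0", "source": {"id": "2605.13638", "kind": "arxiv", "version": 1}}
b = {"source": {"version": 1, "kind": "arxiv", "id": "2605.13638"}, "schema_version": "1.0"}
assert canonical_sha256(a) == canonical_sha256(b)
```

Sorting keys and fixing separators is what lets two parties who parsed the same JSON differently still arrive at the same digest.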

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "3b4f197186bdb990a03b950dcd2c0fa04d6b9d1c6b22dcefac0a1cc96e5c0229",
    "cross_cats_sorted": [
      "cs.LG"
    ],
    "license": "http://creativecommons.org/licenses/by-nc-nd/4.0/",
    "primary_cat": "quant-ph",
    "submitted_at": "2026-05-13T15:04:09Z",
    "title_canon_sha256": "6ff6a47e2ad3c715684d29bbbaa7f2de4cf3c282285c5a9b7e70805df2e82fcd"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.13638",
    "kind": "arxiv",
    "version": 1
  }
}
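The arXiv-related entries in the Aliases section follow directly from the `source` block of this record. As a minimal sketch (the helper name `arxiv_aliases` is hypothetical, and whether the service computes aliases this way is an assumption), they can be rebuilt like this:

```python
def arxiv_aliases(source: dict) -> dict:
    """Build arXiv-related aliases from a record's `source` block."""
    if source["kind"] != "arxiv":
        raise ValueError("only arXiv sources are handled in this sketch")
    arxiv_id = source["id"]
    return {
        "arxiv": arxiv_id,
        "arxiv_version": f"{arxiv_id}v{source['version']}",
        "doi": f"10.48550/arxiv.{arxiv_id}",
    }


source = {"id": "2605.13638", "kind": "arxiv", "version": 1}
print(arxiv_aliases(source))
```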