Pith Number

pith:JM5BIDNV

pith:2026:JM5BIDNVCP3RE4ITLUZ6JFKK5O
not attested · not anchored · not stored · refs pending

Beyond Localization: A Comprehensive Diagnosis of Perspective-Conditioned Spatial Reasoning in MLLMs from Omnidirectional Images

Ioannis Patras, Jiaxing Li, Wai Keung Wong, Xu Zheng, Yuangong Chen ((1) The Hong Kong Polytechnic University, (2) Guangzhou University, (3) Queen Mary University of London, (4) HKUST (Guangzhou))

Multimodal large language models show a large gap between perception and perspective-conditioned spatial reasoning on omnidirectional images.

arxiv:2605.12413 v3 · 2026-05-12 · cs.CV

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{JM5BIDNVCP3RE4ITLUZ6JFKK5O}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
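As a rough illustration of why a deterministic merge lets any mirror arrive at the same state, the sketch below deduplicates events by ID, sorts them into a total order, and folds them into a state dictionary. It is not Pith's actual merge algorithm: the event fields (`id`, `ts`, `field`, `value`) and the last-writer-wins rule are assumptions made up for this example.

```python
from typing import Any, Dict, Iterable, List

Event = Dict[str, Any]  # hypothetical shape: {"id", "ts", "field", "value"}


def merge_events(*event_sets: Iterable[Event]) -> Dict[str, Any]:
    """Fold events from any number of mirrors into one state.

    Assumes an event ID uniquely identifies its content, so duplicates
    received from several mirrors can be collapsed safely.
    """
    unique: Dict[str, Event] = {}
    for events in event_sets:
        for event in events:
            unique[event["id"]] = event

    # Sorting by (timestamp, id) gives a total order that does not depend
    # on the order in which events arrived.
    ordered: List[Event] = sorted(unique.values(), key=lambda e: (e["ts"], e["id"]))

    state: Dict[str, Any] = {}
    for event in ordered:
        state[event["field"]] = event["value"]  # last writer wins (assumed rule)
    return state


# Hypothetical events, for illustration only.
e1 = {"id": "e1", "ts": "2026-05-20T00:05:47Z", "field": "author_claim", "value": "open"}
e2 = {"id": "e2", "ts": "2026-05-21T09:00:00Z", "field": "author_claim", "value": "claimed"}

print(merge_events([e1], [e2]) == merge_events([e2, e1], [e1]))
```

Because the result depends only on the set of events and not on their arrival order, two mirrors holding the same events compute the same state.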

Claims

C1 strongest claim

PCSR is a key bottleneck in current MLLMs, with limited but meaningful room for recovery under targeted optimization.

C2 weakest assumption

The eight tasks in PCSR-Bench accurately isolate perspective-conditioned spatial reasoning without confounding effects from omnidirectional projection artifacts or question-generation biases.

C3 one-line summary

A new benchmark shows that MLLMs reach 13% accuracy or lower on advanced perspective-conditioned spatial tasks in omnidirectional images, while RL reward shaping raises a 7B model from 31% to 60% in controlled settings.

Formal links

1 machine-checked theorem link

Receipt and verification
First computed: 2026-05-20T00:05:47.312094Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05) · public key
Schema: pith-number/v1.0

Canonical hash

4b3a140db513f71271135d33e4954aebb35e09ab2ad406268bff8aa860c39f3b

Aliases

arxiv: 2605.12413 · arxiv_version: 2605.12413v3 · doi: 10.48550/arxiv.2605.12413 · pith_short_12: JM5BIDNVCP3R · pith_short_16: JM5BIDNVCP3RE4IT · pith_short_8: JM5BIDNV
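The identifiers on this page appear to be related in a simple way: the 26-character Pith Number matches the first 26 characters of the Base32 encoding of the canonical hash, and each `pith_short_N` alias is its first N characters. The sketch below reproduces that pattern for this record. The relationship is inferred from the values shown here, not taken from a published specification, and the function names are hypothetical.

```python
import base64

# Canonical hash as listed on this page.
CANONICAL_HASH = "4b3a140db513f71271135d33e4954aebb35e09ab2ad406268bff8aa860c39f3b"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes and keep the first `length` characters.

    Assumption: this mirrors how the identifier on this page relates to
    the hash; it is an observed pattern, not a documented rule.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_aliases(pith_number: str) -> dict:
    """Prefix aliases of 8, 12 and 16 characters, as in the alias list."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


pith = pith_number_from_hash(CANONICAL_HASH)
print(pith)  # JM5BIDNVCP3RE4ITLUZ6JFKK5O
print(short_aliases(pith))
```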
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/JM5BIDNVCP3RE4ITLUZ6JFKK5O \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 4b3a140db513f71271135d33e4954aebb35e09ab2ad406268bff8aa860c39f3b
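The hashing step of the command above can also be run offline in plain Python. This minimal sketch serializes a record with sorted keys and compact separators, exactly as the one-liner does, then takes the SHA-256 of the UTF-8 bytes. The function name `canonical_sha256` is hypothetical, and the sample record is copied from the canonical record JSON shown on this page. Whether it reproduces the listed hash depends on the served record matching that JSON exactly.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record the same way as the shell one-liner above.

    Keys are sorted and whitespace is removed so that logically equal
    records always serialize to the same bytes.
    """
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Canonical record as displayed on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "84b5a50f01efed576da9fd608f88f0f8c8f43236f7ed81fb133dd27bcad04efa",
        "cross_cats_sorted": [],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2026-05-12T17:11:17Z",
        "title_canon_sha256": "90b9c441bba235778d8a67f1d1b094c97502dcde0cda233d2b9396133f16a348",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.12413", "kind": "arxiv", "version": 3},
}

digest = canonical_sha256(record)
print(digest)  # compare against the canonical hash listed on the page
```

Sorting keys before hashing is what makes the check reproducible: two parties can hold the same record with different key order or indentation and still derive the same digest.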
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "84b5a50f01efed576da9fd608f88f0f8c8f43236f7ed81fb133dd27bcad04efa",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2026-05-12T17:11:17Z",
    "title_canon_sha256": "90b9c441bba235778d8a67f1d1b094c97502dcde0cda233d2b9396133f16a348"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.12413",
    "kind": "arxiv",
    "version": 3
  }
}