Pith Number

pith:RZILCBXJ

pith:2026:RZILCBXJKTBWOC34HAXTGATBM6
not attested · not anchored · not stored · refs resolved

Thinking with Patterns: Breaking the Perceptual Bottleneck in Visual Planning via Pattern Induction

Boyuan Xiao, Yao-Xiang Ding, Yichang Jian, Yifei Peng, Zhenyuan Huang

Vision-language models overcome perceptual limits in visual planning by inducing reusable patterns that build accurate internal world models step by step.

arxiv:2605.16848 v1 · 2026-05-16 · cs.CV · cs.AI · cs.CL · cs.LG

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{RZILCBXJKTBWOC34HAXTGATBM6}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

The resulting training-free planning strategy enables VLMs to solve tasks that are far beyond their initial capabilities, at the cost that too many TWI operations would significantly increase the computational overhead; Pattern Inference and Pattern Induction achieve a desirable balance between accuracy and efficiency.

C2 · weakest assumption

That visual patterns can be treated as composite and reusable experts which are autonomously discovered and optimized from experience in a way that directly improves inference efficiency without requiring task-specific retraining or external supervision.

C3 · one-line summary

Pattern Induction discovers reusable visual patterns as experts via online inductive learning, and Pattern Inference uses them to let VLMs perform efficient multi-step visual planning beyond their native capabilities.

References

25 extracted · 25 resolved · 0 Pith anchors

[1] On top of this, we plan the shortest path
[2] If there aren’t any unrevealed grids on the path, the algorithm stops and returns the path
[3] If there are, we check the unrevealed grids one by one from start to goal
[4] If an impassable grid is ever encountered during this process, we immediately go back to step 1 to get another plan
[5] If all checked grids are passable, the algorithm stops and returns the path. The policy-generation procedure works at step (4) above. Each time, it outputs the first unrevealed grid from start to goal. 2025
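
The five extracted steps above read as a plan-and-reveal loop, which can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the 4-connected grid, the breadth-first search, the optimistic treatment of unrevealed grids as passable during planning, and the names `shortest_path`, `plan_with_reveals`, and `reveal` are all hypothetical. `reveal` stands in for whatever procedure inspects a single grid and reports whether it is passable.

```python
from collections import deque


def shortest_path(known, start, goal, rows, cols):
    """BFS on a 4-connected grid. Grids known to be impassable are skipped;
    unrevealed grids are optimistically treated as passable (an assumption)."""
    prev = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nxt[0] < rows and 0 <= nxt[1] < cols
            if inside and nxt not in prev and known.get(nxt) is not False:
                prev[nxt] = cell
                queue.append(nxt)
    return None  # goal unreachable given what is known so far


def plan_with_reveals(start, goal, rows, cols, reveal, known=None):
    """Return (path, known). `known` maps a grid to True (passable) or
    False (impassable); grids absent from it are unrevealed."""
    known = dict(known or {})
    while True:
        # Step 1: plan the shortest path on top of the current knowledge.
        path = shortest_path(known, start, goal, rows, cols)
        if path is None:
            return None, known
        # Step 2: no unrevealed grids on the path, so return it.
        unrevealed = [cell for cell in path if cell not in known]
        if not unrevealed:
            return path, known
        # Step 3: check unrevealed grids one by one, from start to goal.
        for cell in unrevealed:
            known[cell] = reveal(cell)
            if not known[cell]:
                break  # Step 4: impassable grid found, go back to step 1.
        else:
            return path, known  # Step 5: every checked grid was passable.


# Usage on a hypothetical 3x3 map with two walls hidden from the planner.
blocked = {(0, 1), (1, 1)}
path, known = plan_with_reveals(
    start=(0, 0), goal=(0, 2), rows=3, cols=3,
    reveal=lambda cell: cell not in blocked,
)
print(path)
```

Each pass through the loop either returns or reveals at least one new grid, so the sketch terminates on any finite map. Revealing in start-to-goal order matches the note that the policy-generation procedure outputs the first unrevealed grid each time.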

Formal links

2 machine-checked theorem links

Receipt and verification
First computed 2026-05-20T00:03:26.003656Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

8e50b106e954c3670b7c382f330261678e14b3de1f6a250333d961a8291d3d01

Aliases

arxiv: 2605.16848 · arxiv_version: 2605.16848v1 · doi: 10.48550/arxiv.2605.16848 · pith_short_12: RZILCBXJKTBW · pith_short_16: RZILCBXJKTBWOC34 · pith_short_8: RZILCBXJ
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/RZILCBXJKTBWOC34HAXTGATBM6 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 8e50b106e954c3670b7c382f330261678e14b3de1f6a250333d961a8291d3d01
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "b858adf7b6d06eae85694e2be7d7721a31b81425fdeb69410b992b9fce62cf34",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.CL",
      "cs.LG"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2026-05-16T07:12:19Z",
    "title_canon_sha256": "a72e14f0a7f78f09eada04bf03dc36ca0d9d4cf64d871856be334986c0cb6491"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.16848",
    "kind": "arxiv",
    "version": 1
  }
}
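
The same check can be run offline against the canonical record shown above, without `curl` or `jq`. The sketch below mirrors the serialization used in the shell one-liner (`sort_keys=True`, compact separators, `ensure_ascii=False`, UTF-8 bytes, SHA-256); the function name `canonical_sha256` is an assumption made for illustration. The printed digest should be compared by hand with the value under "Canonical hash".

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then hash the UTF-8 bytes."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode()).hexdigest()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "b858adf7b6d06eae85694e2be7d7721a31b81425fdeb69410b992b9fce62cf34",
        "cross_cats_sorted": ["cs.AI", "cs.CL", "cs.LG"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2026-05-16T07:12:19Z",
        "title_canon_sha256": "a72e14f0a7f78f09eada04bf03dc36ca0d9d4cf64d871856be334986c0cb6491",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.16848", "kind": "arxiv", "version": 1},
}

digest = canonical_sha256(record)
print(digest)  # compare with the canonical hash listed on this page
```

Because keys are sorted before hashing, the digest does not depend on the order in which a mirror happens to store the JSON fields.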