Pith Number

pith:6VHA5ZYD

pith:2026:6VHA5ZYDD2XLVMR2INZZKXRP4O
not attested · not anchored · not stored · refs resolved

Where Should Diffusion Enter a Language Model? Geometry-Guided Hidden-State Replacement

Hyoungjoon Lee, Injin Kong, Yohan Jo

Geometry-based proxies on hidden states identify shallow layers where a diffusion bridge can replace the lower prefix of a pretrained transformer while recovering the hidden state rather than tokens.

arxiv:2605.14368 v1 · 2026-05-14 · cs.CL · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{6VHA5ZYDD2XLVMR2INZZKXRP4O}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

Experiments on 8B-scale backbones show that the geometry score predicts effective shallow insertion layers under a fixed bridge-training protocol and that hidden-state recovery improves over continuous diffusion baselines in a diagnostic comparison matching the diffusion/recovery training budget.

C2 · weakest assumption

That geometry-based proxies computed on pretrained hidden states reliably identify layers where a diffusion bridge can be inserted without extensive additional validation or retraining of the upper layers.

C3 · one-line summary

DiHAL uses geometry proxies to pick where to replace the lower layers of a pretrained transformer with a diffusion bridge for hidden-state reconstruction, improving over token-level diffusion baselines on 8B models.

References

49 extracted · 49 resolved · 1 Pith anchor

Formal links

2 machine-checked theorem links

Cited by

1 paper in Pith

Receipt and verification
First computed: 2026-05-17T23:39:07.862219Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05) · public key
Schema: pith-number/v1.0

Canonical hash

f54e0ee7031eaebab23a4373955e2fe386e04b4902e9d11b2dad624dac22c968

Aliases

arxiv: 2605.14368 · arxiv_version: 2605.14368v1 · doi: 10.48550/arxiv.2605.14368 · pith_short_12: 6VHA5ZYDD2XL · pith_short_16: 6VHA5ZYDD2XLVMR2 · pith_short_8: 6VHA5ZYD
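
The identifiers on this page appear to be related in a simple way: base32-encoding the bytes of the canonical hash and keeping the first 26 characters yields the Pith Number shown above, and the short aliases are prefixes of that string. This relation is inferred from the values on this page, not from a published specification, so treat the sketch below as an assumption. The function names are hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "f54e0ee7031eaebab23a4373955e2fe386e04b4902e9d11b2dad624dac22c968"
PITH_NUMBER = "6VHA5ZYDD2XLVMR2INZZKXRP4O"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the digest bytes and keep the first `length` characters.

    Assumption: this matches how the identifier is built; it is only
    checked here against the single record on this page.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_aliases(pith_number: str) -> dict:
    """Return the 8-, 12- and 16-character prefixes used as short aliases."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


derived = pith_number_from_hash(CANONICAL_HASH)
print(derived == PITH_NUMBER)
print(short_aliases(derived))
```

A check of this kind only shows that the identifier is consistent with the hash. It says nothing about whether the hash matches the canonical record, which is what the verification command under Agent API is for.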
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/6VHA5ZYDD2XLVMR2INZZKXRP4O \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: f54e0ee7031eaebab23a4373955e2fe386e04b4902e9d11b2dad624dac22c968
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "d0ad9859bf8621c515bf8fca927e5607f1a53b99b20166c1966626f3c957f736",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-05-14T04:47:54Z",
    "title_canon_sha256": "0c11d0fdd6459862a3b7f9442ae7194fff49d5051e7b403f233bd6554c4db1d3"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.14368",
    "kind": "arxiv",
    "version": 1
  }
}
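
The verification command above serializes the record with sorted keys, compact separators, and non-escaped Unicode before hashing. A minimal offline sketch of the same step, applied to the record printed on this page, might look like the following. The function name `canonical_sha256` is hypothetical. The page states that the result should equal the canonical hash; the sketch prints the digest for comparison and does not assert the match.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record using the serialization from the shell one-liner:
    sorted keys, ',' and ':' separators with no spaces, UTF-8 output."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Canonical record as shown on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "d0ad9859bf8621c515bf8fca927e5607f1a53b99b20166c1966626f3c957f736",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2026-05-14T04:47:54Z",
        "title_canon_sha256": "0c11d0fdd6459862a3b7f9442ae7194fff49d5051e7b403f233bd6554c4db1d3",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.14368", "kind": "arxiv", "version": 1},
}

digest = canonical_sha256(record)
print(digest)  # compare by eye with the canonical hash listed on this page
```

Because keys are sorted before hashing, the digest does not depend on the order in which a mirror stores the fields, while any change to a value produces a different digest.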