{"paper":{"title":"Group Representational Position Encoding","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"GRAPE models positions as group actions on features, recovering RoPE and ALiBi exactly while adding low-cost extensions for cross-feature coupling.","cross_cats":["cs.AI","cs.CL"],"primary_cat":"cs.LG","authors_text":"Andrew Chi-Chih Yao, Huizhuo Yuan, Kangping Xu, Quanquan Gu, Yang Yuan, Yifan Zhang, Yifeng Liu, Zhen Qin, Zixiang Chen","submitted_at":"2025-12-08T18:39:13Z","abstract_excerpt":"We present GRAPE (Group Representational Position Encoding), a unified framework for positional encoding based on group actions. GRAPE unifies two families of mechanisms: (i) multiplicative rotations (Multiplicative GRAPE) in $\\operatorname{SO}(d)$ and (ii) additive logit biases (Additive GRAPE) arising from unipotent actions in the general linear group $\\mathrm{GL}$. In Multiplicative GRAPE, a position $n \\in \\mathbb{Z}$ (or $t \\in \\mathbb{R}$) acts as $\\mathbf{G}(n) = \\exp(n \\, \\omega \\, \\mathbf{L})$ with a rank-2 skew-symmetric generator $\\mathbf{L} \\in \\mathbb{R}^{d \\times d}$, yielding a "},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"GRAPE provides a principled design space for positional geometry in long-context models, subsuming RoPE and ALiBi as special cases.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That the proposed extensions (learned commuting subspaces and compact non-commuting mixtures) will capture useful cross-subspace feature coupling in practice without introducing new optimization difficulties or performance regressions.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"GRAPE unifies RoPE and ALiBi as special cases of group actions on positions, providing a principled design space for positional encodings via SO(d) rotations and GL unipotent transformations.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"GRAPE models positions as group actions on features, recovering RoPE and ALiBi exactly while adding low-cost extensions for cross-feature coupling.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"67e2f0594da038e93aa5f787cac1737a7b49ea4853111269620b9dcd2b94e971"},"source":{"id":"2512.07805","kind":"arxiv","version":6},"verdict":{"id":"7e2fece5-70b1-45d4-b106-30f9543d7e05","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-17T00:00:43.870987Z","strongest_claim":"GRAPE provides a principled design space for positional geometry in long-context models, subsuming RoPE and ALiBi as special cases.","one_line_summary":"GRAPE unifies RoPE and ALiBi as special cases of group actions on positions, providing a principled design space for positional encodings via SO(d) rotations and GL unipotent transformations.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That the proposed extensions (learned commuting subspaces and compact non-commuting mixtures) will capture useful cross-subspace feature coupling in practice without introducing new optimization difficulties or performance regressions.","pith_extraction_headline":"GRAPE models positions as group actions on features, recovering RoPE and ALiBi exactly while adding low-cost extensions for cross-feature coupling."},"references":{"count":41,"sample":[{"doi":"","year":2025,"title":"Round and round we go! what makes rotary positional encodings useful? In International Conference on Learning Representations (ICLR 2025),","work_id":"870f04a1-b332-40d5-a979-59692692e391","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2004,"title":"Round and round we go! what makes rotary positional encodings useful?","work_id":"e8226205-3ecb-4aa3-a204-f042c63750bb","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2009,"title":"Extending Context Window of Large Language Models via Positional Interpolation","work_id":"c8b6df85-e7da-4bd8-90a4-d309cc2a0f60","ref_index":4,"cited_arxiv_id":"2306.15595","is_internal_anchor":true},{"doi":"","year":null,"title":"Contextual position encoding: Learning to count what’s important","work_id":"b3573740-adb9-4ab4-b55e-79456028e161","ref_index":6,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"Transformer language models without positional encodings still learn positional information","work_id":"4ec68a0c-1f00-40be-8d28-278f90317c49","ref_index":7,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":41,"snapshot_sha256":"e5db67abf4f56546c1de092e98be3af59ed5d1f4fb23234a4ec32dfa86dcd0d1","internal_anchors":8},"formal_canon":{"evidence_count":2,"snapshot_sha256":"f710d7659e99f99c228ebd75adfebff4569a4c9f0f3048e3681da7c398dec3b7"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}