{"paper":{"title":"EndPrompt: Efficient Long-Context Extension via Terminal Anchoring","license":"http://creativecommons.org/licenses/by/4.0/","headline":"EndPrompt extends LLM context windows to 64K by training only on short sequences with a terminal prompt anchored at target positions.","cross_cats":[],"primary_cat":"cs.CL","authors_text":"Dawei Yin, Fang Wang, Han Tian, Haoyi Xiong, Jiamin Chen, Jiashu Zhao, Jinman Zhao, Luxuan Chen, Rui Kong, Shuaiqiang Wang, Xinran Chen, Yuchen Li","submitted_at":"2026-05-14T09:00:03Z","abstract_excerpt":"Extending the context window of large language models typically requires training on sequences at the target length, incurring quadratic memory and computational costs that make long-context adaptation expensive and difficult to reproduce. We propose EndPrompt, a method that achieves effective context extension using only short training sequences. The core insight is that exposing a model to long-range relative positional distances does not require constructing full-length inputs: we preserve the original short context as an intact first segment and append a brief terminal prompt as a second s"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"long-context generalization can be induced from sparse positional supervision, challenging the prevailing assumption that dense long-sequence training is necessary for reliable context-window extension.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That assigning positional indices near the target length to a brief terminal prompt in short sequences preserves the necessary relative distances and semantic continuity for effective long-context learning without introducing artifacts from the artificial split.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"EndPrompt induces reliable 
long-context generalization in LLaMA models from sparse positional supervision via a two-segment short-sequence construction with terminal anchoring.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"EndPrompt extends LLM context windows to 64K by training only on short sequences with a terminal prompt anchored at target positions.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"956f40ed766a3a7c86082e6634ac42ae553e431830d2871c7e3df584ff326ee9"},"source":{"id":"2605.14589","kind":"arxiv","version":1},"verdict":{"id":"9d7ff822-5310-44e5-80d4-89377fc88121","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-15T01:26:59.577618Z","strongest_claim":"long-context generalization can be induced from sparse positional supervision, challenging the prevailing assumption that dense long-sequence training is necessary for reliable context-window extension.","one_line_summary":"EndPrompt induces reliable long-context generalization in LLaMA models from sparse positional supervision via a two-segment short-sequence construction with terminal anchoring.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That assigning positional indices near the target length to a brief terminal prompt in short sequences preserves the necessary relative distances and semantic continuity for effective long-context learning without introducing artifacts from the artificial split.","pith_extraction_headline":"EndPrompt extends LLM context windows to 64K by training only on short sequences with a terminal prompt anchored at target positions."},"references":{"count":35,"sample":[{"doi":"","year":2024,"title":"Longalign: A recipe for long context alignment of large language 
models","work_id":"aa055c89-c8e0-44aa-ad75-65f83f864a16","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2023,"title":"LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding","work_id":"ba7831c4-9427-4e0e-a5c1-4e98511f4b53","ref_index":2,"cited_arxiv_id":"2308.14508","is_internal_anchor":true},{"doi":"","year":2020,"title":"Longformer: The Long-Document Transformer","work_id":"abea7a44-6668-4de7-aab6-f53a6e5aa088","ref_index":3,"cited_arxiv_id":"2004.05150","is_internal_anchor":true},{"doi":"","year":2022,"title":"LexGLUE: A Benchmark Dataset for Legal Language Understanding in English. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, 2022","work_id":"012aa679-98a0-4ee5-84f8-069cf54e0fb1","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2023,"title":"L-Eval: Instituting standardized evaluation for long context language models","work_id":"8c5b99fd-55bf-4d88-b148-8bb61e28ee99","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":35,"snapshot_sha256":"ce7bbf1d451921abfe095d0466d9340d67f6b3022fa57d4a5f8864c0facbd299","internal_anchors":10},"formal_canon":{"evidence_count":2,"snapshot_sha256":"60d9ddabfd61ffc3d0686fc1dbee92877d5e22a38ebf902babd40f08109f16c3"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}