LongAlign: A recipe for long context alignment of large language models

Yushi Bai, Xin Lv, Jiajie Zhang, Yuze He, Ji Qi, Lei Hou, Jie Tang, Yuxiao Dong, Juanzi Li · 2024 · arXiv 2401.18058

7 Pith papers cite this work. Polarity classification is still indexing.

7 Pith papers citing it

read on arXiv browse 7 citing papers

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Long-Text-to-Image Generation via Compositional Prompt Decomposition

cs.CV · 2026-04-20 · unverdicted · novelty 7.0

PRISM lets pre-trained text-to-image models handle long prompts by breaking them into compositional parts, predicting noise separately, and merging outputs via energy-based conjunction, matching fine-tuned models while generalizing better to prompts over 500 tokens.

Where Does Long-Context Supervision Actually Go? Effective-Context Exposure Balancing

cs.CL · 2026-05-11 · conditional · novelty 6.0

EXACT re-allocates training supervision by inverse frequency of long effective-context targets, improving NoLiMa and RULER scores by 5-18 points on Qwen and LLaMA models without degrading standard QA or reasoning.

CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer

cs.CV · 2024-08-12 · unverdicted · novelty 6.0

CogVideoX generates coherent 10-second text-to-video outputs at high resolution using a 3D VAE, expert adaptive LayerNorm transformer, progressive training, and a custom data pipeline, claiming state-of-the-art results.

A Decomposition Perspective to Long-context Reasoning for LLMs

cs.CL · 2026-04-09 · unverdicted · novelty 5.0

Decomposing long-context reasoning into atomic skills, synthesizing targeted pseudo-datasets, and applying RL improves LLM performance on long-context benchmarks by an average of 7.7%.

Context Collapse: Barriers to Adoption for Generative AI in Workplace Settings

cs.CY · 2026-04-06 · unverdicted · novelty 5.0

Expert interviews demonstrate that context in generative AI workplace use collapses or rots over time, limiting tool effectiveness and revealing pitfalls in computational context approaches.

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

cs.CL · 2025-02-04 · unverdicted · novelty 5.0

SmolLM2 is a 1.7B-parameter language model that outperforms Qwen2.5-1.5B and Llama3.2-1B after overtraining on 11 trillion tokens using custom FineMath, Stack-Edu, and SmolTalk datasets in a multi-stage pipeline.

EndPrompt: Efficient Long-Context Extension via Terminal Anchoring

cs.CL · 2026-05-14

citing papers explorer

Showing 7 of 7 citing papers.

Long-Text-to-Image Generation via Compositional Prompt Decomposition cs.CV · 2026-04-20 · unverdicted · none · ref 42
PRISM lets pre-trained text-to-image models handle long prompts by breaking them into compositional parts, predicting noise separately, and merging outputs via energy-based conjunction, matching fine-tuned models while generalizing better to prompts over 500 tokens.
Where Does Long-Context Supervision Actually Go? Effective-Context Exposure Balancing cs.CL · 2026-05-11 · conditional · none · ref 2
EXACT re-allocates training supervision by inverse frequency of long effective-context targets, improving NoLiMa and RULER scores by 5-18 points on Qwen and LLaMA models without degrading standard QA or reasoning.
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer cs.CV · 2024-08-12 · unverdicted · none · ref 12
CogVideoX generates coherent 10-second text-to-video outputs at high resolution using a 3D VAE, expert adaptive LayerNorm transformer, progressive training, and a custom data pipeline, claiming state-of-the-art results.
A Decomposition Perspective to Long-context Reasoning for LLMs cs.CL · 2026-04-09 · unverdicted · none · ref 3
Decomposing long-context reasoning into atomic skills, synthesizing targeted pseudo-datasets, and applying RL improves LLM performance on long-context benchmarks by an average of 7.7%.
Context Collapse: Barriers to Adoption for Generative AI in Workplace Settings cs.CY · 2026-04-06 · unverdicted · none · ref 14
Expert interviews demonstrate that context in generative AI workplace use collapses or rots over time, limiting tool effectiveness and revealing pitfalls in computational context approaches.
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model cs.CL · 2025-02-04 · unverdicted · none · ref 150
SmolLM2 is a 1.7B-parameter language model that outperforms Qwen2.5-1.5B and Llama3.2-1B after overtraining on 11 trillion tokens using custom FineMath, Stack-Edu, and SmolTalk datasets in a multi-stage pipeline.
EndPrompt: Efficient Long-Context Extension via Terminal Anchoring cs.CL · 2026-05-14 · unreviewed · ref 1

LongAlign: A recipe for long context alignment of large language models

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer