pith. sign in

Title resolution pending

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

citation-role summary

baseline 1

citation-polarity summary

roles

baseline 1

polarities

baseline 1

representative citing papers

Scaling Diffusion Language Models via Adaptation from Autoregressive Models

cs.CL · 2024-10-23 · conditional · novelty 6.0

Adapting autoregressive models via continual pre-training yields diffusion language models from 127M to 7B parameters that outperform prior diffusion models and compete with their autoregressive counterparts on language, reasoning, and commonsense benchmarks.

citing papers explorer

Showing 4 of 4 citing papers.