The factorization curse: Which tokens you predict underlie the reversal curse and more

Ouail Kitouni, Niklas Nolte, Diane Bouchacourt, Adina Williams, Mike Rabbat, Mark Ibrahim · 2024 · arXiv 2406.05183

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

Large Language Diffusion Models · cs.CL · 2025-02-14 · unverdicted · novelty 8.0

LLaDA is a scalable diffusion-based language model that matches autoregressive LLMs like LLaMA3 8B on tasks and surpasses GPT-4o on reversal poem completion.

citing papers explorer

Showing 1 of 1 citing paper.

Large Language Diffusion Models · cs.CL · 2025-02-14 · unverdicted · none · ref 37
