Language models are unsupervised multitask learners
3 Pith papers cite this work. Polarity classification is still indexing.
2026 · 3 representative citing papers (polarity verdicts pending)
- Chain-based Distillation for Effective Initialization of Variable-Sized Small Language Models
  Chain-based Distillation constructs a chain of anchor models so that variable-sized SLMs can be initialized efficiently by interpolating between adjacent anchors, with bridge distillation for cross-architecture transfer; the resulting models outperform training from scratch.
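The interpolation idea behind this initialization can be sketched in toy form. Everything here — the anchor widths, the `init_by_interpolation` helper, and the blend rule — is an illustrative assumption, not the paper's actual procedure:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical anchor weights: two anchor "models" with hidden widths 64 and 128.
# (Toy stand-ins; the paper's anchors are full SLM checkpoints.)
w_small = rng.normal(size=(64, 64))
w_large = rng.normal(size=(128, 128))

def init_by_interpolation(w_lo, w_hi, width):
    """Initialize a weight matrix of an intermediate width by blending the
    overlapping region of two anchor matrices (a toy sketch of
    interpolation-based initialization, not the paper's exact method)."""
    assert w_lo.shape[0] <= width <= w_hi.shape[0]
    # Interpolation coefficient: where `width` sits between the two anchors.
    t = (width - w_lo.shape[0]) / (w_hi.shape[0] - w_lo.shape[0])
    w = w_hi[:width, :width].copy()            # start from the larger anchor, truncated
    n = w_lo.shape[0]
    w[:n, :n] = (1 - t) * w_lo + t * w_hi[:n, :n]  # blend where both anchors overlap
    return w

w_init = init_by_interpolation(w_small, w_large, 96)
print(w_init.shape)  # (96, 96)
```

The sketch only shows the shape bookkeeping; in the paper's setting the anchors are trained checkpoints and the interpolated weights serve as a warm start rather than the final model.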
- PASA: A Principled Embedding-Space Watermarking Approach for LLM-Generated Text under Semantic-Invariant Attacks
  PASA is a semantic-level watermarking method for LLM text that uses embedding-space clusters and synchronized randomness to remain detectable after paraphrasing while preserving text quality.
- LACE: Lattice Attention for Cross-thread Exploration
  LACE lets concurrent reasoning paths in LLMs interact through lattice attention, trained via a synthetic data pipeline, raising accuracy by more than 7 points over independent parallel search.