Pythia releases 16 identically trained LLMs with full checkpoints and data tools to study training dynamics, scaling, memorization, and bias in language models.
arXiv preprint arXiv:2201.11706
4 Pith papers cite this work.
citing papers explorer
- Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
  Pythia releases 16 identically trained LLMs with full checkpoints and data tools to study training dynamics, scaling, memorization, and bias in language models.
- No Safe Dose: How Training Data Drives Unsafe Image Generation
  Proportion of unsafe images in training data directly increases unsafe outputs in text-to-image models, independent of absolute count, with complementary risk reduction from safer text encoders.
- Bias in the Tails: How Name-conditioned Evaluative Framing in Resume Summaries Destabilizes LLM-based Hiring
  LLM resume summaries exhibit name-conditioned evaluative bias concentrated in distribution tails, transforming directional harm into symmetric instability that may evade conventional fairness audits.
- The Platonic Representation Hypothesis
  Representations learned by large AI models are converging toward a shared statistical model of reality.