AI generates covertly racist decisions about people based on their dialect. Nature, 633:147–154
4 Pith papers cite this work.
Citing papers by year: 2026 (4)
citing papers explorer
- A framework for analyzing concept representations in neural models
  A new framework shows that concept subspaces are not unique and that the choice of estimator affects containment and disentanglement. LEACE works well but generalizes poorly. HuBERT encodes phone information as contained and disentangled from speaker information, while speaker information resists compact containment.
- What Do People Actually Want From AI? Mapping Preference Plurality
  Open-ended preference data reveals substantial plurality in what people want from AI and divergent interpretations of shared values such as truthfulness.
- Do Language Models Pass the Bechdel Test? Auditing Gender Biases in LLM-Generated Screenplays
  Human-written screenplays pass the Bechdel test more often than those generated by GPT-5, Gemini 3 Pro, and Claude Sonnet 4.5, though network analyses show mixed bias patterns across all script types.
- Reducing Political Manipulation with Consistency Training