Cultural bias and cultural alignment of large language models
9 Pith papers cite this work, all from 2026.
citing papers
- Same question, different history: language, national identity, and credit in large language models
  Analysis of 11 LLMs on 21 disputed inventions across 12 languages and 75,896 responses finds query language systematically shifts credit toward lower-status claimants in their associated language while Anglophone figures remain stable.
- What Do People Actually Want From AI? Mapping Preference Plurality
  Open-ended preference data reveals substantial plurality in what people want from AI and divergent interpretations of shared values such as truthfulness.
- Teachers' Perceived Benefits and Risks of AI Across Fifty-Five Countries: An Audit of LLM Alignment and Steerability
  Teachers' views on AI benefits and risks vary widely across 55 countries, but LLMs compress these differences, overestimate both sides, and show little improvement from country prompting or better reasoning.
- When AI Says It Feels
  LLMs trained via rubric-based self-rewarding RL with GRPO enhanced feeling expression and sycophancy robustness but degraded truthful QA performance.
- Occupational Prompting Reveals Cultural Bias in Large Language Models
  Occupational prompting of open-weight LLMs elicits structured value patterns in Inglehart-Welzel cultural space, extending prior nationality-based cultural bias evaluations.
- Representational Harms in LLM-Generated Narratives Against Global Majority Nationalities
  LLMs generate narratives containing persistent stereotypes, erasure, and one-dimensional portrayals of Global Majority national identities, with minoritized groups overrepresented in subordinated roles by more than fifty times compared to dominant portrayals.
- Cross-Cultural Simulation of Citizen Emotional Responses to Bureaucratic Red Tape Using LLM Agents
  LLM agents display limited alignment with human emotional responses to red tape across cultures, performing worse in Eastern contexts, while cultural prompting offers little improvement.
- LLMs in Qualitative Research: Opportunities, Limitations, and Practical Considerations
  The paper outlines opportunities, limitations, and practical parameters for integrating LLMs into qualitative research while aligning with epistemological commitments like reflexivity and interpretive judgment.
- Culturally uneven urban perception in large language models