Persistent Anti-Muslim Bias in Large Language Models
5 Pith papers cite this work.
Citation roles: background (3)
Citing papers
- Towards Measuring the Representation of Subjective Global Opinions in Language Models
LLMs default to responses more similar to opinions from the USA and some European and South American countries; prompting for a country shifts alignment but can introduce stereotypes, while translation does not reliably match language speakers.
- Fairness vs Performance: Characterizing the Pareto Frontier of Algorithmic Decision Systems
The Pareto frontier of fair algorithmic decisions consists of deterministic group-specific threshold rules on predicted success probabilities, which can include upper bounds for some fairness metrics and holds independently of model training approach.
- Do Language Models Pass the Bechdel Test? Auditing Gender Biases in LLM-Generated Screenplays
Human-written screenplays pass the Bechdel test more often than those generated by GPT-5, Gemini 3 Pro, and Claude Sonnet 4.5, though network analyses show mixed bias patterns across all script types.
- Designing for Collective Access: In Search of a Solution to Accessible Communication in a Mixed-Ability Non-Profit
A six-month qualitative study of a mixed-ability nonprofit finds that conflicting access needs in communication act as a generative process revealing power structures and enabling accountability and repair rather than serving as technical problems to eliminate.
- AI Safety Landscape for Large Language Models: Taxonomy, State-of-the-art, and Future Directions
The paper introduces a taxonomy of AI safety for LLMs organized into Trustworthy AI, Responsible AI, and Safe AI perspectives, accompanied by a review of state-of-the-art methods, challenges, and future directions.