A survey on fairness in large language models
8 Pith papers cite this work.
Citation roles: background 3
Citation polarities: background 3
citing papers explorer
- StereoTales: A Multilingual Framework for Open-Ended Stereotype Discovery in LLMs
  StereoTales shows that all tested LLMs emit harmful stereotypes in open-ended stories, with associations adapting to prompt language and targeting locally salient groups rather than transferring uniformly across languages.
- SCOPE: A Dataset of Stereotyped Prompts for Counterfactual Fairness Assessment of LLMs
  SCOPE is a new large-scale dataset of counterfactual prompt pairs for evaluating fairness and stereotype sensitivity in LLMs across 1,438 topics, nine bias dimensions, 1,536 groups, and four communicative intents.
- FairNVT: Improving Fairness via Noise Injection in Vision Transformers
  FairNVT injects calibrated noise into sensitive embeddings of transformer encoders to jointly improve representation-level and prediction-level fairness metrics without degrading task performance.
- Hedging and Non-Affirmation: Quantifying LLM Alignment on Questions of Human Rights
  LLMs exhibit identity-dependent hedging on human rights questions, with group identity as the strongest predictor among tested factors, and group steering mitigates the disparity.
- Resume-ing Control: (Mis)Perceptions of Agency Around GenAI Use in Recruiting Workflows
  Recruiters perceive themselves as retaining agency over GenAI in hiring pipelines, yet GenAI invisibly architects core evaluation inputs, producing only marginal efficiency gains at the cost of deskilling.
- Intersectional Fairness in Large Language Models
  LLMs are more accurate when answers match stereotypes in clear contexts, especially for race-gender combinations, and no tested model shows consistent fairness or reliability across intersectional groups.
- Fairness Testing of Large Language Models in Role-Playing
  Generates 550 roles and 33,000 questions to evaluate 10 LLMs in role-playing, finding 107,580 biased responses.
- A Survey on the Memory Mechanism of Large Language Model based Agents
  A systematic review of memory designs, evaluation methods, applications, limitations, and future directions for LLM-based agents.