arXiv preprint arXiv:2402.10946 (2024)
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts: UNVERDICTED 4
roles: background 1
polarities: background 1
representative citing papers
citing papers explorer
- Geographic Blind Spots in AI Control Monitors: A Cross-National Audit of Claude Opus 4.6
  Claude Opus 4.6 fabricates more answers on Global North AI contexts than Global South ones, creating an exploitable vulnerability in AI control monitors.
- Attributing Culture-Conditioned Generations to Pretraining Corpora
  MEMOed framework attributes LLM generations about cultures to pretraining memorization and finds frequency-based biases across 110 cultures for food and clothing.
- Exploring Cross-lingual Latent Transplantation: Mutual Opportunities and Open Challenges
  XTransplant empirically shows that cross-lingual latent transplantation yields mutual benefits for multilingual capability and cultural adaptability in LLMs, especially low-resource ones, while revealing underutilized model potential.
- Opportunities and Challenges of Large Language Models for Low-Resource Languages in Humanities Research
  This survey paper identifies opportunities for LLMs in low-resource language humanities research along with challenges in data accessibility, model adaptability, and cultural sensitivity.