Introduces a gradient-based multilingual audit framework for LLM moral decisions in robot assistance scenarios and reports persistent culturally asymmetric gradient tracking failures not fixed by prompting.
3 Pith papers cite this work.
Years: 2026 (3)
Representative citing papers
- Auditing LLM-Governed Social Robots with Culture-Specific Moral Gradients
  Introduces a gradient-based multilingual audit framework for LLM moral decisions in robot assistance scenarios and reports persistent culturally asymmetric gradient tracking failures not fixed by prompting.
- Training-Free Cultural Alignment of Large Language Models via Persona Disagreement
  DISCA converts within-country disagreement among World Values Survey personas into a bounded logit correction that reduces cultural misalignment by 10-24% on MultiTP for models 3.8B and larger across 20 countries, without any weight updates.
- Normative Common Ground Replication (NormCoRe): Replication-by-Translation for Studying Norms in Multi-Agent AI
  NormCoRe is a replication-by-translation framework that maps human subject studies onto multi-agent AI environments, showing AI normative judgments on fairness differ from human baselines and vary with model choice and persona language.