LLMs show accuracy drops of 0.3% to 5.9% on GSM8K math problems when the problems are culturally adapted to six countries while the underlying math operations are kept identical; McNemar tests confirm that the drops are statistically significant.
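For readers unfamiliar with McNemar's test, the sketch below shows how a paired comparison of this kind can be computed. It is a minimal illustration, not the paper's code: it assumes the exact (binomial) two-sided variant of the test, and the function name `mcnemar_exact` and the per-problem correctness lists are hypothetical, made-up inputs.

```python
from math import comb


def mcnemar_exact(original_correct, adapted_correct):
    """Exact two-sided McNemar test on paired per-problem correctness flags.

    Hypothetical helper for illustration; each list holds one boolean per
    problem, in the same problem order for both versions.
    """
    pairs = list(zip(original_correct, adapted_correct))
    # b: solved in the original wording but missed after cultural adaptation.
    b = sum(1 for o, a in pairs if o and not a)
    # c: missed in the original wording but solved after adaptation.
    c = sum(1 for o, a in pairs if a and not o)
    n = b + c
    if n == 0:
        # No discordant pairs: the data give no evidence of a difference.
        return b, c, 1.0
    # Under the null hypothesis, each discordant pair falls in b or c with
    # probability 1/2, so min(b, c) follows a Binomial(n, 0.5) lower tail.
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Made-up example: 12 problems, answered once per version by the same model.
original = [True] * 10 + [False] * 2
adapted = [True] * 2 + [False] * 8 + [True, False]
b, c, p_value = mcnemar_exact(original, adapted)
print(b, c, p_value)
```

Only the discordant pairs (problems answered correctly in one version but not the other) enter the statistic, which is why the test suits a setup where the same problems are evaluated before and after adaptation.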
During this process, we encounter a small inconsistency
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1). Years: 2025 (1). Verdicts: CONDITIONAL (1).
Representative citing papers:
Lost in Cultural Translation: Do LLMs Struggle with Math Across Cultural Contexts?