SARA aligns internal routing distributions in MoE layers to high-resource semantic anchors via symmetric JS divergence, improving low-resource language performance by 0.8-1.2% over standard instruction tuning on Global-MMLU.
arXiv preprint arXiv:2603.10351
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
LLM safety judges resist adjusting evaluations when given contradictory context or new safety definitions, despite some ability to learn from new information.
citing papers explorer
- SARA: Unlocking Multilingual Knowledge in Mixture-of-Experts via Semantically Anchored Routing Alignment
  SARA aligns internal routing distributions in MoE layers to high-resource semantic anchors via symmetric JS divergence, improving low-resource language performance by 0.8-1.2% over standard instruction tuning on Global-MMLU.
- Safety is Contextual, LLM-Judges Are Not: Navigating the Rigid Priors of Evaluators
  LLM safety judges resist adjusting evaluations when given contradictory context or new safety definitions, despite some ability to learn from new information.