2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CL (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers
- Multi-Legal-Bench: Evaluating LLMs on Legal Reasoning Across Jurisdictions, Languages, and Legal Traditions
  Multi-Legal-Bench creates a sparse 5x6 task-jurisdiction matrix across six countries and reports that few-shot effects replicate, no model dominates, cross-lingual transfer tracks label alignment more than language family, and tokenizer fertility does not predict accuracy.
- The Tokenizer Tax Across 25 European Languages: Domain Invariance, Cross-Lingual Few-Shot Effects, and the Ukrainian Penalty
  Tokenizer fertility varies 2.5x across 25 European languages with domain-invariant rankings, morphological fragmentation in high-fertility cases, and a Ukrainian penalty from pre-training underrepresentation.