An Empirical Study on Cross-lingual Vocabulary Adaptation for Efficient Language Model Inference

Yamaguchi, Atsuki, Villavicencio, Aline, Aletras, Nikolaos · 2024 · DOI 10.18653/v1/2024.findings-emnlp.396

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

TokAlign++: Advancing Vocabulary Adaptation via Better Token Alignment

cs.CL · 2026-05-13 · unverdicted · novelty 7.0

TokAlign++ learns token alignments between LLM vocabularies from monolingual representations to enable faster adaptation, better text compression, and effective token-level distillation across 15 languages with minimal steps.

Multilinguality of Large Language Models From a Structural Perspective

cs.CL · 2026-06-01 · unverdicted · novelty 6.0

Low-resource languages are structurally more different from English in LLMs than high- or mid-resource ones, and language-specific post-training alters structures while preserving inter-language relationships.

citing papers explorer

Showing 2 of 2 citing papers.

TokAlign++: Advancing Vocabulary Adaptation via Better Token Alignment · cs.CL · 2026-05-13 · unverdicted · none · ref 105
Multilinguality of Large Language Models From a Structural Perspective · cs.CL · 2026-06-01 · unverdicted · none · ref 38
