Revisiting multilingual data mixtures in language model pretraining, 2025

Negar Foroutan, Paul Teiletche, Ayush Kumar Tarun, Antoine Bosselut · 2025 · arXiv 2510.25947

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Soft Token Alignment for Cross-Lingual Reasoning

cs.CL · 2026-06-25 · unverdicted · novelty 6.0

SOLAR aligns soft-token probability mixtures across languages in embedding space during SFT and raises multilingual reasoning accuracy by up to 17.7 points over the base model.

On the Limits of Model Merging for Multilinguality in Pre-Training

cs.CL · 2026-05-25 · unverdicted · novelty 5.0

Merging any combination of monolingual pre-trained models leads to performance collapse due to interference, indicating that merging flexibility from fine-tuning does not extend to pre-training.

Cross-Lingual Sentiment Misalignment: Auditing Multilingual Language Models for Inversion Risk, Dialectal Representation, and Affective Stability

cs.CL · 2026-02-19 · unverdicted · novelty 5.0

Multilingual models invert sentiment polarity 28.7% of the time on Bengali text and show asymmetric affective weighting plus a 57% rise in error on formal dialect compared with colloquial Bengali.

citing papers explorer

Showing 3 of 3 citing papers after filters.

Soft Token Alignment for Cross-Lingual Reasoning
cs.CL · 2026-06-25 · unverdicted · none · ref 11

On the Limits of Model Merging for Multilinguality in Pre-Training
cs.CL · 2026-05-25 · unverdicted · none · ref 16

Cross-Lingual Sentiment Misalignment: Auditing Multilingual Language Models for Inversion Risk, Dialectal Representation, and Affective Stability
cs.CL · 2026-02-19 · unverdicted · none · ref 6
