In: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
9 Pith papers cite this work.
citation roles: background 1
citation polarities: background 1
citing papers explorer
- One Model to Translate Them All? A Journey to Mount Doom for Multilingual Model Merging
  Merging fine-tuned models for multilingual translation fails because fine-tuning redistributes language-specific neurons rather than sharpening them, increasing representational divergence in output-generating layers.
- Are Multilingual Models Actually Improving? Isolating True Cross-Lingual Transfer
  HAT Score analysis of 20 models on 3 benchmarks finds transfer functional in small models, slower-than-expected gains with scale, and clear progress over time.
- Shared Semantics, Divergent Mechanisms: Unsupervised Feature Discovery by Aligning Semantics and Mechanisms
  Proposes distribution-level unsupervised feature discovery for LLMs by clustering continuations on semantic content and mechanistic attributions without target outputs.
- Multilinguality of Large Language Models From a Structural Perspective
  Low-resource languages are structurally more different from English in LLMs than high- or mid-resource ones, and language-specific post-training alters structures while preserving inter-language relationships.
- Do Language Models Encode Knowledge of Linguistic Constraint Violations?
  Sparse autoencoder features in language models do not satisfy joint falsification criteria for unified grammatical violation detectors across linguistic phenomena.
- Improving Cross-Lingual Factual Recall via Consistency-Driven Reinforcement Learning
  GRPO reinforcement learning on the new PolyFact dataset outperforms SFT and CPT for cross-lingual factual consistency in Qwen-2.5-7B and OLMo-2-7B by reducing language specialization in MLP and attention layers.
- From Heads to Neurons: Causal Attribution and Steering in Multi-Task Vision-Language Models
  HONES ranks feed-forward neurons by their causal contributions from task-relevant attention heads and uses lightweight scaling to steer performance on multiple vision-language tasks.
- Opportunities and Challenges of Large Language Models for Low-Resource Languages in Humanities Research
  This survey paper identifies opportunities for LLMs in low-resource language humanities research along with challenges in data accessibility, model adaptability, and cultural sensitivity.
- Copy First, Translate Later: Interpreting Translation Dynamics in Multilingual Pretraining