Training-language dominance, not English inherent properties, determines brain-LLM alignment across English, Chinese, and French, with additional independent effects from typological distance concentrated in syntactic brain regions.
hub
How Multilingual is Multilingual BERT? , booktitle =
15 Pith papers cite this work, alongside 569 external citations. Polarity classification is still indexing.
hub tools
citation-role summary
citation-polarity summary
fields
cs.CL 15roles
background 2polarities
background 2representative citing papers
A framework with TOPPing source selection and VACAI-Bowl dual-branch model yields 54.62% average improvement in dependency parsing across 10 low-resource varieties.
Parallel-SFT mixes parallel programs across languages during SFT to produce more transferable RL initializations, yielding better zero-shot generalization to unseen programming languages.
M3-Embedding is a single model for multi-lingual, multi-functional, and multi-granular text embeddings trained via self-knowledge distillation that achieves new state-of-the-art results on multilingual, cross-lingual, and long-document retrieval benchmarks.
OPT releases open decoder-only transformers up to 175B parameters that match GPT-3 performance at one-seventh the carbon cost, along with code and training logs.
HAT Score analysis of 20 models on 3 benchmarks finds transfer functional in small models, slower-than-expected gains with scale, and clear progress over time.
CRAFT is a Pareto-front prompt optimizer that allocates scarce LLM validation calls to candidates near the current front using accuracy- and cost-oriented generators plus NSGA-II retention.
Broad empirical evaluation finds that fine-tuning heuristics for source-language choice in cross-lingual transfer do not hold reliably under in-context learning.
Biaffine LSTM outperforms transformer parsers like AfroXLMR and RemBERT in low-resource dependency parsing, with transformers gaining advantage as data increases and morphological complexity as a secondary predictor.
Bengali sentiment analysis models exhibit persistent identity-based biases across datasets and developer backgrounds despite similar semantic content.
A survey that taxonomizes motivations for transliteration in cross-lingual NLP, reviews incorporation approaches and their evolution, analyzes trade-offs in settings like code-mixing and language families, and offers implementation recommendations.
A feature-rich regression model using multilingual embeddings and features for frequency, cognate similarity, and predictability reports RMSE scores of 1.132, 1.037, and 0.891 for L1-aware vocabulary difficulty prediction on Spanish, German, and Chinese.
A heterogeneous ensemble of XLM-RoBERTa-large and mDeBERTa-v3-base with independent task modeling and class weighting is reported as effective for multilingual, multicultural, and multievent online polarization detection.
This survey paper identifies opportunities for LLMs in low-resource language humanities research along with challenges in data accessibility, model adaptability, and cultural sensitivity.
citing papers explorer
-
Brain-LLM Alignment Tracks Training Data, Not Typology
Training-language dominance, not English inherent properties, determines brain-LLM alignment across English, Chinese, and French, with additional independent effects from typological distance concentrated in syntactic brain regions.
-
Harnessing Linguistic Dissimilarity for Language Generalization on Unseen Low-Resource Varieties
A framework with TOPPing source selection and VACAI-Bowl dual-branch model yields 54.62% average improvement in dependency parsing across 10 low-resource varieties.
-
Parallel-SFT: Improving Zero-Shot Cross-Programming-Language Transfer for Code RL
Parallel-SFT mixes parallel programs across languages during SFT to produce more transferable RL initializations, yielding better zero-shot generalization to unseen programming languages.
-
M3-Embedding: Multi-Linguality, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation
M3-Embedding is a single model for multi-lingual, multi-functional, and multi-granular text embeddings trained via self-knowledge distillation that achieves new state-of-the-art results on multilingual, cross-lingual, and long-document retrieval benchmarks.
-
OPT: Open Pre-trained Transformer Language Models
OPT releases open decoder-only transformers up to 175B parameters that match GPT-3 performance at one-seventh the carbon cost, along with code and training logs.
-
Are Multilingual Models Actually Improving? Isolating True Cross-Lingual Transfer
HAT Score analysis of 20 models on 3 benchmarks finds transfer functional in small models, slower-than-expected gains with scale, and clear progress over time.
-
CRAFT: Cost-aware Refinement And Front-aware Tuning of Prompts
CRAFT is a Pareto-front prompt optimizer that allocates scarce LLM validation calls to candidates near the current front using accuracy- and cost-oriented generators plus NSGA-II retention.
-
When English Isn't the Best Teacher: Source Language Effects in Cross-Lingual In-Context Learning
Broad empirical evaluation finds that fine-tuning heuristics for source-language choice in cross-lingual transfer do not hold reliably under in-context learning.
-
Dependency Parsing Across the Resource Spectrum: Evaluating Architectures on High and Low-Resource Languages
Biaffine LSTM outperforms transformer parsers like AfroXLMR and RemBERT in low-resource dependency parsing, with transformers gaining advantage as data increases and morphological complexity as a secondary predictor.
-
How do datasets, developers, and models affect biases in a low-resourced language?: The Case of the Bengali Language
Bengali sentiment analysis models exhibit persistent identity-based biases across datasets and developer backgrounds despite similar semantic content.
-
Scripts Through Time: A Survey of the Evolving Role of Transliteration in NLP
A survey that taxonomizes motivations for transliteration in cross-lingual NLP, reviews incorporation approaches and their evolution, analyzes trade-offs in settings like code-mixing and language families, and offers implementation recommendations.
-
UOL@IDEM at BEA 2026 Shared Task 1: Neural Fusion and Feature-Rich Modeling for L1-Aware Vocabulary Difficulty Prediction
A feature-rich regression model using multilingual embeddings and features for frequency, cognate similarity, and predictability reports RMSE scores of 1.132, 1.037, and 0.891 for L1-aware vocabulary difficulty prediction on Spanish, German, and Chinese.
-
YEZE at SemEval-2026 Task 9: Detecting Multilingual, Multicultural and Multievent Online Polarization via Heterogeneous Ensembling
A heterogeneous ensemble of XLM-RoBERTa-large and mDeBERTa-v3-base with independent task modeling and class weighting is reported as effective for multilingual, multicultural, and multievent online polarization detection.
-
Opportunities and Challenges of Large Language Models for Low-Resource Languages in Humanities Research
This survey paper identifies opportunities for LLMs in low-resource language humanities research along with challenges in data accessibility, model adaptability, and cultural sensitivity.
- Exploring Language-Agnosticity in Function Vectors: A Case Study in Machine Translation