arxiv: 2605.12281 · v1 · submitted 2026-05-12 · 💻 cs.CL · cs.LG

What makes a word hard to learn? Modeling L1 influence on English vocabulary difficulty

Jonas Mayer Martins, Zhuojing Huang, Aaricia Herygers, Lisa Beinborn

Pith reviewed 2026-05-13 04:57 UTC · model grok-4.3

keywords: vocabulary difficulty, L1 influence, English learners, gradient boosting, SHAP values, orthographic transfer, cross-linguistic transfer, word familiarity

The pith

Word familiarity is the main driver of English vocabulary difficulty for learners whose first language is Spanish, German, or Chinese, with orthographic transfer adding explanatory power only for the first two groups.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper models how hard individual English words are to learn for speakers of three different first languages. It trains gradient-boosted models on word features grouped into familiarity, meaning, surface form, and cross-linguistic transfer, then uses Shapley values to rank the groups. Familiarity emerges as the strongest shared predictor across all learners. Spanish- and German-speaking learners gain additional signal from orthographic overlap with their native language, while difficulty for Chinese-speaking learners is shaped by familiarity and surface-form cues alone. The resulting L1-specific predictions can support more targeted vocabulary selection in language courses.

Core claim

Gradient-boosted models trained on familiarity, meaning, surface-form, and cross-linguistic-transfer features and interpreted with Shapley values establish that word familiarity is the dominant feature group for vocabulary difficulty in all three learner populations. Spanish and German learners additionally depend on orthographic transfer, a mechanism unavailable to Chinese learners whose difficulty is instead shaped by familiarity combined with surface features.

What carries the argument

Gradient-boosted regression models whose predictions are decomposed with Shapley additive explanations to measure the contribution of four feature groups: familiarity, meaning, surface form, and cross-linguistic transfer.
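
As a rough illustration of what a group-level Shapley decomposition does, the sketch below computes exact Shapley values for four feature groups by enumerating all orderings. Everything in it is an assumption made for illustration: `predict` is a hypothetical stand-in for a fitted model (it is not the paper's gradient-boosted model), the group scores and the all-zero baseline are invented, and brute-force enumeration is used in place of the tree-specific SHAP algorithm a real pipeline would normally rely on.

```python
from itertools import permutations

GROUPS = ("familiarity", "meaning", "surface", "transfer")

def predict(x):
    # Hypothetical difficulty model over group-level scores (illustration only).
    return (3.0 - 0.5 * x["familiarity"] - 0.2 * x["meaning"]
            + 0.1 * x["surface"] - 0.3 * x["transfer"]
            - 0.1 * x["familiarity"] * x["transfer"])  # one interaction term

def group_shapley(x, baseline):
    """Average each group's marginal contribution over all group orderings."""
    phi = {g: 0.0 for g in GROUPS}
    orders = list(permutations(GROUPS))
    for order in orders:
        current = dict(baseline)      # start with every group at its baseline
        previous = predict(current)
        for g in order:
            current[g] = x[g]         # reveal this group's actual value
            updated = predict(current)
            phi[g] += (updated - previous) / len(orders)
            previous = updated
    return phi

def importance_shares(phi):
    """Share of total absolute attribution carried by each group."""
    total = sum(abs(v) for v in phi.values())
    return {g: abs(v) / total for g, v in phi.items()}

# Invented scores for a single word, and an assumed all-zero baseline.
word = {"familiarity": 2.0, "meaning": 1.0, "surface": 1.0, "transfer": 1.0}
baseline = {g: 0.0 for g in GROUPS}

phi = group_shapley(word, baseline)
shares = importance_shares(phi)
```

The attributions sum to the difference between the prediction for the word and the prediction at the baseline, which is the property that lets per-item shares of the kind shown in the paper's figures be read as a decomposition of each prediction.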

If this is right

L1-tailored difficulty estimates can be used directly to select and sequence vocabulary in language curricula.
Teaching materials for Spanish- and German-speaking learners should exploit orthographic similarities where they exist.
Materials for Chinese-speaking learners should instead emphasize surface-form properties such as length and spelling regularity.
The same modeling approach can generate difficulty scores for any new English word without requiring fresh learner data.
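
Orthographic overlap of the kind these points rely on can be quantified in several ways; the paper's figure captions mention a character-level cosine similarity and a normalized edit distance. The sketch below shows one plausible version of each. The choice of padded character bigrams for the cosine measure, the normalization by the longer word, and the example word pairs are assumptions, not the paper's documented definitions.

```python
from collections import Counter
from math import sqrt

def edit_distance(a, b):
    """Levenshtein distance via the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def normalized_edit_distance(a, b):
    """Edit distance divided by the longer word's length (assumed normalization)."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0

def char_cosine(a, b, n=2):
    """Cosine similarity between character n-gram count vectors."""
    def grams(word):
        padded = f"#{word}#"  # boundary markers so word edges count too
        return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))
    ga, gb = grams(a), grams(b)
    dot = sum(ga[g] * gb[g] for g in ga)
    norm = sqrt(sum(v * v for v in ga.values())) * sqrt(sum(v * v for v in gb.values()))
    return dot / norm if norm else 0.0

# Illustrative English/L1 pairs (lowercased): Spanish "familia", German "haus".
spanish_distance = normalized_edit_distance("family", "familia")
german_distance = normalized_edit_distance("house", "haus")
german_cosine = char_cosine("house", "haus")
```

Both measures operate on written forms only, so they say nothing about pronunciation or meaning; a false friend would score as highly similar.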

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

If orthographic transfer is confirmed as the differentiating mechanism, language apps could automatically highlight cognate forms for learners whose L1 is a Romance or Germanic language but skip that cue for Chinese-speaking learners.
The surface-feature reliance observed for Chinese speakers suggests that explicit instruction on English spelling patterns may yield larger gains for this group than for the others.

Load-bearing premise

The selected feature groups and the Shapley analysis of gradient-boosted models are sufficient to identify the true factors that drive L1-influenced vocabulary difficulty.

What would settle it

A replication that collects new difficulty ratings from the same learner groups and finds that adding unmodeled variables such as semantic neighborhood density or individual learner exposure history reverses the ranking of familiarity versus orthographic transfer for Spanish or German speakers.

Figures

Figures reproduced from arXiv: 2605.12281 by Aaricia Herygers, Jonas Mayer Martins, Lisa Beinborn, Zhuojing Huang.

**Figure 2.** Predicted versus gold-label lexical difficulty.

**Figure 3.** Per-item feature-group-importance shares, sorted by decreasing importance of familiarity (left to right).

**Figure 4.** Each item projected onto a triangle according to the relative importance of three feature groups (familiarity, …

**Figure 5.** Cross-L1 evaluation. Each colored bar shows …

**Figure 6.** Screenshot of our interactive demo. Words of the input text are highlighted according to their lexical difficulty. Clicking on a word opens a panel that shows the gold-label and predicted difficulty as well as the feature-group importance. The L1 background can be switched.

Logarithmic frequency. Log-transformed word frequency on the Zipf scale (Van Heuven et al., 2014), $f_{\mathrm{Zipf}} = \log_{10} f_{\mathrm{pmw}} + 3$ (Eq. 1), which mo…

**Figure 7.** Pairwise Spearman correlation between nu…

**Figure 9.** Pairwise correlation of gold-label lexical difficulty across L1s. Each point is one English word tested …

**Figure 10.** Character similarity by word frequency for Spanish and German, colored by (a) lexical difficulty and the …

**Figure 11.** Prediction-error distribution (gold − predicted) by POS competition status across L1 groups (Spanish, German, Chinese). The violin plots compare items without POS competition ($n_{\mathrm{no}} = 585$) versus items with POS competition ($n_{\mathrm{yes}} = 163$) per language, with horizontal lines marking the median.

Edit distance. In addition to character-level cosine similarity, we evaluated character edit distance as a measure o…

**Figure 12.** Normalized edit distance between English …

**Figure 13.** Frequency of L1 (Spanish) source words versus gold-label difficulty (top) and L2 (English) word frequency (bottom).
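
The Zipf-scale definition quoted in the text accompanying Figure 6 is easy to check numerically. The following sketch applies it directly; the function name and the raw-count interface are assumptions for illustration, and no smoothing is applied, so a zero count raises an error.

```python
from math import log10

def zipf_frequency(count, corpus_tokens):
    """Zipf value: log10 of the frequency per million words, plus 3."""
    per_million = count / corpus_tokens * 1_000_000
    return log10(per_million) + 3  # log10(0) raises ValueError for unseen words

# A word seen once per million tokens sits at Zipf 3;
# one seen a thousand times per million sits at Zipf 6.
rare = zipf_frequency(1, 1_000_000)
common = zipf_frequency(1_000, 1_000_000)
```

Because the scale is logarithmic, each additional Zipf unit corresponds to a tenfold increase in frequency.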

Original abstract

What makes a word difficult to learn, and how does the difficulty depend on the learner's native language? We computationally model vocabulary difficulty for English learners whose first language is Spanish, German, or Chinese with gradient-boosted models trained on features related to a word's familiarity (e.g., frequency), meaning, surface form, and cross-linguistic transfer. Using Shapley values, we determine the importance of each feature group. Word familiarity is the dominant feature group shared by all three languages. However, predictions for Spanish- and German-speaking learners rely additionally on orthographic transfer. This transfer mechanism is unavailable to Chinese learners, whose difficulty is shaped by a combination of familiarity and surface features alone. Our models provide interpretable, L1-tailored difficulty estimates that can be used to design vocabulary curricula.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Familiarity dominates English word difficulty for all three L1s, with orthographic transfer adding explanatory power only for Spanish and German learners via SHAP on gradient-boosted models.

The main takeaway is that word familiarity is the strongest feature group for predicting English vocabulary difficulty no matter whether the learner's L1 is Spanish, German, or Chinese. Orthographic transfer from the L1 supplies additional signal for the first two groups but drops out for Chinese learners, whose difficulty instead tracks surface features more closely. The authors reach this by training gradient-boosted models on grouped features and then using SHAP to rank the groups per L1. The pattern lines up with basic facts about script overlap, so the interpretation feels grounded rather than forced.

The work does a clean job of turning an established ML pipeline into L1-specific difficulty estimates that could feed directly into teaching apps or curriculum design. Grouping the features into familiarity, meaning, surface form, and cross-linguistic transfer keeps the attributions readable and avoids drowning in single-variable noise. That is useful applied progress even if the underlying techniques are not new.

The soft spots are mostly about missing details. The abstract supplies no accuracy numbers, no dataset size, and no validation scheme, so it is impossible to tell how much the models actually explain or whether the SHAP rankings are stable. The stress-test concern about intercorrelations between groups (frequency and orthographic similarity scores, for example) is reasonable; SHAP can hand shared variance to whichever group the model happens to favor, and without an ablation or correlation matrix the group-level claims rest on an untested assumption. The citation pattern is standard and appropriate for the area.

This paper is for people working on educational NLP or modeling second-language acquisition. A reader who wants concrete, L1-aware difficulty scores for vocabulary selection will get something usable out of it. It is not a foundational theoretical piece, but the empirical comparison across typologically different L1s is worth referee time. I would send it for peer review rather than desk-reject; the core setup is honest and the practical angle is clear, even if the authors need to add performance metrics and robustness checks in revision.

Referee Report

2 major / 2 minor

Summary. The paper computationally models the difficulty of English words for learners with L1 Spanish, German, or Chinese using gradient-boosted models. Features are grouped into familiarity (e.g., frequency), meaning, surface form, and cross-linguistic transfer. SHAP values are used to assess the importance of each group. The main result is that familiarity is the dominant factor for all L1s, with additional reliance on orthographic transfer for Spanish and German learners, while Chinese learners' difficulty is determined by familiarity and surface features. The models aim to provide L1-specific difficulty estimates for curriculum design.

Significance. If the SHAP-based attributions hold after accounting for potential feature dependencies, this work would contribute meaningfully to understanding L1 effects on vocabulary learning by providing a data-driven, interpretable framework that differentiates between alphabetic and logographic L1 influences. It builds on standard ML techniques in NLP but applies them to a practical educational question, with potential applications in adaptive language learning systems. The explicit comparison across three L1s strengthens the cross-linguistic aspect.

major comments (2)

[Feature importance attribution] The key finding that orthographic transfer is important only for Spanish and German (but not Chinese) depends on the stability of SHAP group-level attributions. However, the paper does not report correlations between feature groups (e.g., between familiarity features like log-frequency and transfer features like orthographic similarity). If such correlations exist, SHAP may misallocate importance, weakening the claim of distinct L1 mechanisms. An ablation study removing transfer features and comparing model performance or SHAP changes across L1s would strengthen this.
[Model training and evaluation] The abstract and summary provide no details on model performance (e.g., R², accuracy on held-out data), dataset size, or validation methods. Without these, it is difficult to gauge whether the gradient-boosted models are reliable enough to support the SHAP interpretations and the central claims about feature group importance.
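
The dependency check requested in the first major comment could look roughly like the sketch below, which computes pairwise Spearman correlations between features and flags strongly related pairs. The feature names, the six toy values per feature, and the 0.7 flagging threshold are all invented for illustration and are not taken from the paper.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Hypothetical per-word feature values (one entry per word).
features = {
    "zipf_frequency": [6.1, 5.2, 4.8, 3.9, 3.1, 2.5],
    "word_length": [3, 5, 6, 8, 9, 11],
    "orthographic_similarity": [0.2, 0.9, 0.4, 0.8, 0.1, 0.6],
}
correlations = {
    (a, b): spearman(features[a], features[b])
    for a, b in combinations(features, 2)
}
# Pairs above the (assumed) threshold are candidates for shared attribution.
flagged = [pair for pair, rho in correlations.items() if abs(rho) > 0.7]
```

A flagged pair that spans two feature groups is the situation the comment warns about: the model may credit either group for variance that both could explain.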

minor comments (2)

Ensure all acronyms are defined on first use, such as SHAP.
[Figure 1] The SHAP summary plots could benefit from clearer labeling of the four feature groups to aid reader interpretation.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful and constructive feedback, which has helped us identify areas to strengthen the manuscript. We address each major comment below and will incorporate revisions to improve the robustness and transparency of our analyses.

Referee: [Feature importance attribution] The key finding that orthographic transfer is important only for Spanish and German (but not Chinese) depends on the stability of SHAP group-level attributions. However, the paper does not report correlations between feature groups (e.g., between familiarity features like log-frequency and transfer features like orthographic similarity). If such correlations exist, SHAP may misallocate importance, weakening the claim of distinct L1 mechanisms. An ablation study removing transfer features and comparing model performance or SHAP changes across L1s would strengthen this.

Authors: We agree that unreported correlations between feature groups could potentially influence SHAP attributions and that an ablation analysis would provide stronger evidence for L1-specific mechanisms. In the revised manuscript, we will compute and report Pearson correlations between all feature groups (familiarity, meaning, surface form, and cross-linguistic transfer) separately for each L1. We will also conduct an ablation study by retraining the gradient-boosted models without the transfer features, then compare changes in overall model performance (R² on held-out data) and shifts in SHAP values for the remaining groups across the Spanish, German, and Chinese cohorts. These additions will directly address concerns about feature dependencies and the stability of our key claims. revision: yes
Referee: [Model training and evaluation] The abstract and summary provide no details on model performance (e.g., R², accuracy on held-out data), dataset size, or validation methods. Without these, it is difficult to gauge whether the gradient-boosted models are reliable enough to support the SHAP interpretations and the central claims about feature group importance.

Authors: We concur that explicit reporting of model performance, dataset characteristics, and validation procedures is necessary to support the reliability of the SHAP-based conclusions. In the revised manuscript, we will update the abstract to include summary performance metrics (e.g., mean R² on held-out test sets) and add a new subsection in the Methods detailing the dataset sizes (number of words rated per L1 and total learner responses), the train/validation/test split ratios, the cross-validation strategy employed, and the hyperparameter optimization process for the gradient-boosted models. These details will enable readers to assess the models' predictive validity and the robustness of the feature importance attributions. revision: yes
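
The ablation described in the first response can be sketched as follows: fit a model with and without the transfer feature and compare held-out R². To keep the example dependency-free, ordinary least squares is swapped in for the gradient-boosted models, and the data are synthetic; the coefficients, noise level, sample sizes, and the simple 150/50 split are all assumptions, not values from the paper.

```python
import random

def fit_ols(X, y):
    """Least-squares weights (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(k):
            if r != c:
                A[r] = [vr - A[r][c] * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

def predict_ols(w, X):
    return [w[0] + sum(wi * xi for wi, xi in zip(w[1:], r)) for r in X]

def r_squared(y, yhat):
    mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, yhat))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1 - ss_res / ss_tot

def make_words(n, transfer_weight, rng):
    """Synthetic words: columns are familiarity, surface, transfer (all invented)."""
    X, y = [], []
    for _ in range(n):
        fam, surf, transfer = rng.uniform(1, 7), rng.uniform(0, 1), rng.uniform(0, 1)
        X.append([fam, surf, transfer])
        y.append(5 - 0.6 * fam + 0.5 * surf
                 - transfer_weight * transfer + rng.gauss(0, 0.1))
    return X, y

def ablation_drop(X, y, drop_col, n_train):
    """Held-out R^2 of the full model minus that of the model without drop_col."""
    Xtr, ytr, Xte, yte = X[:n_train], y[:n_train], X[n_train:], y[n_train:]
    strip = lambda rows: [[v for j, v in enumerate(r) if j != drop_col] for r in rows]
    full = r_squared(yte, predict_ols(fit_ols(Xtr, ytr), Xte))
    ablated = r_squared(yte, predict_ols(fit_ols(strip(Xtr), ytr), strip(Xte)))
    return full - ablated

rng = random.Random(0)
# One synthetic population where transfer matters, one where it does not.
drop_with_transfer = ablation_drop(*make_words(200, 1.5, rng), drop_col=2, n_train=150)
drop_without_transfer = ablation_drop(*make_words(200, 0.0, rng), drop_col=2, n_train=150)
```

A clear drop in held-out R² when the transfer column is removed, alongside a negligible change in the population where transfer carries no signal, is the pattern that would corroborate the L1-specific reading of the attributions.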

Circularity Check

0 steps flagged

No circularity: purely empirical modeling with post-hoc SHAP attributions

The paper trains gradient-boosted regression models on a set of hand-crafted linguistic features (familiarity, meaning, surface form, cross-linguistic transfer) to predict vocabulary difficulty ratings for three L1 groups, then applies the established SHAP method to compute feature-group importances. No equations, derivations, or first-principles claims are present; the reported dominance of familiarity and the L1-specific role of orthographic transfer are direct outputs of the fitted models and their explanations rather than inputs restated by construction. No self-citations are load-bearing, no parameters are fitted on a subset and then relabeled as predictions, and no ansatz or uniqueness theorem is smuggled in. The analysis is therefore self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The modeling approach rests on standard machine learning assumptions and feature engineering choices whose details are absent from the abstract; no new entities are postulated.

free parameters (2)

Gradient boosting hyperparameters
Typical tunable parameters such as learning rate and tree count are fitted to data but not specified in the abstract.
Feature grouping thresholds
Decisions on how to bundle raw word properties into familiarity, surface, and transfer groups are not detailed.

axioms (2)

Domain assumption: SHAP values accurately attribute the contribution of each feature group to model predictions.
Implicit when using Shapley values to rank feature importance.
Domain assumption: The chosen word features adequately represent the linguistic influences on vocabulary difficulty.
Foundation for training the models and interpreting results.

Reference graph

Works this paper leans on

58 extracted references · 58 canonical work pages

[1] Lisa Beinborn, Torsten Zesch, and Iryna Gurevych. 2014. https://doi.org/10.1075/itl.165.2.02bei Readability for foreign language learning: The importance of cognates. ITL - International Journal of Applied Linguistics, 165(2):136–162.
[2] Lisa Beinborn, Torsten Zesch, and Iryna Gurevych. 2016. https://doi.org/10.18653/v1/W16-0508 Predicting the spelling difficulty of words for language learners. In Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications, pages 73–83, San Diego, CA, USA. Association for Computational Linguistics.
[3] Marsha Bensoussan and Batia Laufer. 1984. https://doi.org/10.1111/j.1467-9817.1984.tb00252.x Lexical guessing in context in EFL reading comprehension. Journal of Research in Reading, 7(1):15–32.
[4] Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2017. https://doi.org/10.1162/tacl_a_00051 Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 5:135–146.
[5] Roger Brown and David McNeill. 1966. https://doi.org/10.1016/S0022-5371(66)80040-3 The "tip of the tongue" phenomenon. Journal of Verbal Learning and Verbal Behavior, 5(4):325–337.
[6] Bram Bulté, Alex Housen, and Gabriele Pallotti. 2025. https://doi.org/10.1111/lang.12669 Complexity and difficulty in second language acquisition: A theoretical and methodological overview. Language Learning, 75(2):533–574.
[7] Brent Culligan. 2015. https://doi.org/10.1177/0265532215572268 A comparison of three test formats to assess word difficulty. Language Testing, 32(4):503–520.
[8] Mihai Dascalu, Danielle McNamara, Scott Crossley, and Stefan Trausan-Matu. 2016. https://doi.org/10.1609/aaai.v30i1.10372 Age of exposure: A model of word learning. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 30, Phoenix, AZ, USA. AAAI Press.
[9] Paul De Boeck. 2008. https://doi.org/10.1007/s11336-008-9092-x Random item IRT models. Psychometrika, 73(4):533–559.
[10] Annette M. B. De Groot and Rineke Keijzer. 2000. https://doi.org/10.1111/0023-8333.00110 What is hard to learn is easy to forget: The roles of word concreteness, cognate status, and word frequency in foreign-language vocabulary learning and forgetting. Language Learning, 50(1):1–56.
[11] Karen J. Dunn. 2024. https://doi.org/10.1016/j.rmal.2024.100143 Random-item Rasch models and explanatory extensions: A worked example using L2 vocabulary test item responses. Research Methods in Applied Linguistics, 3(3):100143.
[12] Luise Dürlich and Thomas François. 2018. https://aclanthology.org/L18-1140/ EFLLex: A graded lexical resource for learners of English as a foreign language. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan. European Language Resources Association (ELRA).
[13] Nick C. Ellis. 2002. https://doi.org/10.1017/S0272263102002024 Frequency effects in language processing: A review with implications for theories of implicit and explicit language acquisition. Studies in Second Language Acquisition, 24(2):143–188.
[14] Nick C. Ellis and Alan Beaton. 1993. https://doi.org/10.1111/j.1467-1770.1993.tb00627.x Psycholinguistic determinants of foreign language vocabulary learning. Language Learning, 43(4):559–617.
[15] Europarat, editor. 2011. https://www.coe.int/lang-cefr Common European framework of reference for languages: Learning, teaching, assessment, 12th edition. Cambridge University Press, Cambridge, UK.
[16] Mariano Felice and Lucy Skidmore. 2026. Shared task on vocabulary difficulty prediction for English learners. In Proceedings of the 21st Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2026), San Diego, CA, USA. Association for Computational Linguistics.
[17] Christiane Fellbaum, editor. 1998. https://doi.org/10.7551/mitpress/7287.001.0001 WordNet: An Electronic Lexical Database, 1st edition. The MIT Press, Cambridge, MA, USA.
[18] Pierre Finnimore, Elisabeth Fritzsch, Daniel King, Alison Sneyd, Aneeq Ur Rehman, Fernando Alva-Manchego, and Andreas Vlachos. 2019. https://doi.org/10.18653/v1/N19-1102 Strong baselines for complex word identification across multiple languages. In Proceedings of the 2019 Conference of the North, pages 970–977, Minneapolis, MN, USA. Association for Co...
[19] Wolfgang Härdle. 1990. https://doi.org/10.1017/CCOL0521382483 Applied Nonparametric Regression, 1st edition. Cambridge University Press, Cambridge, UK.
[20] Yusuke Ide, Masato Mita, Adam Nohejl, Hiroki Ouchi, and Taro Watanabe. 2023. https://doi.org/10.18653/v1/2023.bea-1.40 Japanese lexical complexity for non-native readers: A new dataset. In Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023), pages 477–487, Toronto, Canada. Association for Computat...
[21] Lori E. James and Deborah M. Burke. 2000. https://doi.org/10.1037/0278-7393.26.6.1378 Phonological priming effects on word retrieval and tip-of-the-tongue experiences in young and older adults. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26(6):1378–1391.
[22] Victor Kuperman, Hans Stadthagen-Gonzalez, and Marc Brysbaert. 2012. https://doi.org/10.3758/s13428-012-0210-4 Age-of-acquisition ratings for 30,000 English words. Behavior Research Methods, 44(4):978–990.
[23] Batia Laufer and Zahava Goldstein. 2004. https://doi.org/10.1111/j.0023-8333.2004.00260.x Testing vocabulary knowledge: Size, strength, and computer adaptiveness. Language Learning, 54(3):399–436.
[24] John Lee and Chak Yan Yeung. 2018a. https://doi.org/10.1109/ICNLSP.2018.8374392 Automatic prediction of vocabulary knowledge for learners of Chinese as a foreign language. In 2018 2nd International Conference on Natural Language and Speech Processing (ICNLSP), pages 1–4, Algiers. IEEE.
[25] John Lee and Chak Yan Yeung. 2018b. https://aclanthology.org/C18-1019/ Personalizing lexical simplification. In Proceedings of the 27th International Conference on Computational Linguistics, pages 224–232, Santa Fe, NM, USA. Association for Computational Linguistics.
[26] Scott M. Lundberg, Gabriel G. Erion, and Su-In Lee. 2018. https://doi.org/10.48550/arXiv.1802.03888 Consistent individualized feature attribution for tree ensembles. arXiv preprint.
[27] Scott M. Lundberg and Su-In Lee. 2017. https://dl.acm.org/doi/10.5555/3295222.3295230 A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems, pages 4768–4777, Long Beach, CA, USA. Curran Associates Inc.
[28] George A. Miller. 1995. https://doi.org/10.1145/219717.219748 WordNet: A lexical database for English. Communications of the ACM, 38(11):39–41.
[29] Èlizbar A. Nadaraya. 1964. https://doi.org/10.1137/1109020 On estimating regression. Theory of Probability and Its Applications, 9(1):141–142.
[30] Ian Stephen Paul Nation. 2000. https://doi.org/10.1017/CBO9781139524759 Learning Vocabulary in Another Language, 1st edition. Cambridge University Press, Cambridge, UK.
[31] Masashi Negishi, Tomoko Takada, and Yukio Tono. 2013. https://aclanthology.org/2016.jeptalnrecital-long.17/ A progress report on the development of the CEFR-J. In Evelina D. Galaczi and Cyril J. Weir, editors, Exploring language frameworks: Proceedings of the ALTE Kraków Conference, July 2011, 1st edition, number 36 in Studies in language testing. Ca...
[32] Daiki Nishihara and Tomoyuki Kajiwara. 2020. https://aclanthology.org/2020.lrec-1.381/ Word complexity estimation for Japanese lexical simplification. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 3114–3120, Marseille, France. European Language Resources Association.
[33] Adam Nohejl, Akio Hayakawa, Yusuke Ide, and Taro Watanabe. 2024. https://doi.org/10.18653/v1/2024.tsar-1.8 Difficult for whom? A study of Japanese lexical complexity. In Proceedings of the Third Workshop on Text Simplification, Accessibility and Readability (TSAR 2024), pages 69–81, Miami, FL, USA. Association for Computational Linguistics.
[34] Kai North and Marcos Zampieri. 2023. https://doi.org/10.3389/frai.2023.1236963 Features of lexical complexity: insights from L1 and L2 speakers. Frontiers in Artificial Intelligence, 6:1236963.
[35] Kai North, Marcos Zampieri, and Matthew Shardlow. 2023. https://doi.org/10.1145/3557885 Lexical complexity prediction: An overview. ACM Computing Surveys, 55(9):1–42.
[36] Terence Odlin. 1989. https://doi.org/10.1017/CBO9781139524537 Language Transfer: Cross-Linguistic Influence in Language Learning, 1st edition. Cambridge University Press, Cambridge, UK.
[37] Momose Oyama, Sho Yokoi, and Hidetoshi Shimodaira. 2023. https://doi.org/10.18653/v1/2023.emnlp-main.131 Norm of word embedding encodes information gain. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 2108–2130, Singapore. Association for Computational Linguistics.
[38] Gustavo Paetzold and Lucia Specia. 2016. https://doi.org/10.18653/v1/S16-1085 SemEval 2016 task 11: Complex word identification. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pages 560–569, San Diego, CA, USA. Association for Computational Linguistics.
[39] Alessio Palmero Aprosio, Stefano Menini, and Sara Tonelli. 2020. https://doi.org/10.1145/3340631.3394857 Adaptive complex word identification through false friend detection. In Proceedings of the 28th ACM Conference on User Modeling, Adaptation and Personalization, pages 192–200, Genoa, Italy. ACM.
[40] Elke Peters. 2019. https://www.routledge.com/The-Routledge-Handbook-of-Vocabulary-Studies/Webb/p/book/9781138735729 Factors affecting the learning of single-word items. In Stuart Webb, editor, The Routledge Handbook of Vocabulary Studies, 1st edition, Routledge handbooks in linguistics, pages 125–142. Routledge, Taylor & Francis Group, London, UK.
[41] Liudmila Prokhorenkova, Gleb Gusev, Aleksandr Vorobev, Anna Veronika Dorogush, and Andrey Gulin. 2018. https://dl.acm.org/doi/10.5555/3327757.3327770 CatBoost: Unbiased boosting with categorical features. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, NIPS '18, pages 6639–6649, Montréal, Canada. Curran ...
[42] Real Academia Española. 2025. https://www.rae.es/corpes/ Corpus del Español del Siglo XXI (CORPES).
[43] Håkan Ringbom. 1987. The Role of the First Language in Foreign Language Learning, 1st edition. Number 34 in Multilingual matters. Multilingual Matters, Clevedon, UK.
[44] Håkan Ringbom and Scott Jarvis. 2009. https://doi.org/10.1002/9781444315783.ch7 The importance of cross-linguistic similarity in foreign language learning. In Michael H. Long and Catherine J. Doughty, editors, The Handbook of Language Teaching, 1st edition. Wiley, Clevedon, UK.
[45] Susanne Rott. 1999. https://doi.org/10.1017/S0272263199004039 The effect of exposure frequency on intermediate language learners' incidental vocabulary acquisition and retention through reading. Studies in Second Language Acquisition, 21(4):589–619.
[46] Norbert Schmitt, Karen Dunn, Barry O'Sullivan, Laurence Anthony, and Benjamin Kremmel. 2021. https://doi.org/10.1002/tesj.622 Introducing knowledge-based vocabulary lists (KVL). TESOL Journal, 12(4):e622.
[47] Norbert Schmitt and Diane Schmitt. 2020. https://doi.org/10.1017/9781108569057 Vocabulary in Language Teaching, 2nd edition. Cambridge University Press, Cambridge, UK.
[48] Lloyd S. Shapley. 1953. https://doi.org/10.1515/9781400881970-018 A Value for n-Person Games. In Harold William Kuhn and Albert William Tucker, editors, Contributions to the Theory of Games (AM-28), Volume II, pages 307–318. Princeton University Press.
[49] Matthew Shardlow, Richard Evans, Gustavo Henrique Paetzold, and Marcos Zampieri. 2021. https://doi.org/10.18653/v1/2021.semeval-1.1 SemEval-2021 task 1: Lexical complexity prediction. In Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), pages 1–16, Online. Association for Computational Linguistics.
[50] Matthew Shardlow et al. 2024. https://aclanthology.org/2024.bea-1.51/ The BEA 2024 shared task on the multilingual lexical simplification pipeline. In Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024), pages 571–589, Mexico City, Mexico. Association for Computational Linguistics.
[51] Lucy Skidmore, Mariano Felice, and Karen Dunn. 2025. https://doi.org/10.18653/v1/2025.bea-1.12 Transformer architectures for vocabulary test item difficulty prediction. In Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025), pages 160–174, Vienna, Austria. Association for Computational Linguistics.
[52] Anaïs Tack, Thomas François, Anne-Laure Ligozat, and Cédrick Fairon. 2016. https://aclanthology.org/2016.jeptalnrecital-long.17/ Modèles adaptatifs pour prédire automatiquement la compétence lexicale d'un apprenant de français langue étrangère. In Actes de la conférence conjointe JEP-TALN-RECITAL 2016, Paris, France. AFCP - ATALA.
[53] Raquel Perez Urdaniz and Sophia Skoufaki. 2022. https://doi.org/10.1515/applirev-2018-0109 Spanish L1 EFL learners' recognition knowledge of English academic vocabulary: The role of cognateness, word frequency and length. Applied Linguistics Review, 13(4):661–703.
[54] Janet G. Van Hell and Andrea Candia Mahn. 1997. https://doi.org/10.1111/0023-8333.00018 Keyword mnemonics versus rote rehearsal: Learning concrete and abstract foreign words by experienced and inexperienced learners. Language Learning, 47(3):507–546.
[55] Walter J. B. Van Heuven, Pawel Mandera, Emmanuel Keuleers, and Marc Brysbaert. 2014. https://doi.org/10.1080/17470218.2013.850521 Subtlex-UK: A new and improved word frequency database for British English. Quarterly Journal of Experimental Psychology, 67(6):1176–1190.
[56] Geoffrey S Watson. 1964. https://www.jstor.org/stable/25049340 Smooth regression analysis. Sankhyā: The Indian Journal of Statistics, Series A, 26(4):359–372.
[57] Seid Muhie Yimam, Chris Biemann, Shervin Malmasi, Gustavo Paetzold, Lucia Specia, Sanja Štajner, Anaïs Tack, and Marcos Zampieri. 2018. https://doi.org/10.18653/v1/W18-0507 A report on the Complex Word Identification shared task 2018. In Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications, pages 66-...
[58] Tatu Ylonen. 2022. https://aclanthology.org/2022.lrec-1.140/ Wiktextract: Wiktionary as machine-readable structured data. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 1317–1325, Marseille, France. European Language Resources Association.