Light or Full Verb? A Minimal-Pair Dataset for Probing Phraseological Competence in Language Models
Pith reviewed 2026-06-28 06:38 UTC · model grok-4.3
The pith
Language models distinguish light-verb from full-verb uses of the same verb in minimal sentence contexts.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Using a large-scale controlled dataset of minimally varying English sentence series in which the same context contains the same verb in light-verb and full-verb uses, the study shows that language models differentiate between these uses even in minimal contexts and exhibit separable patterns across object types.
What carries the argument
Minimal-pair sentence series that keep the verb and surrounding context constant while varying only its role as light verb or full predicate.
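As a rough illustration of the idea (not the authors' released generation code), a series can be built by filling one fixed context template with different objects, so that the verb and context stay identical and only the object slot decides the reading. The template string, the `Item` type, and the `build_series` helper are assumptions made up for this sketch; only the examples 'make a decision' and 'make a cake' come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    sentence: str
    verb: str
    obj: str
    use: str  # "light" or "full"


def build_series(template, verb, light_objects, full_objects):
    """Fill one context template with every object; only the object slot varies."""
    items = []
    for use, objects in (("light", light_objects), ("full", full_objects)):
        for obj in objects:
            sentence = template.format(verb=verb, obj=obj)
            items.append(Item(sentence, verb, obj, use))
    return items


# Hypothetical template; the object phrases are the abstract's own examples.
series = build_series("She will {verb} {obj}.", "make", ["a decision"], ["a cake"])
for item in series:
    print(item.use, "|", item.sentence)
```

Keeping the use label on each item makes it straightforward to group model scores by construction later.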
If this is right
- Models register phraseological distinctions without additional contextual cues beyond the verb-object pair.
- Model behavior varies systematically with object type, reflecting sensitivity to collocational preferences.
- The released dataset and code enable controlled testing of the same distinction for additional verbs and languages.
- Phraseological competence can be isolated and measured through minimal contrasts rather than varied full sentences.
Where Pith is reading between the lines
- The observed separation may arise from statistical regularities in training data rather than explicit semantic rules.
- Applying the same minimal-pair method to other languages could test whether the differentiation generalizes beyond English.
- Fine-tuning models on this dataset might improve handling of idiomatic versus literal verb uses in downstream tasks.
Load-bearing premise
The minimal-pair sentences isolate only the light-verb versus full-verb distinction without other uncontrolled linguistic factors affecting model behavior.
What would settle it
If language models produced statistically indistinguishable outputs or internal representations for the light-verb and full-verb versions of the same minimal sentence pairs, the claim of differentiation would be falsified.
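One way to make "statistically indistinguishable" concrete is a paired sign-flip permutation test on per-pair score differences. This is a sketch of a generic test, not the analysis the paper reports: this page does not describe the paper's statistics, and the per-sentence scores (for example surprisal at the object) are assumed to be computed elsewhere. The function name and parameters are hypothetical.

```python
import random


def paired_sign_flip_test(light_scores, full_scores, n_resamples=10_000, seed=0):
    """Two-sided p-value for the mean paired difference under random sign flips."""
    diffs = [l - f for l, f in zip(light_scores, full_scores)]
    observed = abs(sum(diffs) / len(diffs))
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_resamples):
        # Under the null hypothesis, the light/full labels within a pair are exchangeable.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(sum(flipped) / len(flipped)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_resamples + 1)
```

A large p-value across models and verbs would be the pattern that counts against the differentiation claim; a small one only shows that the two conditions differ, not why they differ.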
Original abstract
Frequent English verbs such as 'have' and 'make' can function either as collocates in light-verb constructions or as full lexical predicates, as in 'make a decision' vs. 'make a cake'. Whether language models represent this distinction remains unclear. We introduce a large-scale controlled dataset of minimally varying English sentence series in which the same context contains the same verb in light-verb and full-verb uses. Two probing experiments show that language models differentiate between these uses even in minimal contexts and exhibit separable patterns across object types. We release the dataset, generation code, and materials as a reusable resource. The framework supports extensions to broader contexts, additional verbs, and other languages.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces a large-scale controlled dataset of minimally varying English sentence series in which the same context contains the same verb in light-verb and full-verb uses (e.g., 'make a decision' vs. 'make a cake'). Two probing experiments are reported to show that language models differentiate these uses even in minimal contexts and exhibit separable patterns across object types. The dataset, generation code, and materials are released as a reusable resource.
Significance. If the minimal pairs successfully isolate the constructional distinction without confounds from object semantics, this dataset would provide a valuable, extensible benchmark for evaluating phraseological competence in language models and could support comparative work across verbs and languages.
major comments (2)
- [Dataset construction / Abstract] The central claim that the sentence series isolate the light-verb versus full-verb distinction (Abstract) is load-bearing, yet the generation procedure supplies no quantitative evidence that object nouns are matched for concreteness, frequency, animacy, or argument structure; these properties differ systematically between light-verb and full-verb objects and could allow models to succeed without representing the constructional contrast itself.
- [Probing experiments / Abstract] The abstract asserts that two probing experiments demonstrate differentiation, yet supplies no information on experimental design, controls, statistical tests, or sample details, so it is impossible to judge whether the reported patterns are supported by the data.
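The matching evidence requested in the first comment could take the form of a standardized mean difference per property between the two object sets. The sketch below computes Cohen's d with a pooled standard deviation; the function name is hypothetical, and the property values (concreteness ratings, log frequencies, and so on) would have to come from external norms that this page does not specify.

```python
from statistics import mean, stdev


def standardized_mean_difference(light_values, full_values):
    """Cohen's d with a pooled standard deviation (each set needs at least 2 values)."""
    n1, n2 = len(light_values), len(full_values)
    s1, s2 = stdev(light_values), stdev(full_values)
    pooled = (((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)) ** 0.5
    # Raises ZeroDivisionError if both sets are constant; real data should not be.
    return (mean(light_values) - mean(full_values)) / pooled
```

Values of d near zero for each property would support the claim that the object sets are matched; large values would flag a potential confound.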
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We respond to each major point below and note planned revisions to strengthen the manuscript.
- Referee: [Dataset construction / Abstract] The central claim that the sentence series isolate the light-verb versus full-verb distinction (Abstract) is load-bearing, yet the generation procedure supplies no quantitative evidence that object nouns are matched for concreteness, frequency, animacy, or argument structure; these properties differ systematically between light-verb and full-verb objects and could allow models to succeed without representing the constructional contrast itself.
  Authors: The minimal-pair design holds the verb and sentential context constant, so that the sole systematic difference between paired items is the light- versus full-verb use of that verb. Object nouns were chosen on the basis of attested corpus collocations to ensure naturalness for each construction. We acknowledge, however, that the generation procedure description does not include quantitative matching statistics for concreteness, frequency, animacy or argument structure. We will add a supplementary table reporting these properties for the object sets and will discuss any residual differences in the revised manuscript.
  Revision: yes
- Referee: [Probing experiments / Abstract] The abstract asserts that two probing experiments demonstrate differentiation, yet supplies no information on experimental design, controls, statistical tests, or sample details, so it is impossible to judge whether the reported patterns are supported by the data.
  Authors: Abstracts are length-constrained summaries; the experimental design, controls, statistical tests and sample details are fully reported in the Methods and Results sections of the manuscript. No revision to the abstract is required, as the body of the paper already supplies the requested information.
  Revision: no
Circularity Check
No circularity: purely empirical dataset construction and probing with no derivations or self-referential fits
The paper constructs a minimal-pair dataset for light vs. full verb uses and runs probing experiments on language models to test differentiation. No equations, parameters, or derivations appear in the abstract or described content. Claims rest on experimental outcomes from the released dataset rather than any reduction to fitted inputs, self-citations, or ansatzes. The work is self-contained as an empirical resource release with no load-bearing circular steps.