and Titone, Debra A
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CL (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
Language models show idiom decomposability correlates weakly with human judgments, negatively with syntactic flexibility, and contributes most strongly to representation stabilization during training alongside surprisal and frequency.
citing papers explorer
- Multilingual Idioms in Sentences and Conversations Across High-, Medium-, and Low-Resource Languages
MIDI is a new multilingual idiom dataset with sentence and conversational contexts; benchmarking reveals worse performance in low-resource languages and on literal vs. figurative uses.
- Rethinking the Idiomaticity Decomposability Hypothesis: Evidence from Distributional Learning
Language models show idiom decomposability correlates weakly with human judgments, negatively with syntactic flexibility, and contributes most strongly to representation stabilization during training alongside surprisal and frequency.