pith. machine review for the scientific record.

arxiv: 2604.18296 · v1 · submitted 2026-04-20 · 💻 cs.CL


Exploring Concreteness Through a Figurative Lens


Pith reviewed 2026-05-10 05:26 UTC · model grok-4.3

classification 💻 cs.CL
keywords concreteness · figurative language · LLM representations · layer-wise analysis · geometric direction · metaphor · representation space · steering generation

The pith

LLMs encode concreteness as a single, consistent one-dimensional direction in their mid-to-late layers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper investigates how large language models internally represent word concreteness, which can shift between literal and figurative senses depending on context. It shows that early layers separate literal from figurative uses of the same noun while mid-to-late layers compress the distinction into a single direction shared across model families. A sympathetic reader would care because this simple geometric organization turns a complex semantic property into something directly usable for classifying figurative language and adjusting generation without retraining.

Core claim

The authors demonstrate that LLMs separate literal and figurative usage in early layers, and that mid-to-late layers compress concreteness into a one-dimensional direction that is consistent across models. This geometric structure supports efficient figurative-language classification and enables training-free steering of generation toward more literal or more figurative rewrites.

What carries the argument

The one-dimensional concreteness direction in hidden representation space that organizes literal versus figurative interpretations of nouns.
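As a sketch of what such a direction is, the simplest construction is a difference-of-means (DiffMean) estimator over paired hidden states; the function name, array shapes, and toy data below are illustrative assumptions, not the authors' code.

```python
import numpy as np

def concreteness_direction(h_concrete, h_abstract):
    """Estimate a single 'concreteness' direction as the normalized
    difference of mean hidden states (a DiffMean-style estimator).

    h_concrete, h_abstract: (n_examples, hidden_dim) arrays of one
    layer's hidden states for tokens in concrete vs. abstract contexts.
    """
    d = h_concrete.mean(axis=0) - h_abstract.mean(axis=0)
    return d / np.linalg.norm(d)

# Toy illustration: two clusters separated along one axis of a 4-d
# "hidden space" stand in for literal vs. figurative contexts.
rng = np.random.default_rng(0)
axis = np.array([1.0, 0.0, 0.0, 0.0])
h_conc = rng.normal(0, 0.1, (50, 4)) + axis
h_abst = rng.normal(0, 0.1, (50, 4)) - axis
d = concreteness_direction(h_conc, h_abst)
# d is unit-norm and points approximately along the separating axis
```

On real models the inputs would be hidden states collected at a single layer for matched stimulus pairs; the estimator itself stays one line.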

If this is right

  • Early layers perform the separation between literal and figurative contexts.
  • A single direction vector enables training-free classification of figurative language.
  • Manipulating the direction allows steering of generated text toward literal or figurative styles.
  • The structure remains consistent across four different model families.
  • This geometry provides a practical handle on context-dependent semantics in representation space.
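The steering item above amounts, in sketch form, to adding a scaled copy of the direction to a layer's activations; the function name, the sign convention, and the scale `alpha` are hypothetical choices for illustration, not taken from the paper.

```python
import numpy as np

def steer(hidden, direction, alpha):
    """Shift hidden states along a concreteness direction.

    hidden:    (seq_len, hidden_dim) activations at one layer
    direction: (hidden_dim,) unit vector; in this sketch +alpha pushes
               toward the concrete/literal end, -alpha toward figurative.
    """
    return hidden + alpha * direction

# Toy check: the projection onto the direction moves by exactly alpha.
rng = np.random.default_rng(1)
d = np.zeros(8)
d[0] = 1.0
h = rng.normal(size=(5, 8))
h2 = steer(h, d, alpha=3.0)
shift = (h2 - h) @ d  # per-token change in projection
```

In a real model this edit would be applied inside a forward hook at the chosen mid-to-late layers; everything orthogonal to the direction is left untouched.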

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same low-dimensional compression might appear for other shifting semantic properties such as specificity or sentiment.
  • Identifying such directions could supply lightweight interpretability tools for controlling stylistic output in deployed systems.
  • The pattern may generalize to additional figurative phenomena like idioms or sarcasm if tested on broader datasets.
  • If the direction proves stable, it could reduce reliance on supervised fine-tuning for style control tasks.

Load-bearing premise

The observed one-dimensional direction genuinely reflects the model's internal handling of concreteness rather than an artifact of training data statistics or the specific choice of nouns and contexts examined.

What would settle it

If the same direction vector fails to reliably classify new literal-versus-figurative examples or to steer generation in the predicted direction across additional word sets or models, the claim would be falsified.
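That falsification test can be operationalized as a projection-then-AUROC check on held-out items; this is a minimal sketch that assumes labeled held-out hidden states and synthesizes toy data in their place.

```python
import numpy as np

def auroc(pos, neg):
    """AUROC = P(score_pos > score_neg), by exhaustive pairwise comparison."""
    pos, neg = np.asarray(pos), np.asarray(neg)
    gt = (pos[:, None] > neg[None, :]).sum()
    eq = (pos[:, None] == neg[None, :]).sum()
    return (gt + 0.5 * eq) / (len(pos) * len(neg))

# Toy held-out set: project hidden states onto a fixed direction and
# score how well the projections separate literal from figurative items.
rng = np.random.default_rng(2)
d = np.zeros(4)
d[0] = 1.0
h_lit = rng.normal(0, 0.1, (40, 4)) + d   # "literal" held-out items
h_fig = rng.normal(0, 0.1, (40, 4)) - d   # "figurative" held-out items
score = auroc(h_lit @ d, h_fig @ d)
# The paper's claim predicts scores well above 0.5 on genuinely new
# word sets; scores near chance would falsify it.
```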

Figures

Figures reproduced from arXiv: 2604.18296 by Saptarshi Ghosh, Tianyu Jiang.

Figure 1: Layer-wise Pearson correlation between static …
Figure 3: Layer-wise AUROC for separating high and …
Figure 4: Mean δ across layers in Llama-3.1-8B, for verbs. Early high separation is followed by moderate to low separation in the middle to later layers.
Figure 5: Word frequency distribution in the 25,000 …
Figure 7: AUROC score for classifying high and low …
Figure 8: Prompt for generating contextual concreteness …
Figure 9: Prompt for generating static concreteness …
Figure 10: Pearson correlation between embeddings of models and concreteness scores from …
Figure 11: Mean δ for all layers for remaining models.
Figure 12: AUROC score for classifying high and low concrete nouns using one-directional geometric subspace for …
Figure 13: Prompt for generating static concreteness token.
Figure 14: Annotation guidelines.
Original abstract

Static concreteness ratings are widely used in NLP, yet a word's concreteness can shift with context, especially in figurative language such as metaphor, where common concrete nouns can take abstract interpretations. While such shifts are evident from context, it remains unclear how LLMs understand concreteness internally. We conduct a layer-wise and geometric analysis of LLM hidden representations across four model families, examining how models distinguish literal vs figurative uses of the same noun and how concreteness is organized in representation space. We find that LLMs separate literal and figurative usage in early layers, and that mid-to-late layers compress concreteness into a one-dimensional direction that is consistent across models. Finally, we show that this geometric structure is practically useful: a single concreteness direction supports efficient figurative-language classification and enables training-free steering of generation toward more literal or more figurative rewrites.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript conducts a layer-wise geometric analysis of LLM hidden representations to examine how models encode concreteness in literal versus figurative uses of the same nouns. Across four model families, it reports that early layers separate literal and figurative usages while mid-to-late layers compress concreteness into a consistent one-dimensional direction; this direction is then shown to support training-free figurative-language classification and steering of generation toward more literal or figurative outputs.

Significance. If the geometric claims hold after controls, the work advances interpretability by identifying a compressed, cross-model axis for a context-dependent semantic property. The practical demonstrations of classification and steering add applied value, and the consistency finding (if not reducible to stimulus artifacts) would be a useful benchmark for representation geometry studies.

major comments (2)
  1. [§4.2] Cross-model consistency analysis: the reported one-dimensional concreteness direction and its alignment across models may be driven by co-occurrence statistics in the paired literal/figurative noun stimuli rather than by an internal model property. Because the same nouns appear in both conditions, any systematic difference in training-data contexts (syntactic frames, collocates, or sentiment) can induce a spurious linear direction; without ablations on unmatched nouns, frequency-matched controls, or out-of-distribution items, the generality claim is not yet adequately supported.
  2. [§3.3] Stimulus and direction extraction: the method for extracting the concreteness direction (difference vectors, linear probes, or PCA) is not shown to be invariant to the specific choice of figurative contexts. The central compression claim requires evidence that the direction remains stable when the literal/figurative contrast is decorrelated from lexical identity; the current paired design leaves this open.
minor comments (2)
  1. [Abstract] The abstract would be strengthened by reporting at least one key quantitative result (e.g., classification accuracy or steering success rate with error bars) rather than qualitative descriptions alone.
  2. [Figures] Figure captions for the layer-wise plots should explicitly state the number of stimuli, models, and the precise metric used to measure 'consistency' of the direction (e.g., cosine similarity threshold or alignment score).
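The "consistency" metric the second minor comment asks for could be made explicit as a cosine-similarity stability check between directions extracted from disjoint stimulus splits; the splits, dimensions, and sampling below are synthetic stand-ins, not the paper's stimuli or metric.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two direction vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def diffmean(h_pos, h_neg):
    """Unit-norm difference-of-means direction."""
    d = h_pos.mean(axis=0) - h_neg.mean(axis=0)
    return d / np.linalg.norm(d)

# Two disjoint stimulus splits drawn from the same underlying contrast:
# a stable direction should be nearly identical across splits.
rng = np.random.default_rng(3)
axis = np.zeros(16)
axis[0] = 1.0

def sample_split(n):
    return (rng.normal(0, 0.2, (n, 16)) + axis,
            rng.normal(0, 0.2, (n, 16)) - axis)

d_a = diffmean(*sample_split(100))
d_b = diffmean(*sample_split(100))
stability = cosine(d_a, d_b)  # near 1.0 when stimulus-invariant
```

The same scalar, reported with a threshold (e.g. cosine above some cutoff), would make the cross-model and cross-split consistency claims directly checkable.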

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback, which raises valid points about potential stimulus confounds in our geometric analysis. We address each major comment below and commit to revisions that strengthen the claims without overstating current evidence.

Point-by-point responses
  1. Referee: [§4.2] Cross-model consistency analysis: the reported one-dimensional concreteness direction and its alignment across models may be driven by co-occurrence statistics in the paired literal/figurative noun stimuli rather than by an internal model property. Because the same nouns appear in both conditions, any systematic difference in training-data contexts (syntactic frames, collocates, or sentiment) can induce a spurious linear direction; without ablations on unmatched nouns, frequency-matched controls, or out-of-distribution items, the generality claim is not yet adequately supported.

    Authors: We agree that the paired-noun design, while controlling for lexical identity, permits possible co-occurrence confounds. The paired approach was selected to isolate context-driven concreteness shifts for identical nouns, following standard practices in figurative-language studies. Cross-model alignment of the direction offers partial support against pure artifacts, as training-data differences across families make identical spurious directions unlikely. To directly test this, we will add ablations using unmatched nouns (frequency- and concreteness-matched but unpaired) and out-of-distribution items, reporting whether the 1D direction and its cross-model consistency persist. These controls will be included in the revised manuscript. revision: yes

  2. Referee: [§3.3] Stimulus and direction extraction: the method for extracting the concreteness direction (difference vectors, linear probes, or PCA) is not shown to be invariant to the specific choice of figurative contexts. The central compression claim requires evidence that the direction remains stable when the literal/figurative contrast is decorrelated from lexical identity; the current paired design leaves this open.

    Authors: The difference-vector method was chosen precisely to decorrelate usage from noun identity within each pair. We acknowledge that full decorrelation from lexical identity requires additional tests. In revision we will add an invariance check: extract the direction from one set of figurative contexts, then evaluate classification and steering performance on held-out figurative contexts and unmatched nouns. We will also compare difference vectors against PCA and linear-probe variants to demonstrate method robustness. These results will be reported to support the compression claim. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical analysis of representations

Full rationale

The paper conducts a layer-wise geometric analysis of existing LLM hidden states on literal vs. figurative noun pairs, reporting an observed one-dimensional direction in mid-to-late layers without any equations, derivations, or fitted parameters that reduce the claimed structure to its inputs by construction. No self-definitional steps, fitted-input predictions, or load-bearing self-citations appear in the provided abstract or description; the consistency across models and downstream utility are presented as empirical observations rather than constructed results. The work remains self-contained as an analysis of model behavior on the chosen stimuli.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The abstract relies on the standard domain assumption that hidden representations encode semantic distinctions such as concreteness, but introduces no free parameters, new axioms, or invented entities.

axioms (1)
  • domain assumption: Hidden representations in LLMs encode semantic properties, including concreteness and its contextual shifts.
    This assumption underpins the layer-wise and geometric analysis described in the abstract.

pith-pipeline@v0.9.0 · 5435 in / 1260 out tokens · 40841 ms · 2026-05-10T05:26:55.790759+00:00 · methodology

discussion (0)

