pith. machine review for the scientific record

arxiv: 2512.24329 · v1 · submitted 2025-12-30 · 💻 cs.CL

Recognition: 2 theorem links


World model inspired sarcasm reasoning with large language model agents

Authors on Pith: no claims yet

Pith reviewed 2026-05-16 18:50 UTC · model grok-4.3

classification 💻 cs.CL
keywords sarcasm detection · world model · LLM agents · inconsistency score · intention reasoning · natural language processing · interpretable AI

The pith

World model agents detect sarcasm by measuring inconsistency between literal meaning and speaker intention.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reformulates sarcasm detection as a structured reasoning process that breaks an utterance into literal meaning, surrounding context, normative expectations, and speaker intentions, each handled by its own LLM agent. It computes a deterministic inconsistency score from the gap between the literal evaluation and the normative expectation, then feeds that score together with an intention score into a simple logistic regression model to output a sarcasm probability. This design keeps the final decision numerically interpretable while using the agents to capture the cognitive mismatch that defines sarcasm. Experiments show the approach beats both traditional deep learning models and other LLM baselines on standard sarcasm benchmarks, with ablations confirming that the inconsistency and intention components are essential.
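The decision layer described above can be sketched in a few lines. The agent outputs (literal evaluation, normative expectation, intention score) are stubbed with plain numbers here; in the paper they come from specialized LLM agents, and the score ranges and logistic weights below are illustrative assumptions, not values the authors report.

```python
import math

def inconsistency_score(literal_eval: float, norm_expectation: float) -> float:
    """Deterministic gap between literal evaluation and normative expectation."""
    return literal_eval - norm_expectation

def sarcasm_probability(d: float, intention: float,
                        w_d: float = 2.0, w_i: float = 1.5, b: float = -1.0) -> float:
    """Logistic regression over the inconsistency and intention scores.
    Weights w_d, w_i, b are hypothetical placeholders."""
    z = w_d * abs(d) + w_i * intention + b
    return 1.0 / (1.0 + math.exp(-z))

# "What a great day" said about a cancelled flight: positive literal
# evaluation, negative normative expectation -> large inconsistency gap.
d = inconsistency_score(literal_eval=0.9, norm_expectation=-0.8)
p = sarcasm_probability(d, intention=0.7)
```

When literal evaluation and normative expectation agree in sign, the gap shrinks and the probability drops toward the intention-only baseline, which is the numerically interpretable behavior the paper claims.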

Core claim

WM-SAR decomposes sarcasm understanding into literal meaning, context, normative expectation, and intention using specialized LLM-based agents. The discrepancy between literal evaluation and normative expectation is quantified as a deterministic inconsistency score, which together with an intention score is integrated by logistic regression to infer sarcasm probability, yielding superior performance and interpretability on representative sarcasm detection benchmarks.

What carries the argument

The WM-SAR framework of specialized LLM agents that extract literal meaning, normative expectations, and intentions, then combine a deterministic inconsistency score with an intention score through logistic regression.

If this is right

  • The method supplies explicit numerical signals that explain why a given utterance is classified as sarcastic.
  • Explicit separation of literal meaning from normative expectation allows the model to handle cases where surface wording conflicts with social norms.
  • The lightweight logistic regression layer preserves interpretability even when the underlying agents are large language models.
  • Ablation results indicate that removing either the inconsistency score or the intention component measurably degrades benchmark performance.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same agent decomposition could be tested on related phenomena such as irony or indirect speech acts.
  • Running the inconsistency score on live social-media streams might expose how quickly normative expectations shift across communities.
  • Replacing the logistic regression with a small neural combiner could be tested to see whether any performance gain justifies the loss of direct numerical transparency.

Load-bearing premise

That LLM agents can reliably and consistently extract literal meaning, normative expectations, and intentions so the derived inconsistency score remains stable and the logistic regression produces a valid sarcasm probability.

What would settle it

A collection of human-labeled sarcastic utterances whose computed inconsistency scores show no systematic difference from those of non-sarcastic utterances.
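That falsification test is a two-sample comparison, and a minimal version needs nothing beyond the standard library: if human-labeled sarcastic and non-sarcastic utterances yield the same distribution of inconsistency scores, the premise fails. The score values below are illustrative stand-ins, not data from the paper.

```python
import random

def permutation_pvalue(a, b, n_iter=5000, seed=0):
    """One-sided permutation test on the difference of group means."""
    rng = random.Random(seed)
    observed = sum(a) / len(a) - sum(b) / len(b)
    pooled = a + b
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        diff = sum(perm_a) / len(perm_a) - sum(perm_b) / len(perm_b)
        if diff >= observed:
            hits += 1
    return hits / n_iter

sarcastic = [1.6, 1.4, 1.8, 1.2, 1.5, 1.7]   # |inconsistency| scores (illustrative)
literal   = [0.2, 0.4, 0.1, 0.3, 0.5, 0.2]
p_value = permutation_pvalue(sarcastic, literal)
```

A small p-value means the score does separate the groups; a p-value near 0.5 on real human-labeled data would be the settling evidence against the premise.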

Original abstract

Sarcasm understanding is a challenging problem in natural language processing, as it requires capturing the discrepancy between the surface meaning of an utterance and the speaker's intentions as well as the surrounding social context. Although recent advances in deep learning and Large Language Models (LLMs) have substantially improved performance, most existing approaches still rely on black-box predictions of a single model, making it difficult to structurally explain the cognitive factors underlying sarcasm. Moreover, while sarcasm often emerges as a mismatch between semantic evaluation and normative expectations or intentions, frameworks that explicitly decompose and model these components remain limited. In this work, we reformulate sarcasm understanding as a world model inspired reasoning process and propose World Model inspired SArcasm Reasoning (WM-SAR), which decomposes literal meaning, context, normative expectation, and intention into specialized LLM-based agents. The discrepancy between literal evaluation and normative expectation is explicitly quantified as a deterministic inconsistency score, and together with an intention score, these signals are integrated by a lightweight Logistic Regression model to infer the final sarcasm probability. This design leverages the reasoning capability of LLMs while maintaining an interpretable numerical decision structure. Experiments on representative sarcasm detection benchmarks show that WM-SAR consistently outperforms existing deep learning and LLM-based methods. Ablation studies and case analyses further demonstrate that integrating semantic inconsistency and intention reasoning is essential for effective sarcasm detection, achieving both strong performance and high interpretability.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces WM-SAR, a framework inspired by world models for sarcasm reasoning using large language model agents. It decomposes the task into specialized agents for literal meaning, context, normative expectation, and intention. A deterministic inconsistency score is computed from the discrepancy between literal evaluation and normative expectation, combined with an intention score using logistic regression to predict sarcasm probability. The manuscript reports that this approach outperforms existing deep learning and LLM-based methods on sarcasm detection benchmarks, with ablation studies confirming the necessity of the semantic inconsistency and intention reasoning components.

Significance. If the results hold after addressing reproducibility, the work contributes an interpretable, modular approach to sarcasm detection that explicitly models key cognitive elements like inconsistency and intention, potentially improving both performance and explainability in NLP applications involving figurative language and social context. The hybrid design with lightweight logistic regression on LLM agents balances reasoning power with numerical transparency.

major comments (2)
  1. [Abstract] The assertion of a 'deterministic inconsistency score' in the abstract lacks any specification of mechanisms to control for the inherent stochasticity of LLMs, such as setting temperature to 0, employing greedy decoding, or fixing random seeds. This is load-bearing for the central empirical claims because the score is used in the logistic regression and the ablation studies rely on it to demonstrate the importance of the inconsistency component; without such controls, the results may vary across runs and the interpretability is compromised.
  2. [Experiments] Details on how the logistic regression coefficients are obtained are insufficient. If they are fitted using the same benchmark data as the evaluation, this introduces circularity that could inflate performance metrics and weaken the cross-benchmark claims of consistent outperformance.
minor comments (1)
  1. [Abstract] The abstract would benefit from including at least high-level quantitative results, specific benchmark names, or dataset sizes to allow readers to immediately gauge the magnitude of the reported improvements.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on reproducibility and experimental details. We address each point below and will revise the manuscript to incorporate clarifications that strengthen the claims.

Point-by-point responses
  1. Referee: [Abstract] The assertion of a 'deterministic inconsistency score' in the abstract lacks any specification of mechanisms to control for the inherent stochasticity of LLMs, such as setting temperature to 0, employing greedy decoding, or fixing random seeds. This is load-bearing for the central empirical claims because the score is used in the logistic regression and the ablation studies rely on it to demonstrate the importance of the inconsistency component; without such controls, the results may vary across runs and the interpretability is compromised.

    Authors: We agree that explicit controls for stochasticity must be stated to support the determinism claim and the ablation results. In our implementation, all LLM agents used temperature=0 with greedy decoding and fixed random seeds to produce deterministic outputs for literal evaluation and normative expectation. We will revise the abstract to note these controls and add a methods subsection detailing the exact decoding parameters, ensuring the inconsistency score remains fully reproducible. revision: yes
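The reproducibility control the response promises is straightforward to verify in a harness: call the scoring pipeline repeatedly and require identical outputs. The scorer below is a deterministic stand-in; in practice it would wrap the LLM agents with temperature=0 and greedy decoding, which is exactly what the determinism claim assumes.

```python
def literal_minus_norm(utterance: str) -> float:
    # Stand-in for the agent pipeline: any fixed pure function works here.
    # A real check would invoke the LLM agents with temperature=0 / greedy
    # decoding and a fixed seed.
    return round(len(utterance) % 7 / 7.0, 3)

def check_determinism(score_fn, utterance: str, runs: int = 5) -> bool:
    """True iff repeated runs of the scorer return byte-identical scores."""
    scores = [score_fn(utterance) for _ in range(runs)]
    return all(s == scores[0] for s in scores)

stable = check_determinism(literal_minus_norm, "What a great day to be stuck here")
```

Publishing such a harness alongside the decoding parameters would let readers confirm the inconsistency score is deterministic rather than taking the abstract's word for it.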

  2. Referee: [Experiments] Details on how the logistic regression coefficients are obtained are insufficient. If they are fitted using the same benchmark data as the evaluation, this introduces circularity that could inflate performance metrics and weaken the cross-benchmark claims of consistent outperformance.

    Authors: The logistic regression is fitted exclusively on a held-out training split (via cross-validation) that is disjoint from all evaluation benchmark test sets, avoiding any circularity. Coefficients are learned to combine the inconsistency and intention scores on training data only, after which the fixed model is applied to the test benchmarks. We will expand the experiments section with the precise fitting procedure, data splits, and hyperparameters to make this transparent. revision: yes
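The fitting discipline the response describes can be made concrete: the logistic regression sees only the training split, is frozen, and is then applied to the held-out test split. The data and hyperparameters below are synthetic illustrations, not the paper's.

```python
import math
import random

rng = random.Random(42)
# (inconsistency, intention) -> sarcastic? Synthetic, roughly separable.
data = [((rng.uniform(1.0, 2.0), rng.uniform(0.5, 1.0)), 1) for _ in range(40)] + \
       [((rng.uniform(0.0, 0.5), rng.uniform(0.0, 0.5)), 0) for _ in range(40)]
rng.shuffle(data)
train, test = data[:60], data[60:]           # disjoint splits

w = [0.0, 0.0]; b = 0.0
for _ in range(500):                          # plain SGD on the train split only
    for (x1, x2), y in train:
        p = 1.0 / (1.0 + math.exp(-(w[0] * x1 + w[1] * x2 + b)))
        g = p - y
        w[0] -= 0.1 * g * x1; w[1] -= 0.1 * g * x2; b -= 0.1 * g

def predict(x1, x2):
    """Frozen model applied to unseen data; no further fitting."""
    return 1.0 / (1.0 + math.exp(-(w[0] * x1 + w[1] * x2 + b))) >= 0.5

accuracy = sum(predict(*x) == bool(y) for x, y in test) / len(test)
```

Reporting accuracy only from the held-out split, as here, is what would dissolve the circularity objection; reporting it from the fitting data would not.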

Circularity Check

1 step flagged

Logistic regression coefficients fitted to benchmark data reduce the final sarcasm probability to a data-driven fit

specific steps
  1. fitted-input-called-prediction [Abstract (integration step)]
    "the discrepancy between literal evaluation and normative expectation is explicitly quantified as a deterministic inconsistency score, and together with an intention score, these signals are integrated by a lightweight Logistic Regression model to infer the final sarcasm probability"

    The inconsistency and intention scores are produced by the LLM agents; the LR then combines them into the final probability. Because the LR coefficients are fitted directly to the benchmark labels used for reported accuracy and ablation results, the 'prediction' of sarcasm is statistically forced by the same data rather than emerging from the world-model structure alone.

full rationale

The paper's central inference step extracts literal/normative scores via LLM agents then feeds them into logistic regression whose parameters are learned from the same sarcasm detection benchmarks used for final evaluation. This matches the fitted-input-called-prediction pattern: the reported performance and ablation gains are not independent predictions but outputs of a supervised combiner trained on the evaluation distribution. No evidence of held-out parameter fitting or external validation of the LR step is provided in the abstract or described method, creating moderate circular dependence even though the agent decomposition itself is not self-referential.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; ledger is minimal. The method assumes LLMs can extract the four components reliably and that the inconsistency score is a stable, deterministic quantity independent of prompt variation.

pith-pipeline@v0.9.0 · 5543 in / 1087 out tokens · 45207 ms · 2026-05-16T18:50:04.545573+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel · echoes

    ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

    The discrepancy between literal evaluation and normative expectation is explicitly quantified as a deterministic inconsistency score... D(u, C(u)) = M_literal(u) − E_norm(C(u))... SD(u, C(u)) = I[sgn(M_literal(u)) ≠ sgn(E_norm(C(u)))]

  • IndisputableMonolith/Foundation/ArrowOfTime.lean forward_accumulates · echoes


    reformulate sarcasm understanding as a world model inspired reasoning process... observation→latent state→prediction→prediction error→decision

What do these tags mean?
matches: the paper's claim is directly supported by a theorem in the formal canon.
supports: the theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: the paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: the paper appears to rely on the theorem as machinery.
contradicts: the paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
