Proceedings of the National Academy of Sciences

Evaluating large language models in theory of mind tasks · 2024

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Are you with me? A Framework for Detecting Mental Model Discrepancies in Task-Based Team Dialogues

cs.AI · 2026-05-04 · unverdicted · novelty 6.0

A new framework identifies four mental model discrepancy types in team dialogues and demonstrates they carry predictive signals for future misalignments via uniform-weighted historical counts.

Does Theory of Mind Improvement Really Benefit Human-AI Interactions? Empirical Findings from Interactive Evaluations

cs.AI · 2026-04-28 · conditional · novelty 6.0

Improvements in LLM Theory of Mind on static benchmarks do not reliably improve performance in dynamic, first-person human-AI interactions across goal-oriented and experience-oriented tasks.

Modeling Pathology-Like Behavioral Patterns in Language Models Through Behavioral Fine-Tuning

cs.CL · 2026-05-21 · unverdicted · novelty 5.0

Fine-tuning LLMs on structured tasks inspired by maladaptive behaviors produces stable, context-general shifts in next-token distributions and response tendencies consistent with altered behavioral priors.

Do LLMs have core beliefs?

cs.LG · 2026-05-05 · unverdicted · novelty 5.0

LLMs generally fail to maintain stable worldviews under adversarial conversational pressure, indicating they lack core beliefs akin to those in human cognition.

citing papers explorer

Showing 4 of 4 citing papers.

Are you with me? A Framework for Detecting Mental Model Discrepancies in Task-Based Team Dialogues

cs.AI · 2026-05-04 · unverdicted · none · ref 5

Does Theory of Mind Improvement Really Benefit Human-AI Interactions? Empirical Findings from Interactive Evaluations

cs.AI · 2026-04-28 · conditional · none · ref 10

Modeling Pathology-Like Behavioral Patterns in Language Models Through Behavioral Fine-Tuning

cs.CL · 2026-05-21 · unverdicted · none · ref 34

Do LLMs have core beliefs?

cs.LG · 2026-05-05 · unverdicted · none · ref 24
