Minding language models’ (lack of) theory of mind: A plug-and-play multi-character belief tracker

Association for Computational Linguistics · 2023 · DOI 10.18653/v1/2023.acl-long.780

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

OmniToM: Benchmarking Theory of Mind in LLMs via Explicit Belief Modeling

cs.AI · 2026-05-25 · unverdicted · novelty 7.0

OmniToM is a new benchmark for Theory of Mind in LLMs that evaluates explicit belief extraction and seven-dimensional labeling from 895 stories, revealing an actor-specific belief-tracking bottleneck.

Bayesian Social Deduction with Graph-Informed Language Models

cs.AI · 2025-06-21 · unverdicted · novelty 7.0

A hybrid Bayesian-graph LLM agent reaches competitive performance against large models and achieves a 67% win rate against humans in controlled Avalon play, outperforming baselines and human teammates.

AURA: Intent-Directed Probing for Implicit-Need Surfacing in Situated LLM Agents

cs.CL · 2026-06-04 · unverdicted · novelty 6.0

AURA improves implicit-need coverage by 0.07 over ReAct baselines on a 100-query benchmark by inserting an intent-inference step controlled by a gap score, while cutting probes by 82% on factual tasks.

PDDL-Mind: Large Language Models are Capable on Belief Reasoning with Reliable State Tracking

cs.CL · 2026-04-20 · unverdicted · novelty 6.0

PDDL-Mind improves LLM accuracy on theory-of-mind benchmarks by over 5% by translating stories into verifiable PDDL states that decouple environment tracking from belief inference.

citing papers explorer

OmniToM: Benchmarking Theory of Mind in LLMs via Explicit Belief Modeling · cs.AI · 2026-05-25 · unverdicted · none · ref 26
Bayesian Social Deduction with Graph-Informed Language Models · cs.AI · 2025-06-21 · unverdicted · none · ref 48
