arXiv preprint arXiv:2412.12175
3 Pith papers cite this work, all from 2026 and all currently unverdicted. Polarity classification is still indexing.
citing papers explorer
- Theory of Mind and Persuasion Beyond Conversation: Assessing the Capacity of LLMs to Induce Belief States via Planning and Action
  Introduces the NCP-ExploreToM framework for evaluating LLMs on inducing belief states via planning and action; GPT-5 succeeds on ~80% of tasks and outperforms humans.
- MindZero: Learning Online Mental Reasoning With Zero Annotations
  MindZero is a self-supervised RL framework that trains MLLMs for online Theory of Mind reasoning by rewarding mental-state hypotheses that best explain observed actions via a planner, then distills this into fast inference.
- Political Plasticity: An Analysis of Ideological Adaptability in Large Language Models
  LLMs display political plasticity via prompt-driven ideological adaptation that is more reliable in larger, newer models, but inverted questions produce counterintuitive shifts, suggesting data leakage.