arXiv preprint arXiv:2401.08743

MMToM-QA: Multimodal Theory of Mind Question Answering · 2023 · arXiv 2401.08743

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Mind the Motions: Benchmarking Theory-of-Mind in Everyday Body Language

cs.CL · 2025-11-19 · unverdicted · novelty 7.0

Motion2Mind is a curated video benchmark for Theory-of-Mind reasoning from nonverbal body-language cues. It reveals a substantial gap between AI and human performance in detection, along with over-interpretation in explanations.

Reinforcing Human Behavior Simulation via Verbal Feedback

cs.LG · 2026-05-19 · unverdicted · novelty 6.0

DITTO uses RL with verbal feedback to train LLMs for human behavior simulation, reporting 36% average gains over base models and outperforming GPT-5.4 on 6 of 10 SOUL benchmark tasks.

PDDL-Mind: Large Language Models are Capable on Belief Reasoning with Reliable State Tracking

cs.CL · 2026-04-20 · unverdicted · novelty 6.0

PDDL-Mind improves LLM accuracy on theory-of-mind benchmarks by over 5% by translating stories into verifiable PDDL states that decouple environment tracking from belief inference.

citing papers explorer

Showing 3 of 3 citing papers.

Mind the Motions: Benchmarking Theory-of-Mind in Everyday Body Language cs.CL · 2025-11-19 · unverdicted · none · ref 3
Reinforcing Human Behavior Simulation via Verbal Feedback cs.LG · 2026-05-19 · unverdicted · none · ref 14
PDDL-Mind: Large Language Models are Capable on Belief Reasoning with Reliable State Tracking cs.CL · 2026-04-20 · unverdicted · none · ref 62
