pith. machine review for the scientific record.

arxiv: 2605.12745 · v1 · submitted 2026-05-12 · 💻 cs.HC · cs.AI


What Do You Think I Think? Accounting for Human Beliefs Using Second-Order Theory of Mind


Pith reviewed 2026-05-14 19:33 UTC · model grok-4.3

classification 💻 cs.HC cs.AI
keywords: second-order theory of mind · I-POMDP · cognitive biases and heuristics · human-agent interaction · user study · feedback generation · belief modeling

The pith

An agent using second-order theory of mind can detect and correct for humans' mistaken beliefs about the agent's own knowledge.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that extending the I-POMDP framework to second-order theory of mind lets an agent track how a person's beliefs about the agent's knowledge evolve and what cognitive biases and heuristics drive any errors in those beliefs. Once detected, the agent can generate targeted feedback that accounts for those biases during interaction. An in-person user study found that this ToM-2 capability made teacher actions significantly more informative and that people judged the resulting feedback more useful than feedback from agents lacking the second-order model.

Core claim

Using the I-POMDP as a framework for a second-order Theory of Mind, this work endows an agent with the ability to model the evolution of a person's erroneous beliefs about an agent and the cognitive biases and heuristics from which they arise, enabling the agent to detect when such biases might be at play and adaptively generate feedback that accounts for them.

What carries the argument

I-POMDP extended to second-order Theory of Mind (ToM-2), which tracks a human's beliefs about the agent's knowledge state together with the cognitive biases and heuristics that shape those beliefs.
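
For readers unfamiliar with the nesting, the block below restates the standard finitely nested I-POMDP belief construction. Treating the agent as i, the human as j, and ToM-2 as nesting level l = 2 is an assumption made here for illustration. The paper's CBH parameters are not shown because this page does not specify them.

```latex
% Level-0 belief: a distribution over physical states only.
b_{i,0} \in \Delta(S)

% Level-l interactive states pair a physical state with a model of the other agent.
IS_{i,l} = S \times M_{j,l-1}, \qquad b_{i,l} \in \Delta(IS_{i,l})

% An intentional model of j bundles j's belief with j's frame.
\theta_{j,l-1} = \langle b_{j,l-1}, \hat{\theta}_j \rangle

% Assumed reading for ToM-2 (l = 2): the agent's belief ranges over the human's
% level-1 models, whose beliefs range over the agent's level-0 models.
b_{i,2} \in \Delta\bigl(S \times M_{j,1}\bigr), \quad
b_{j,1} \in \Delta\bigl(S \times M_{i,0}\bigr), \quad
b_{i,0} \in \Delta(S)
```

Under this reading, an erroneous human belief about the agent's knowledge is a mismatch between the agent's actual level-0 belief and the level-0 belief that the human's level-1 model attributes to it.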

Load-bearing premise

That the I-POMDP framework, when extended to second-order beliefs, can accurately capture and detect the evolution of a person's erroneous beliefs about the agent together with the specific cognitive biases and heuristics driving them.

What would settle it

A controlled user study that finds no significant increase in teacher-action informativeness when the agent uses the ToM-2 model, compared with a lower-order baseline, would falsify the central claim.
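
One way such a comparison could be tested is an exact one-sided permutation test on per-participant informativeness scores. The sketch below is a generic illustration, not the analysis the paper reports: the function name and the scores are hypothetical, and the numbers are made-up placeholders rather than study data.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treatment, baseline):
    """Exact one-sided permutation test on the difference in group means.

    Returns the fraction of relabelings whose mean difference is at least
    as large as the observed one.
    """
    pooled = list(treatment) + list(baseline)
    n_treat = len(treatment)
    n_base = len(pooled) - n_treat
    observed = mean(treatment) - mean(baseline)
    total = sum(pooled)

    at_least_as_extreme = 0
    n_splits = 0
    for idx in combinations(range(len(pooled)), n_treat):
        treat_sum = sum(pooled[i] for i in idx)
        diff = treat_sum / n_treat - (total - treat_sum) / n_base
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            at_least_as_extreme += 1
        n_splits += 1
    return at_least_as_extreme / n_splits


# HYPOTHETICAL placeholder scores, not data from the study.
tom2_scores = [0.9, 0.8, 0.85, 0.95]
baseline_scores = [0.5, 0.6, 0.55, 0.4]
p = permutation_p_value(tom2_scores, baseline_scores)
```

A large p-value from a suitably powered study would be the kind of null result described above. Enumerating every split is only practical for small groups; larger samples would need random resampling instead.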

Figures

Figures reproduced from arXiv: 2605.12745 by Henny Admoni, Patrick Callaghan, Reid Simmons.

Figure 1. An AI learner with a zeroth-order Theory of Mind
Figure 2. Proof-of-Concept Domain: Multi-featured cards are sorted into bins according to a rule. With this formulation, we conduct an in-person user study and showcase: • A novel I-POMDP-based framework that enables an agent to approximate when and how CBH affects a human’s behavior and beliefs; • This model enables an agent to generate feedback that accounts for a person’s CBH upon being detected; and • Empirica…
Figure 3. Mean Relative Information Gain: The IG of the card placed by the human normalized by the IG of that time step’s optimal card. Left: When holding feedback content constant, the ToM-0 and ToM-2 learners elicit more informative teaching due to the ToM-2 learner’s adaptive feedback process. Right: The content of the ToM-2 learner’s two-statement feedback elicits more informative teaching
Figure 4. Mean Teacher Action Count: ToM-2 learners tend to elicit fewer teacher actions before teachers successfully end the session. Marginal significance indicated by +. (M2), deliberation time (M4), or total teaching session duration (M5), suggesting participants often identified a sufficient teaching strategy regardless of the learner they taught. Given the domain’s simplicity, this result is hardly surprising…
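
The relative information gain in the Figure 3 caption (the IG of the placed card divided by the IG of the best available card) can be sketched as follows. This is a minimal illustration under stated assumptions: the learner holds a belief over a small set of candidate sorting rules, information gain is the entropy reduction after a Bayesian update that discards rules inconsistent with the demonstration, and the rules, cards, and function names are all hypothetical rather than taken from the paper.

```python
import math


def entropy(belief):
    """Shannon entropy (bits) of a dict mapping hypothesis -> probability."""
    return -sum(p * math.log2(p) for p in belief.values() if p > 0)


def update(belief, rules, card, observed_bin):
    """Keep only rules consistent with the demonstration, then renormalize."""
    post = {name: p for name, p in belief.items()
            if rules[name](card) == observed_bin}
    total = sum(post.values())
    return {name: p / total for name, p in post.items()}


def info_gain(belief, rules, card, true_rule):
    """Entropy reduction when the teacher places `card` per the true rule."""
    observed_bin = rules[true_rule](card)
    return entropy(belief) - entropy(update(belief, rules, card, observed_bin))


def relative_info_gain(belief, rules, cards, chosen, true_rule):
    """IG of the chosen card divided by the IG of the best available card."""
    best = max(info_gain(belief, rules, c, true_rule) for c in cards)
    if best <= 0:
        return 0.0
    return info_gain(belief, rules, chosen, true_rule) / best


# HYPOTHETICAL toy domain: cards are (color, shape); each rule maps a card to bin 0 or 1.
RULES = {
    "by_color": lambda c: int(c[0] == "red"),
    "by_shape": lambda c: int(c[1] == "circle"),
    "by_both": lambda c: int(c[0] == "red" and c[1] == "circle"),
    "always_0": lambda c: 0,
}
CARDS = [("red", "circle"), ("red", "square"),
         ("blue", "circle"), ("blue", "square")]
PRIOR = {name: 0.25 for name in RULES}

best_choice = relative_info_gain(PRIOR, RULES, CARDS, ("red", "square"), "by_color")
weak_choice = relative_info_gain(PRIOR, RULES, CARDS, ("red", "circle"), "by_color")
```

In this toy setup the red square separates the true rule from every alternative, so it scores as the optimal card, while the red circle leaves three rules standing and scores much lower.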
Original abstract

Discrepancies between an agent's actual knowledge and what a person thinks the agent knows can hinder interactions. If an agent could detect such discrepancies, it could provide feedback to account for them and improve current and future interactions. Using the I-POMDP as a framework for a second-order Theory of Mind (ToM-2), this work endows an agent with the ability to model the evolution of a person's erroneous beliefs about an agent and the cognitive biases and heuristics (CBH) from which they arise. In doing so, the agent can detect when CBH might be at play during an interaction and adaptively generate feedback that accounts for them. An in-person user study shows how a ToM-2 learner can account for the effects of a teacher's CBH to significantly improve the informativeness of teacher actions, and subjective results suggest people find the ToM-2 learner's feedback more useful.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes extending the I-POMDP framework to a second-order Theory of Mind (ToM-2) model that tracks the evolution of a person's erroneous beliefs about an agent's knowledge and attributes discrepancies to specific cognitive biases and heuristics (CBH). It claims that an agent using this model can detect CBH during interactions and generate adaptive feedback, with an in-person user study demonstrating significantly improved informativeness of teacher actions and higher subjective usefulness ratings compared to baselines.

Significance. If the user-study results and CBH attribution hold under proper validation, the work could advance human-AI interaction design by enabling agents to address human misconceptions in real time, with applications in tutoring systems, collaborative robotics, and assistive technologies. The integration of CBH into an established I-POMDP belief-update structure offers a concrete, falsifiable mechanism for second-order belief modeling that goes beyond standard Bayesian updates.

major comments (2)
  1. [User Study] User Study section: The abstract and results claim statistically significant improvement in teacher-action informativeness and positive subjective ratings, yet no sample size, control conditions, statistical tests, exclusion criteria, or power analysis are reported. Without these, the central claim that ToM-2 accounting for CBH drives the gains cannot be evaluated.
  2. [Modeling Approach] Modeling section (I-POMDP extension): Standard I-POMDP belief updates are Bayesian and do not encode non-Bayesian CBH such as anchoring or confirmation bias. The manuscript must specify how custom observation/transition functions inject these biases, report the fitting procedure from interaction data, and provide targeted validation (e.g., alignment of inferred CBH labels with participants' self-reports or controlled bias-induction trials) to show that performance gains are CBH-specific rather than generic second-order tracking.
minor comments (1)
  1. [Notation] Notation for second-order belief states and CBH parameters should be defined explicitly in a single table or section to improve readability.
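
Major comment 2 turns on the gap between a Bayesian update and a biased one. A minimal sketch of that gap follows, using a tempered likelihood as a stand-in for anchoring-style under-reaction to evidence. The exponent `alpha` and the function name are hypothetical, and this is not claimed to be how the paper injects CBH into its observation or transition functions.

```python
def tempered_update(prior, likelihood, alpha=1.0):
    """Posterior proportional to prior * likelihood**alpha.

    alpha = 1.0 recovers the standard Bayesian update.
    alpha < 1.0 under-weights evidence (the belief stays anchored to the prior).
    alpha = 0.0 ignores the evidence entirely.
    """
    unnormalized = {h: prior[h] * likelihood[h] ** alpha for h in prior}
    z = sum(unnormalized.values())
    return {h: v / z for h, v in unnormalized.items()}


# HYPOTHETICAL two-hypothesis example.
prior = {"rule_A": 0.5, "rule_B": 0.5}
likelihood = {"rule_A": 0.8, "rule_B": 0.2}

bayes = tempered_update(prior, likelihood, alpha=1.0)
anchored = tempered_update(prior, likelihood, alpha=0.5)
frozen = tempered_update(prior, likelihood, alpha=0.0)
```

Here the Bayesian posterior on `rule_A` is 0.8, the half-tempered one is 2/3, and the fully anchored one stays at 0.5. The validation the referee asks for amounts to showing that parameters like `alpha`, fitted to interaction data, track an actual bias rather than generic noise.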

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback. The comments highlight important areas where additional clarity and detail will strengthen the manuscript. We address each major comment below and will revise the paper to incorporate the requested information and clarifications.

Point-by-point responses
  1. Referee: [User Study] User Study section: The abstract and results claim statistically significant improvement in teacher-action informativeness and positive subjective ratings, yet no sample size, control conditions, statistical tests, exclusion criteria, or power analysis are reported. Without these, the central claim that ToM-2 accounting for CBH drives the gains cannot be evaluated.

    Authors: We agree that these details are necessary for proper evaluation of the results. In the revised manuscript we will expand the User Study section to explicitly report the sample size, fully describe all control conditions (including the ToM-1 baseline), specify the statistical tests and their results, list exclusion criteria, and include a power analysis. We will also add a supplementary analysis that isolates the contribution of CBH modeling to the observed gains.

    Revision: yes

  2. Referee: [Modeling Approach] Modeling section (I-POMDP extension): Standard I-POMDP belief updates are Bayesian and do not encode non-Bayesian CBH such as anchoring or confirmation bias. The manuscript must specify how custom observation/transition functions inject these biases, report the fitting procedure from interaction data, and provide targeted validation (e.g., alignment of inferred CBH labels with participants' self-reports or controlled bias-induction trials) to show that performance gains are CBH-specific rather than generic second-order tracking.

    Authors: We acknowledge that the current description of the I-POMDP extension does not provide sufficient technical detail on the non-Bayesian components. In the revision we will explicitly define the modified observation and transition functions that encode specific CBH (e.g., anchoring and confirmation bias), describe the parameter-fitting procedure applied to the collected interaction data, and add targeted validation results comparing inferred CBH labels against participants' self-reported biases.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity in I-POMDP ToM-2 extension

Full rationale

The paper extends the pre-existing I-POMDP framework to second-order ToM without any self-definitional reductions, fitted inputs renamed as predictions, or load-bearing self-citations that collapse the central claims. The derivation relies on standard Bayesian belief updates within I-POMDP, augmented by custom functions for CBH, but these are not shown to be tautological with the reported user-study outcomes. The in-person study supplies independent empirical evidence on informativeness and usefulness, so the central claims rest on the established I-POMDP framework and on new data rather than on circular reasoning.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that I-POMDP can represent second-order beliefs including cognitive biases; no free parameters or invented entities are specified in the abstract.

axioms (1)
  • Domain assumption: I-POMDP can be extended to model the evolution of a person's erroneous beliefs about an agent and the cognitive biases and heuristics from which they arise.
    This assumption underpins the agent's ability to detect CBH and generate corrective feedback.

pith-pipeline@v0.9.0 · 5456 in / 1181 out tokens · 77621 ms · 2026-05-14T19:33:56.397800+00:00 · methodology

