What Do You Think I Think? Accounting for Human Beliefs Using Second-Order Theory of Mind
Pith reviewed 2026-05-14 19:33 UTC · model grok-4.3
The pith
An agent using second-order theory of mind can detect and correct for humans' mistaken beliefs about the agent's own knowledge.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Using the I-POMDP as a framework for a second-order Theory of Mind, this work endows an agent with the ability to model the evolution of a person's erroneous beliefs about an agent and the cognitive biases and heuristics from which they arise, enabling the agent to detect when such biases might be at play and adaptively generate feedback that accounts for them.
What carries the argument
I-POMDP extended to second-order Theory of Mind (ToM-2), which tracks a human's beliefs about the agent's knowledge state together with the cognitive biases and heuristics that shape those beliefs.
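The nesting can be written out using the usual I-POMDP convention of interactive states and intentional models (Gmytrasiewicz & Doshi, 2005). This is only a sketch: the subscripts a (agent) and h (human), the restriction to intentional models, and the bias parameter \phi attached to the human's model are assumptions made for illustration, since the paper's own notation is not reproduced on this page.

```latex
\begin{align*}
b_{a,0} &\in \Delta(S)
  && \text{level 0: the agent's belief over physical states, as imagined by the human} \\
\theta_{a,0} &= \langle b_{a,0}, \hat{\theta}_a \rangle
  && \text{level-0 model of the agent (belief plus frame)} \\
b_{h,1} &\in \Delta(S \times \Theta_{a,0})
  && \text{level 1: the human's belief about the state and the agent's model} \\
\theta_{h,1} &= \langle b_{h,1}, \hat{\theta}_h, \phi \rangle
  && \text{level-1 model of the human, with an assumed CBH parameter } \phi \\
b_{a,2} &\in \Delta(S \times \Theta_{h,1})
  && \text{level 2: the agent's belief about the human's belief about the agent}
\end{align*}
```

Read from the bottom up, the agent's level-2 belief ranges over candidate human models, each of which contains a (possibly erroneous) belief about what the agent knows.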
Load-bearing premise
That the I-POMDP framework, when extended to second-order beliefs, can accurately capture and detect the evolution of a person's erroneous beliefs about the agent together with the specific cognitive biases and heuristics driving them.
What would settle it
A controlled user study that finds no significant increase in teacher-action informativeness when the agent uses the ToM-2 model, compared with a first-order (ToM-1) baseline, would count strongly against the central claim.
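A comparison of this kind could be run as a simple permutation test on per-participant informativeness scores. The sketch below is an assumption about how such a check might look, not the analysis used in the paper: the function name, the one-sided hypothesis, and the placeholder scores are all hypothetical.

```python
import random
from statistics import mean


def permutation_test(tom2_scores, baseline_scores, n_resamples=10_000, seed=0):
    """One-sided permutation test for mean(tom2_scores) > mean(baseline_scores).

    Returns the observed difference in means and an estimated p-value.
    """
    rng = random.Random(seed)
    n_tom2 = len(tom2_scores)
    observed = mean(tom2_scores) - mean(baseline_scores)
    pooled = list(tom2_scores) + list(baseline_scores)

    at_least_as_extreme = 0
    for _ in range(n_resamples):
        # Randomly reassign condition labels under the null hypothesis.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_tom2]) - mean(pooled[n_tom2:])
        if diff >= observed:
            at_least_as_extreme += 1

    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (at_least_as_extreme + 1) / (n_resamples + 1)
    return observed, p_value


if __name__ == "__main__":
    # Placeholder numbers for illustration only; these are NOT study data.
    tom2 = [0.9, 0.8, 0.85, 0.95]
    baseline = [0.1, 0.2, 0.15, 0.05]
    diff, p = permutation_test(tom2, baseline)
    print(f"difference in means = {diff:.3f}, p = {p:.4f}")
```

A large p-value from a well-powered study would be the kind of null result described above; a small sample could produce the same p-value without saying much either way.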
Original abstract
Discrepancies between an agent's actual knowledge and what a person thinks the agent knows can hinder interactions. If an agent could detect such discrepancies, it could provide feedback to account for them and improve current and future interactions. Using the I-POMDP as a framework for a second-order Theory of Mind (ToM-2), this work endows an agent with the ability to model the evolution of a person's erroneous beliefs about an agent and the cognitive biases and heuristics (CBH) from which they arise. In doing so, the agent can detect when CBH might be at play during an interaction and adaptively generate feedback that accounts for them. An in-person user study shows how a ToM-2 learner can account for the effects of a teacher's CBH to significantly improve the informativeness of teacher actions, and subjective results suggest people find the ToM-2 learner's feedback more useful.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes extending the I-POMDP framework to a second-order Theory of Mind (ToM-2) model that tracks the evolution of a person's erroneous beliefs about an agent's knowledge and attributes discrepancies to specific cognitive biases and heuristics (CBH). It claims that an agent using this model can detect CBH during interactions and generate adaptive feedback, with an in-person user study demonstrating significantly improved informativeness of teacher actions and higher subjective usefulness ratings compared to baselines.
Significance. If the user-study results and CBH attribution hold under proper validation, the work could advance human-AI interaction design by enabling agents to address human misconceptions in real time, with applications in tutoring systems, collaborative robotics, and assistive technologies. The integration of CBH into an established I-POMDP belief-update structure offers a concrete, falsifiable mechanism for second-order belief modeling that goes beyond standard Bayesian updates.
major comments (2)
- [User Study] User Study section: The abstract and results claim statistically significant improvement in teacher-action informativeness and positive subjective ratings, yet no sample size, control conditions, statistical tests, exclusion criteria, or power analysis are reported. Without these, the central claim that ToM-2 accounting for CBH drives the gains cannot be evaluated.
- [Modeling Approach] Modeling section (I-POMDP extension): Standard I-POMDP belief updates are Bayesian and do not encode non-Bayesian CBH such as anchoring or confirmation bias. The manuscript must specify how custom observation/transition functions inject these biases, report the fitting procedure from interaction data, and provide targeted validation (e.g., alignment of inferred CBH labels with participants' self-reports or controlled bias-induction trials) to show that performance gains are CBH-specific rather than generic second-order tracking.
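To make the Modeling Approach comment concrete, the sketch below contrasts a plain Bayesian update with two simple non-Bayesian variants. These parameterizations (mixing the posterior with the prior for anchoring, tempering the likelihood for confirmation bias) are illustrative assumptions, not the functions used in the manuscript; all names and the `anchor` and `discount` parameters are hypothetical.

```python
def bayes_update(prior, likelihood):
    """Standard Bayesian update over a discrete set of hypotheses."""
    unnorm = {h: prior[h] * likelihood[h] for h in prior}
    z = sum(unnorm.values())
    if z == 0:
        raise ValueError("evidence has zero probability under every hypothesis")
    return {h: v / z for h, v in unnorm.items()}


def anchored_update(prior, likelihood, anchor=0.5):
    """Anchoring sketch (assumed form): the posterior is pulled back toward the prior.

    anchor=0 recovers the Bayesian update; anchor=1 ignores the evidence.
    """
    post = bayes_update(prior, likelihood)
    return {h: anchor * prior[h] + (1 - anchor) * post[h] for h in prior}


def confirmation_update(prior, likelihood, discount=0.5):
    """Confirmation-bias sketch (assumed form): disconfirming evidence is tempered.

    If the evidence does not favor the currently preferred hypothesis, the
    likelihoods are raised to the power `discount` (0..1), which flattens them.
    discount=1 recovers the Bayesian update.
    """
    favored = max(prior, key=prior.get)
    disconfirming = likelihood[favored] < max(likelihood.values())
    if disconfirming:
        likelihood = {h: l ** discount for h, l in likelihood.items()}
    return bayes_update(prior, likelihood)


if __name__ == "__main__":
    # Hypothetical example: the teacher's belief about whether the learner knows a concept.
    prior = {"knows": 0.7, "does_not_know": 0.3}
    # Evidence (e.g. a learner mistake) that is more likely if the learner does not know.
    likelihood = {"knows": 0.2, "does_not_know": 0.8}
    print("bayes       ", bayes_update(prior, likelihood))
    print("anchored    ", anchored_update(prior, likelihood))
    print("confirmation", confirmation_update(prior, likelihood))
```

In both biased variants the teacher's belief that the learner "knows" drops less than a Bayesian observer's would, which is the kind of discrepancy a second-order model would need to detect. Showing that gains are CBH-specific would still require fitting and validating such parameters against data, as the comment requests.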
minor comments (1)
- [Notation] Notation for second-order belief states and CBH parameters should be defined explicitly in a single table or section to improve readability.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback. The comments highlight important areas where additional clarity and detail will strengthen the manuscript. We address each major comment below and will revise the paper to incorporate the requested information and clarifications.
- Referee: [User Study] User Study section: The abstract and results claim statistically significant improvement in teacher-action informativeness and positive subjective ratings, yet no sample size, control conditions, statistical tests, exclusion criteria, or power analysis are reported. Without these, the central claim that ToM-2 accounting for CBH drives the gains cannot be evaluated.
  Authors: We agree that these details are necessary for proper evaluation of the results. In the revised manuscript we will expand the User Study section to explicitly report the sample size, fully describe all control conditions (including the ToM-1 baseline), specify the statistical tests and their results, list exclusion criteria, and include a power analysis. We will also add a supplementary analysis that isolates the contribution of CBH modeling to the observed gains.
  Revision: yes
- Referee: [Modeling Approach] Modeling section (I-POMDP extension): Standard I-POMDP belief updates are Bayesian and do not encode non-Bayesian CBH such as anchoring or confirmation bias. The manuscript must specify how custom observation/transition functions inject these biases, report the fitting procedure from interaction data, and provide targeted validation (e.g., alignment of inferred CBH labels with participants' self-reports or controlled bias-induction trials) to show that performance gains are CBH-specific rather than generic second-order tracking.
  Authors: We acknowledge that the current description of the I-POMDP extension does not provide sufficient technical detail on the non-Bayesian components. In the revision we will explicitly define the modified observation and transition functions that encode specific CBH (e.g., anchoring and confirmation bias), describe the parameter-fitting procedure applied to the collected interaction data, and add targeted validation results comparing inferred CBH labels against participants' self-reported biases.
  Revision: yes
Circularity Check
No significant circularity in I-POMDP ToM-2 extension
The paper extends the pre-existing I-POMDP framework to second-order ToM without any self-definitional reductions, fitted inputs renamed as predictions, or load-bearing self-citations that collapse the central claims. The derivation relies on standard Bayesian belief updates within I-POMDP, augmented by custom functions for CBH, but these are not shown to be tautological with the reported user-study outcomes. The in-person study supplies independent empirical evidence on informativeness and usefulness, so the central claims rest on an externally established framework and on new data rather than on circular reasoning.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: I-POMDP can be extended to model the evolution of a person's erroneous beliefs about an agent and the cognitive biases and heuristics from which they arise