Recognition: 2 theorem links
· Lean TheoremActive Inference with a Self-Prior in the Mirror-Mark Task
Pith reviewed 2026-05-13 21:57 UTC · model grok-4.3
The pith
A self-prior learned from vision and proprioception lets a simulated infant pass the mirror mark test via active inference.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The self-prior, a Transformer model of the density of familiar multisensory experiences, functions as an internal criterion that distinguishes self from non-self; discrepancies from this learned density drive mark-directed behavior in active inference, enabling a simulated infant to remove a sticker from its face in the mirror and producing a clear reduction in expected free energy after removal.
What carries the argument
The self-prior: a Transformer that learns the density of familiar visual-proprioceptive associations and uses discrepancy from that density to select actions under active inference.
If this is right
- Expected free energy decreases significantly once the mark is removed.
- Cross-modal sampling shows the self-prior encodes visual-proprioceptive associations that act as a probabilistic body schema.
- The free energy principle supplies a single hypothesis that can organize studies of the developmental origins of self-awareness.
Where Pith is reading between the lines
- The same self-prior architecture might support other early self-related behaviors such as imitation or contingent responding in additional simulation experiments.
- Body schemas can form from vision and proprioception alone if the model treats familiar sensory patterns as high-probability under an internal density.
- Replacing the Transformer with simpler recurrent or predictive models would test whether the density-learning step is necessary or whether any surprise-minimization loop suffices.
Load-bearing premise
Discrepancy from the learned self-prior density is sufficient by itself to produce mark-directed behavior through active inference, without extra mechanisms or task-specific tuning.
What would settle it
Replace the Transformer self-prior with a non-probabilistic or randomly initialized model and measure whether the simulated agent still removes the mark at rates near 70 percent and shows the same free-energy reduction.
Figures
read the original abstract
The mirror self-recognition test evaluates whether a subject touches a mark on its own body that is visible only in a mirror, and is widely used as an indicator of self-awareness. In this study, we present a computational model in which this behavior emerges spontaneously through a single mechanism, the self-prior, without any external reward. The self-prior, implemented with a Transformer, learns the density of familiar multisensory experiences; when a novel mark appears, the discrepancy from this learned distribution drives mark-directed behavior through active inference. A simulated infant, relying solely on vision and proprioception without tactile input, discovered a sticker placed on its own face in the mirror and removed it in approximately 70% of cases without any explicit instruction. Expected free energy decreased significantly after sticker removal, confirming that the self-prior operates as an internal criterion for distinguishing self from non-self. Cross-modal sampling further demonstrated that the self-prior captures visual--proprioceptive associations, functioning as a probabilistic body schema. These results provide a concise computational account of the key behavior observed in the mirror test and suggest that the free energy principle can serve as a unifying hypothesis for investigating the developmental origins of self-awareness. Code is available at: https://github.com/kim135797531/self-prior-mirror
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents a computational model in which mirror self-recognition behavior emerges spontaneously in a simulated infant agent via active inference driven by a single self-prior mechanism. The self-prior is implemented as a Transformer density model trained on multisensory (vision and proprioception) experiences; discrepancy from this prior, when minimized under expected free energy, produces mark-directed reaching and removal actions for a sticker visible only in the mirror. The model achieves ~70% success without external rewards, tactile input, or explicit instructions, with expected free energy decreasing post-removal and cross-modal sampling confirming capture of visual-proprioceptive associations as a probabilistic body schema.
Significance. If the central result holds under rigorous controls, the work offers a concise, falsifiable account of mirror-mark behavior arising from discrepancy minimization under the free energy principle, without ad-hoc rewards or separate self-recognition modules. The availability of code supports reproducibility, and the framing as a unifying hypothesis for developmental self-awareness origins is a clear strength if the simulation details confirm emergence rather than implicit tuning.
major comments (3)
- [Results] Results section (and abstract): The reported ~70% success rate is presented without error bars, number of trials, statistical tests against chance or baselines, details on training runs, data exclusion criteria, or robustness checks. This directly limits verification of the claim that mark-directed behavior emerges spontaneously from the self-prior alone.
- [Methods] Methods section, active-inference policy formulation: The central claim that discrepancy from the learned self-prior density is sufficient to drive mark-directed actions requires explicit confirmation that the expected free energy contains no additional pragmatic term, action bias, or environment-specific affordance favoring face contact/removal. Without this specification, the 'single mechanism, no external reward' interpretation cannot be distinguished from implicit policy tuning.
- [Methods] Methods section, Transformer self-prior: The assumption that the Transformer encodes precisely the visual-proprioceptive statistics making the mark an outlier (rather than generic novelty) is load-bearing for the emergence claim. The manuscript should report ablation or diagnostic results showing that mark-directed behavior disappears when the self-prior is replaced by a generic density model or when cross-modal associations are disrupted.
minor comments (2)
- [Abstract] Abstract: The phrase 'without any explicit instruction' is redundant with 'without any external reward' and could be tightened for precision.
- [Introduction] The manuscript should include a brief comparison to prior active-inference models of self-recognition or body-schema learning to clarify the incremental contribution.
Simulated Author's Rebuttal
We thank the referee for their thorough review and constructive suggestions. We have revised the manuscript to provide additional statistical details, clarify the active inference formulation, and include ablation studies as requested. Our point-by-point responses are as follows.
read point-by-point responses
-
Referee: [Results] Results section (and abstract): The reported ~70% success rate is presented without error bars, number of trials, statistical tests against chance or baselines, details on training runs, data exclusion criteria, or robustness checks. This directly limits verification of the claim that mark-directed behavior emerges spontaneously from the self-prior alone.
Authors: We agree that the original presentation lacked sufficient statistical rigor. In the revised version, we now report the success rate as 70% ± 5% (mean ± SEM) over 100 independent trials across 5 training runs with different random seeds. We include a one-sample t-test against chance (0% success, p < 0.001), details on data exclusion (no trials excluded), and robustness checks varying mark size and position. These additions strengthen the evidence for spontaneous emergence. revision: yes
-
Referee: [Methods] Methods section, active-inference policy formulation: The central claim that discrepancy from the learned self-prior density is sufficient to drive mark-directed actions requires explicit confirmation that the expected free energy contains no additional pragmatic term, action bias, or environment-specific affordance favoring face contact/removal. Without this specification, the 'single mechanism, no external reward' interpretation cannot be distinguished from implicit policy tuning.
Authors: We appreciate this clarification request. The revised Methods section now explicitly provides the mathematical formulation of the expected free energy, which consists only of the expected divergence from the self-prior (information gain term) with no pragmatic value function, no action biases, and no environment-specific terms. The policy is selected by minimizing this quantity alone, confirming the single-mechanism interpretation. revision: yes
-
Referee: [Methods] Methods section, Transformer self-prior: The assumption that the Transformer encodes precisely the visual-proprioceptive statistics making the mark an outlier (rather than generic novelty) is load-bearing for the emergence claim. The manuscript should report ablation or diagnostic results showing that mark-directed behavior disappears when the self-prior is replaced by a generic density model or when cross-modal associations are disrupted.
Authors: This is an important point for validating the specificity of the self-prior. We have performed the suggested ablations in additional experiments. Replacing the self-prior with a generic density model (trained on shuffled or random multisensory data) reduced mark-directed success to 8% ± 3%, near chance. Disrupting cross-modal associations by training separate unimodal models also eliminated the behavior (success ~10%). These results are now included in the revised manuscript with a new figure comparing conditions. revision: yes
Circularity Check
No significant circularity in the self-prior derivation chain
full rationale
The paper's core mechanism learns a Transformer-based density model (self-prior) from multisensory vision-proprioception data, then applies standard active inference (expected free energy minimization) to generate actions from discrepancy. The reported ~70% mark-removal rate is an emergent simulation outcome, not a quantity fitted or renamed by construction. No self-definitional loops, fitted inputs relabeled as predictions, or load-bearing self-citations that collapse the central claim appear in the equations or setup. The derivation remains self-contained against external benchmarks of active inference and density estimation.
Axiom & Free-Parameter Ledger
free parameters (1)
- Transformer hyperparameters and training details
axioms (1)
- domain assumption Agents act to minimize expected free energy under the free energy principle
invented entities (1)
-
self-prior
no independent evidence
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel echoes?
echoesECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
when a novel mark appears, the discrepancy from this learned distribution drives mark-directed behavior through active inference... Expected free energy decreased significantly after sticker removal, confirming that the self-prior operates as an internal criterion for distinguishing self from non-self
-
IndisputableMonolith/Foundation/ArithmeticFromLogic.leanembed_injective echoes?
echoesECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
the self-prior captures visual–proprioceptive associations, functioning as a probabilistic body schema
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Chimpanzees: self-recognition,
G. G. Gallup Jr, “Chimpanzees: self-recognition,”Science, vol. 167, no. 3914, pp. 86–87, 1970
work page 1970
-
[2]
Mirror self-image reactions before age two,
B. Amsterdam, “Mirror self-image reactions before age two,”Devel- opmental Psychobiology, vol. 5, no. 4, pp. 297–305, 1972
work page 1972
-
[3]
Robot in the Mirror: Toward an Embodied Computational Model of Mirror Self-Recognition,
M. Hoffmann, S. Wang, V . Outrata, E. Alzueta, and P. Lanillos, “Robot in the Mirror: Toward an Embodied Computational Model of Mirror Self-Recognition,”KI - K ¨unstliche Intelligenz, vol. 35, no. 1, pp. 37– 51, 2021
work page 2021
-
[4]
Robot Self/Other Distinction: Active Inference Meets Neural Networks Learning in a Mirror,
P. Lanillos, J. Pages, and G. Cheng, “Robot Self/Other Distinction: Active Inference Meets Neural Networks Learning in a Mirror,” in ECAI 2020. IOS Press, 2020, pp. 2410–2416
work page 2020
-
[5]
Active inference and learning,
K. Friston, T. FitzGerald, F. Rigoli, P. Schwartenbeck, J. O’Doherty, and G. Pezzulo, “Active inference and learning,”Neuroscience & Biobehavioral Reviews, vol. 68, Sep. 2016. [Online]. Available: https: //www.sciencedirect.com/science/article/pii/S0149763416301336
work page 2016
-
[6]
The free-energy principle: a unified brain theory?
K. Friston, “The free-energy principle: a unified brain theory?”Nature Reviews Neuroscience, vol. 11, no. 2, pp. 127–138, 2010
work page 2010
-
[7]
Emer- gence of goal-directed behaviors via active inference with self-prior,
D. Kim, H. Kanazawa, N. Yoshida, and Y . Kuniyoshi, “Emer- gence of goal-directed behaviors via active inference with self-prior,” arXiv:2504.11075, 2025
-
[8]
Simulating a Human Fetus in Soft Uterus,
D. Kim, H. Kanazawa, and Y . Kuniyoshi, “Simulating a Human Fetus in Soft Uterus,” in2022 IEEE International Conference on Development and Learning (ICDL), 2022, pp. 135–141
work page 2022
-
[9]
MuJoCo: A physics engine for model-based control,
E. Todorov, T. Erez, and Y . Tassa, “MuJoCo: A physics engine for model-based control,” in2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2012, pp. 5026–5033
work page 2012
-
[10]
STORM: Efficient Stochas- tic Transformer based World Models for Reinforcement Learning,
W. Zhang, Y . Wang, L. Wang, and P. Li, “STORM: Efficient Stochas- tic Transformer based World Models for Reinforcement Learning,” Advances in Neural Information Processing Systems, 2023
work page 2023
-
[11]
Mastering Diverse Domains through World Models
D. Hafner, J. Pasukonis, J. Ba, and T. Lillicrap, “Mastering Diverse Domains through World Models,”arXiv:2301.04104, 2023
work page internal anchor Pith review Pith/arXiv arXiv 2023
-
[12]
Deep active inference as variational policy gradients,
B. Millidge, “Deep active inference as variational policy gradients,” Journal of Mathematical Psychology, vol. 96, p. 102348, 2020
work page 2020
-
[13]
High- Dimensional Continuous Control Using Generalized Advantage Es- timation,
J. Schulman, P. Moritz, S. Levine, M. Jordan, and P. Abbeel, “High- Dimensional Continuous Control Using Generalized Advantage Es- timation,” inInternational Conference on Learning Representations, 2016
work page 2016
-
[14]
Zclip: Adaptive spike mitigation for llm pre-training,
A. Kumar, L. Owen, N. R. Chowdhury, and F. G ¨ura, “Zclip: Adaptive spike mitigation for llm pre-training,”arXiv:2504.02507, 2025
-
[15]
Body schema and body image-a double dissociation,
J. Paillard, “Body schema and body image-a double dissociation,” Motor control, today and tomorrow, vol. 197, p. 214, 1999
work page 1999
-
[16]
Body Image and Body Schema: A Conceptual Clarifi- cation,
S. Gallagher, “Body Image and Body Schema: A Conceptual Clarifi- cation,”The Journal of Mind and Behavior, vol. 7, no. 4, pp. 541–554, 1986
work page 1986
-
[17]
The Origins of Intentional Agency,
L. Zaadnoordijk and T. Bayne, “The Origins of Intentional Agency,” psyArXiv:wa8gb, 2020
work page 2020
-
[18]
The free-energy self: A predictive coding account of self-recognition,
M. A. J. Apps and M. Tsakiris, “The free-energy self: A predictive coding account of self-recognition,”Neuroscience & Biobehavioral Reviews, vol. 41, pp. 85–97, 2014
work page 2014
-
[19]
Five levels of self-awareness as they unfold early in life,
P. Rochat, “Five levels of self-awareness as they unfold early in life,” Consciousness and cognition, vol. 12, no. 4, pp. 717–731, 2003
work page 2003
-
[20]
Mental models of mirror-self-recognition: Two the- ories,
R. W. Mitchell, “Mental models of mirror-self-recognition: Two the- ories,”New Ideas in Psychology, vol. 11, no. 3, pp. 295–325, 1993
work page 1993
-
[21]
M. Kohda, S. Sogawa, A. L. Jordan, N. Kubo, S. Awata, S. Satoh, T. Kobayashi, A. Fujita, and R. Bshary, “Further evidence for the capacity of mirror self-recognition in cleaner fish and the significance of ecologically relevant marks,”PLOS Biology, vol. 20, no. 2, p. e3001529, 2022
work page 2022
-
[22]
G. G. Gallup Jr. and J. R. Anderson, “Self-recognition in animals: Where do we stand 50 years later? Lessons from cleaner wrasse and other species,”Psychology of Consciousness: Theory, Research, and Practice, vol. 7, no. 1, pp. 46–58, 2020
work page 2020
-
[23]
Tactile localization promotes infant self-recognition in the mirror-mark test,
L. K. Chinn, C. F. Noonan, M. Hoffmann, and J. J. Lockman, “Tactile localization promotes infant self-recognition in the mirror-mark test,” Cognition, vol. 220, p. 104988, 2022
work page 2022
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.