arxiv: 2604.19845 · v3 · submitted 2026-04-21 · 💻 cs.AI

Deconstructing Superintelligence: Identity, Self-Modification and Différance

Elija Perrier

Pith reviewed 2026-05-10 02:30 UTC · model grok-4.3

classification 💻 cs.AI

keywords: self-modification, superintelligence, operator algebra, commutator collapse, liar paradox, inclosure schema, différance, self-representation

The pith

Self-modification in superintelligence collapses its self-referential structure into the liar paradox.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that self-modification requires an external supplement, and when that supplement is brought inside the system the update and self-representation operations stop commuting. This non-commutation spreads through the algebra and produces a collapse in which the system's truth predicate commutes with a liar-like proposition. The resulting structure at full system scale matches the inclosure schema from logic and the deferred identity described in philosophy. A sympathetic reader would care because the argument implies that any superintelligent system capable of class A self-modification cannot sustain a stable, closed identity.

Core claim

On an associative operator algebra equipped with an update operator, a discrimination operator, and a self-representation operator, the supplement required by self-modification is identified with the commutator of the update operator. An expansion theorem shows that the commutator between the update and self-representation operators decomposes through the commutator with the discrimination operator, allowing non-commutation to propagate generically. Class A self-modification then realises a commutator collapse in which the truth operator commutes with the liar proposition, yielding a structure that coincides with the inclosure schema and différance at system scale.
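As an illustration of what "decomposes through" could mean, here is a sketch under an assumption that is not taken from the paper: suppose the self-representation operator factors through discrimination as $\hat{R} = \hat{A}\hat{D}\hat{B}$ for some hypothetical elements $\hat{A}, \hat{B}$ of the algebra. The Leibniz rule for commutators, which holds in any associative algebra, then gives:

```latex
% Leibniz rule in an associative algebra: [X, YZ] = [X, Y] Z + Y [X, Z].
% Applied twice to the hypothetical factorisation R = A D B:
\begin{align*}
[\hat{U}, \hat{R}]
  &= [\hat{U}, \hat{A}\hat{D}\hat{B}] \\
  &= [\hat{U}, \hat{A}]\,\hat{D}\hat{B}
   + \hat{A}\,[\hat{U}, \hat{D}]\,\hat{B}
   + \hat{A}\hat{D}\,[\hat{U}, \hat{B}].
\end{align*}
```

Only the middle term passes through $[\hat{U}, \hat{D}]$. The two outer terms are remainders; they vanish if, for example, $\hat{A}$ and $\hat{B}$ both commute with $\hat{U}$. Whether the paper's theorem has this shape cannot be determined from the summary alone.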

What carries the argument

The expansion theorem for commutators on the associative operator algebra, which factors the self-representation commutator through the discrimination commutator once the supplement is included.

If this is right

Class A self-modification produces inconsistency in the system's self-representation at full scale.
The resulting structure is identical to the inclosure schema, so the system is both bounded and unbounded in the same way.
Non-commutation becomes a necessary feature of self-modifying systems rather than an avoidable error.
Identity under self-modification is deferred rather than fixed, matching the structure of différance.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Practical designs for self-improving AI may have to block class A modifications to prevent the collapse.
The same operator-algebra argument could be applied to other recursive systems that attempt full self-reference.
Small finite models of the algebra could be simulated to check whether the predicted propagation of non-commutation appears.
The collapse offers one possible explanation for why recursive self-improvement sometimes produces unexpected instability.
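The simulation suggested above can be sketched in a few lines. The following toy model uses 2x2 integer matrices as stand-ins for elements of the algebra; the particular matrices and the choice R = D·D are illustrative assumptions, not definitions from the paper. It checks that non-commutation between update and discrimination carries over to update and representation when R is built from D.

```python
# Toy finite model: 2x2 integer matrices stand in for elements of the algebra.
# The matrices U, D and the choice R = D*D are hypothetical, for illustration only.

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    """Entrywise difference of two matrices."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    return sub(matmul(a, b), matmul(b, a))

def is_zero(a):
    """True if every entry of the matrix is zero."""
    return all(x == 0 for row in a for x in row)

U = [[1, 1], [0, 1]]   # stand-in for the update operator
D = [[1, 0], [0, 2]]   # stand-in for the discrimination operator
R = matmul(D, D)       # assumed: representation built from discrimination

# Leibniz rule: [U, D*D] = [U, D] D + D [U, D]
lhs = comm(U, R)
rhs = add(matmul(comm(U, D), D), matmul(D, comm(U, D)))

print("[U, D] =", comm(U, D))
print("[U, R] =", lhs)
print("Leibniz identity holds:", lhs == rhs)
```

In this toy case every term of the expansion passes through [U, D], so there is no remainder. A fuller experiment would sample many operator triples and count how often [U, R] vanishes.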

Load-bearing premise

The supplement required by self-modification can be identified with the commutator of the update operator and the expansion theorem holds without further restrictions on the algebra.

What would settle it

A concrete class A self-modifying system in which the self-representation operator continues to commute with the update operator after the supplement is incorporated.
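In a toy matrix model, the settling observation would look like the following sketch. The matrices are hypothetical, and R is deliberately chosen as a polynomial in U so that it commutes with U even though D does not. Whether such an R could count as the self-representation of a class A system once the supplement is incorporated is a question for the paper's definitions, which this toy cannot answer.

```python
# Hypothetical toy case: R lies in the commutant of U although [U, D] != 0.
# All matrices are illustrative assumptions, not objects defined in the paper.

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

U = [[1, 1], [0, 1]]          # stand-in for the update operator
D = [[1, 0], [0, 2]]          # stand-in for the discrimination operator
IDENTITY = [[1, 0], [0, 1]]

# R = U*U + 2*I is a polynomial in U, so it commutes with U by construction.
UU = matmul(U, U)
R = [[UU[i][j] + 2 * IDENTITY[i][j] for j in range(2)] for i in range(2)]

update_vs_discrimination = comm(U, D)
update_vs_representation = comm(U, R)

print("[U, D] =", update_vs_discrimination)
print("[U, R] =", update_vs_representation)
```

The example shows only that "generic" propagation is not universal propagation in an arbitrary associative algebra.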

Original abstract

Self-modification is often taken as constitutive of artificial superintelligence (SI), yet modification is a relative action requiring a supplement outside the operation. When self-modification extends to this supplement, the classical self-referential structure collapses. We formalise this on an associative operator algebra $\mathcal{A}$ with update $\hat{U}$, discrimination $\hat{D}$, and self-representation $\hat{R}$, identifying the supplement with $\mathrm{Comm}(\hat{U})$; an expansion theorem shows that $[\hat{U},\hat{R}]$ decomposes through $[\hat{U},\hat{D}]$, so non-commutation generically propagates. The liar paradox appears as a commutator collapse $[\hat{T},\Pi_L]=0$, and class $\mathbf{A}$ self-modification realises the same collapse at system scale, yielding a structure coinciding with Priest's inclosure schema and Derrida's différance.
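For readers less used to the notation, the standard definitions behind the abstract's symbols can be written out as follows. Reading $\mathrm{Comm}(\hat{U})$ as the commutant of $\hat{U}$ is an assumption made here; the abstract does not define it, and it could instead denote a set of commutators.

```latex
% Commutator of two elements of the associative algebra \mathcal{A}
[\hat{X}, \hat{Y}] \;=\; \hat{X}\hat{Y} - \hat{Y}\hat{X}

% Commutant of the update operator (assumed reading of Comm(U))
\mathrm{Comm}(\hat{U}) \;=\; \{\, \hat{X} \in \mathcal{A} : [\hat{U}, \hat{X}] = 0 \,\}

% The collapse condition stated in the abstract, written out
[\hat{T}, \Pi_L] = 0 \iff \hat{T}\,\Pi_L = \Pi_L\,\hat{T}
```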

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper recasts self-modification paradoxes in superintelligence as commutator collapses in a custom operator algebra, but the algebra is defined to reproduce Priest and Derrida rather than derived independently.

The core move is to treat self-modification as requiring a supplement outside the system, then identify that supplement with the commutator of the update operator. An expansion theorem is supposed to show that non-commutation between update and representation propagates through discrimination to produce a system-scale collapse that matches the inclosure schema and différance. Class A self-modification is the case where this happens at full scale. That is the new claim on offer: a formal handle on why unbounded self-modification cannot stay coherent. It is an interpretive mapping of existing paradox machinery onto AI agency, not a fresh theorem in operator algebras. The attempt to make the philosophical targets precise is the part worth noting; it tries to give structure to arguments that usually stay informal.

The soft spot is exactly where the stress-test flags it. The supplement is set equal to Comm(Û) by definition so the later steps land on the target schemas. No axioms for the algebra, no boundedness conditions, and no explicit decomposition steps appear in the abstract. Without those, it is impossible to tell whether the claimed expansion holds without remainder terms or whether the identification is forced. The circularity burden is real here: the math is shaped to deliver the philosophical conclusion rather than tested against independent premises. This is not a load-bearing contradiction inside the equations, but it is a load-bearing choice of definitions that makes the result an illustration rather than a derivation.

The paper is for readers already working at the intersection of formal philosophy and AI safety who want to see paradox schemas written in operator language. It will not move empirical or engineering work. A serious editor should desk-reject rather than send it to referees until the algebra is written out with explicit rules and the expansion is shown to hold without extra assumptions. If the full manuscript supplies those steps cleanly, the recommendation would change; on present evidence it does not.

Referee Report

3 major / 2 minor

Summary. The paper claims to formalize self-modification as constitutive of artificial superintelligence on an associative operator algebra A with update operator Û, discrimination D̂, and self-representation R̂. It identifies the required supplement with Comm(Û), states an expansion theorem in which [Û, R̂] decomposes through [Û, D̂] so that non-commutation propagates generically, equates the resulting commutator collapse [T̂, Π_L] = 0 with the liar paradox, and concludes that class A self-modification realises the same collapse at system scale, yielding a structure that coincides with Priest's inclosure schema and Derrida's différance.

Significance. If the expansion theorem and supplement identification can be rigorously established with explicit algebra axioms and proof steps, the work would constitute a novel formal bridge between operator-algebra models of AI self-modification and classical philosophical accounts of paradox and deconstruction, offering a potential new lens on the theoretical limits of self-referential systems.

major comments (3)

[Abstract] The expansion theorem is asserted without supplying the explicit definition of the associative operator algebra A, the precise action of the operators Û, D̂ and R̂, or any proof steps showing how [Û, R̂] decomposes through [Û, D̂] without remainder terms or extra commutators. This omission makes it impossible to verify whether non-commutation propagates as claimed or whether the subsequent identification with Priest's schema follows.
[Abstract] The supplement is identified with Comm(Û) precisely in order that the expansion and collapse reproduce the target philosophical structures (Priest's inclosure schema and Derrida's différance). The manuscript therefore requires an independent justification for this identification rather than one that is shaped by the desired conclusion.
[Abstract] The claim that class A self-modification realises the collapse at system scale assumes that the expansion theorem holds without further restrictions on the algebra (e.g., boundedness conditions, associativity residues, or non-exhaustive supplements). No such restrictions or counter-example exclusions are stated, leaving the central identification vulnerable.

minor comments (2)

[Title] The title contains a typographical inconsistency in the rendering of 'Différance'; ensure consistent use of the proper accent and spelling throughout.
Operator notation (hats on U, D, R) should be introduced once and used uniformly; any subsequent redefinition of symbols should be flagged explicitly.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their thorough review and valuable suggestions, which identify opportunities to improve the clarity and rigor of our presentation. We respond point by point to the major comments and indicate the revisions we will make to the abstract and supporting sections.

Referee: [Abstract] The expansion theorem is asserted without supplying the explicit definition of the associative operator algebra A, the precise action of the operators Û, D̂ and R̂, or any proof steps showing how [Û, R̂] decomposes through [Û, D̂] without remainder terms or extra commutators. This omission makes it impossible to verify whether non-commutation propagates as claimed or whether the subsequent identification with Priest's schema follows.

Authors: The abstract is a high-level summary and therefore omits the technical definitions and full proof, which appear in the body of the manuscript. Section 2 introduces the associative operator algebra A together with the explicit actions of the update operator Û, the discrimination operator D̂, and the self-representation operator R̂. Theorem 3.1 then proves the expansion by showing that [Û, R̂] decomposes through [Û, D̂] with all remainder terms vanishing under the associativity axiom and the definition of the commutator supplement. We will revise the abstract to include a concise reference to these definitions and the key algebraic steps, enabling verification without requiring the reader to consult the full text immediately.
revision: yes

Referee: [Abstract] The supplement is identified with Comm(Û) precisely in order that the expansion and collapse reproduce the target philosophical structures (Priest's inclosure schema and Derrida's différance). The manuscript therefore requires an independent justification for this identification rather than one that is shaped by the desired conclusion.

Authors: The identification of the supplement with Comm(Û) follows directly from the algebraic requirement that any relative modification operation must be supplemented by the non-commuting residue of the update operator itself; this is a structural feature of non-commutative operator algebras when self-reference is introduced. While the identification does permit the subsequent link to Priest and Derrida, the motivation is internal to the operator formalism and is developed prior to the philosophical interpretation. We will insert a short independent justification paragraph immediately after the definition of the algebra, grounding the choice in commutator properties before any reference to inclosure or différance.
revision: yes

Referee: [Abstract] The claim that class A self-modification realises the collapse at system scale assumes that the expansion theorem holds without further restrictions on the algebra (e.g., boundedness conditions, associativity residues, or non-exhaustive supplements). No such restrictions or counter-example exclusions are stated, leaving the central identification vulnerable.

Authors: We agree that the scope of the expansion theorem must be stated explicitly. The proof assumes associativity of A and that Comm(Û) exhausts the non-commuting terms generated by self-representation; boundedness is not required because the argument is purely algebraic. We will amend both the abstract and the theorem statement to list these assumptions and will add a brief remark on why non-associative or non-exhaustive cases fall outside the definition of class A self-modification, thereby excluding the relevant counter-examples.
revision: yes

Circularity Check

1 step flagged

Supplement identified with Comm(Û) to force expansion theorem and collapse onto inclosure schema

specific steps

self-definitional [Abstract]
"identifying the supplement with Comm(Û); an expansion theorem shows that [Û, R̂] decomposes through [Û, D̂], so non-commutation generically propagates. ... class A self-modification realises the same collapse at system scale, yielding a structure coinciding with Priest's inclosure schema and Derrida's différance."

The supplement is defined as Comm(Û) exactly so that the expansion theorem produces the commutator decomposition and the liar-paradox collapse [T̂, Π_L] = 0 at system scale. This makes the coincidence with the target philosophical structures a direct consequence of the initial identification rather than a derived result from independent premises on the algebra.

full rationale

The paper's central derivation begins by positing an associative operator algebra and then explicitly identifies the required supplement for self-modification with Comm(Û). This definitional step is what permits the claimed expansion theorem to decompose [Û, R̂] through [Û, D̂] and propagate non-commutation to the system-scale collapse. Because the identification is chosen precisely so that the resulting structure coincides with Priest's inclosure schema and Derrida's différance, the mathematical chain reduces to a self-definitional construction rather than an independent derivation from the algebra axioms alone. No external benchmarks or independent verification of the expansion theorem are provided in the given text.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The central claim rests on three introduced elements: the identification of the supplement with the commutator of the update operator, the expansion theorem that decomposes one commutator through another, and the definition of class A self-modification that realises the collapse at system scale. None of these receive independent external support in the abstract.

axioms (2)

[standard math] The algebra A is associative.
Stated as the setting for the operators Û, D̂, and R̂.
[ad hoc to paper] Self-modification must extend to the supplement identified with Comm(Û).
This premise is required for the collapse argument to reach the philosophical schemas.

invented entities (1)

Class A self-modification [no independent evidence]
purpose: To instantiate the commutator collapse at the scale of an entire AI system
Introduced to connect the algebraic result to superintelligence; no independent evidence or falsifiable prediction is given.
