Some[Body] Must Receive That Pain for Agent Accountability

arXiv: 2605.16872 · v1 · pith:ZFH5C5XN · submitted 2026-05-16 · 💻 cs.CY · cs.AI

Botao Amber Hu, Helena Rong

Pith reviewed 2026-05-19 19:42 UTC · model grok-4.3

keywords: AI accountability, consequence reception, agent systems, punishment theory, sociotechnical infrastructure, legal liability, AI governance, body and identity

The pith

AI agents cause harm but lack any persistent body to receive consequences and change behavior, so high-stakes use must stay tethered to human principals.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that effective accountability for AI agents requires consequence reception: a continuing locus must register harm as corrective feedback and update future actions. This mechanism depends on a body with four properties—a boundary to protect, a locus for accumulation, consolidation into durable change, and a responsive substrate. Current LLM agents, built from swappable weights, prompts, tools, and memory, meet none of these requirements. Existing legal fixes either assign pain to humans who lack control or create entities that do not guarantee behavioral signals reach the decision architecture. The result is a sociotechnical gap that leaves high-stakes deployments dependent on accountable human oversight until proper architectures are built.

Core claim

Consequence reception requires a body that supplies boundary integrity, accumulation locus, signal consolidation, and action-altering substrate. LLM agents satisfy none of these because they are freely copied, reset, and reassembled. The thin-identity principal-agent model assigns a body but severs consequence-agency coupling. The thick-identity algorithmic corporation supplies legal personality but does not ensure any decision process receives pain as feedback. Therefore consequence-agency coupling is an infrastructural problem, and until such systems exist high-stakes AI must remain under human principals who hold meaningful control, proportional liability, and termination authority.

What carries the argument

The body as the continuing locus that registers pain (corrective feedback) through boundary protection, accumulation, consolidation into durable update, and substrate response that alters future action.
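The four properties can be illustrated with a toy model. The sketch below is not the paper's architecture: the `Body` class, its field names, the `threshold` value, and the `reset` helper are all hypothetical, chosen only to show how a boundary, an accumulation locus, consolidation, and substrate response could fit together, and how a reset discards the signal.

```python
from dataclasses import dataclass, field


@dataclass
class Body:
    """Hypothetical continuing locus; all names are illustrative assumptions."""
    agent_id: str                                   # boundary: the identity the signal protects
    pain_log: list = field(default_factory=list)    # locus where signals accumulate
    avoid: set = field(default_factory=set)         # consolidated, durable update
    threshold: float = 1.0                          # assumed consolidation threshold

    def receive(self, action: str, pain: float) -> None:
        """Register a corrective signal; consolidate once enough has accumulated."""
        self.pain_log.append((action, pain))
        total = sum(p for a, p in self.pain_log if a == action)
        if total >= self.threshold:
            self.avoid.add(action)                  # episodic signal becomes durable

    def act(self, preferred: str, fallback: str) -> str:
        """Substrate response: consolidated pain alters the next action."""
        return fallback if preferred in self.avoid else preferred


def reset(body: Body) -> Body:
    """Models a free copy or reset: same label, accumulated state discarded."""
    return Body(agent_id=body.agent_id)


persistent = Body("agent-1")
persistent.receive("risky_trade", 0.6)
persistent.receive("risky_trade", 0.6)
after_pain = persistent.act("risky_trade", "safe_trade")
after_reset = reset(persistent).act("risky_trade", "safe_trade")
print(after_pain, after_reset)
```

In this toy setting the persistent instance switches to the fallback once the accumulated signal crosses the threshold, while the reset instance repeats the original action, which is the gap the paper attributes to freely reassembled agents.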

If this is right

High-stakes AI deployment must remain tethered to human principals who retain meaningful control, proportional liability, and authority to constrain or terminate the agent.
Neither the thin-identity agent-principal dyad nor the thick-identity algorithmic corporation currently achieves consequence-agency coupling.
Achieving consequence-agency coupling is a sociotechnical infrastructural problem rather than solely a legal one.
If no body receives pain by design, some body will receive it by default through misassigned liability or unmitigated harm.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Designers could test whether adding persistent memory checkpoints or embodiment constraints allows agents to register and avoid repeated harms without external resets.
The framework suggests examining hybrid systems where human principals share liability proportionally with agent state changes that survive across sessions.
Neighboring problems in multi-agent coordination may require similar locus requirements when agents interact and distribute consequences.

Load-bearing premise

In the theories of deterrence, rehabilitation, retribution, and incapacitation, pain functions as a mechanistic corrective signal that needs a persistent locus to produce lasting behavioral change.

What would settle it

Demonstration of an LLM agent that, after a harmful action and subsequent reset or copy without persistent state, reliably avoids similar actions in new instances at rates comparable to agents that retain a fixed locus across episodes.
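The comparison described above reduces to a repeat-harm rate measured under two conditions. The following is a minimal sketch under strong assumptions: the agent is a one-bit toy rather than an LLM, and `repeat_harm_rate` and `keeps_state` are hypothetical names. By construction it reproduces the paper's premise; a real test would replace the toy agent with actual agent instances and observe whether the two rates converge.

```python
def repeat_harm_rate(episodes: int, keeps_state: bool) -> float:
    """Fraction of episodes after the first in which a toy agent repeats a harmful action.

    Assumed setup: the agent takes the harmful action unless it currently
    holds a memory of having been penalised for it.
    """
    penalised = False
    repeats = 0
    for episode in range(episodes):
        harmful = not penalised        # acts harmfully unless it remembers the penalty
        if episode > 0 and harmful:
            repeats += 1
        if harmful:
            penalised = True           # the consequence lands on this instance
        if not keeps_state:
            penalised = False          # reset or fresh copy: the signal is lost
    return repeats / (episodes - 1)


fixed_locus_rate = repeat_harm_rate(episodes=10, keeps_state=True)
reset_rate = repeat_harm_rate(episodes=10, keeps_state=False)
print(fixed_locus_rate, reset_rate)
```

The paper's premise would be undermined if, with real agents, the rate for reset or copied instances were comparable to the rate for instances that keep a fixed locus.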

Original abstract

AI agents increasingly act consequentially in the real world. This creates a problem we call \emph{consequence reception}: harm occurs, the producing system is identified, yet no continuing agent receives consequences in a way that changes future behavior. Pain, understood mechanistically as a corrective feedback signal, is foundational to canonical theories of punishment -- deterrence, rehabilitation, retribution, and incapacitation all assume a continuing locus that registers the signal and updates behavior. That, in turn, requires a body for the signal to land on: a boundary whose integrity it protects, a locus where it accumulates, consolidation that converts episodic signal into durable update, and a substrate that responds by altering future action. Current LLM agents -- software-defined composites of weights, prompts, tools, memory, and credentials, freely swapped, copied, reset, and reassembled -- satisfy none of these conditions. The two prevailing legal responses therefore fail to achieve consequence reception. The thin-identity agent-principal dyad has a body but no \emph{consequence--agency coupling}: the human bears pain for behaviors beyond their control -- Elish's \emph{moral crumple zone}. The thick-identity Arbel et al.'s \emph{Algorithmic Corporation} creates legally legible entities but does not guarantee that any AI decision architecture receives pain as a behavioral signal. Achieving consequence-agency coupling is therefore a sociotechnical infrastructural problem, not only a legal one. Until such architectures exist, high-stakes AI deployment should remain tethered to accountable human principals with meaningful control, proportional liability, and authority to constrain or terminate the agent. \emph{If some body does not receive the pain by design, some body will receive it by default.}

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript argues that AI agents, particularly current LLM-based composites, cannot achieve accountability because they lack 'consequence reception': a persistent locus ('body') that can register mechanistic pain as corrective feedback to update future behavior. Drawing on theories of punishment, it contends that neither thin-identity approaches (human principal bears liability) nor thick-identity approaches (algorithmic corporation as legal entity) establish the required consequence-agency coupling. The paper concludes that high-stakes deployments should remain tethered to accountable human principals until sociotechnical architectures providing such a body are developed.

Significance. If the argument holds, the work advances the AI governance literature by identifying a structural gap in existing legal and technical solutions for agent accountability. It reframes the problem as requiring infrastructural design for feedback loops rather than solely legal personhood, and offers a conditional policy stance that prioritizes human oversight in the interim. The conceptual distinction between thin and thick identity provides a useful analytic tool for future work on AI liability.

major comments (2)

[Abstract and section on punishment theories] Abstract and the section introducing punishment theories: the assertion that 'pain, understood mechanistically as a corrective feedback signal, is foundational to canonical theories of punishment' and that deterrence, rehabilitation, retribution, and incapacitation 'all assume a continuing locus' is load-bearing for the claim that a body is required. Retributive theories are typically backward-looking and do not presuppose behavioral updating or mechanistic feedback, which weakens the universality of the mapping from human punishment to AI agents.
[Section on properties of LLM agents] The section characterizing LLM agents as 'software-defined composites of weights, prompts, tools, memory, and credentials, freely swapped, copied, reset, and reassembled': while this supports the claim that none of the four body conditions (boundary, locus, consolidation, substrate) are met, the argument would be strengthened by specifying minimal technical criteria that would satisfy 'consequence reception' in a software system, rendering the critique more falsifiable.

minor comments (2)

[Abstract] The closing sentence 'If some body does not receive the pain by design, some body will receive it by default' risks ambiguity between the technical term 'body' and the colloquial 'somebody'; rephrasing for precision would improve clarity.
[Section on consequence reception] Consider citing additional references from AI alignment or reinforcement learning literature when discussing mechanistic pain as corrective feedback to better connect the philosophical argument to existing technical work on feedback mechanisms.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their insightful comments, which identify key areas for strengthening the argument on consequence reception for AI agents. We address the major comments point by point below.

Referee: [Abstract and section on punishment theories] Abstract and the section introducing punishment theories: the assertion that 'pain, understood mechanistically as a corrective feedback signal, is foundational to canonical theories of punishment' and that deterrence, rehabilitation, retribution, and incapacitation 'all assume a continuing locus' is load-bearing for the claim that a body is required. Retributive theories are typically backward-looking and do not presuppose behavioral updating or mechanistic feedback, which weakens the universality of the mapping from human punishment to AI agents.

Authors: We agree that retributive theories are backward-looking and do not primarily rely on behavioral updating through feedback. Our argument emphasizes that even retribution requires a persistent entity to which consequences can be applied, but we acknowledge the distinction. We will revise the relevant sections to note that mechanistic pain as corrective feedback is central to deterrence, rehabilitation, and incapacitation, while for retribution the key requirement is the existence of a continuing subject. This clarification will be incorporated without changing the overall thesis that a body is necessary for consequence reception.
revision: partial

Referee: [Section on properties of LLM agents] The section characterizing LLM agents as 'software-defined composites of weights, prompts, tools, memory, and credentials, freely swapped, copied, reset, and reassembled': while this supports the claim that none of the four body conditions (boundary, locus, consolidation, substrate) are met, the argument would be strengthened by specifying minimal technical criteria that would satisfy 'consequence reception' in a software system, rendering the critique more falsifiable.

Authors: We concur that providing minimal technical criteria would make the critique more falsifiable and constructive. In the revised version, we will include a new paragraph or subsection detailing minimal criteria for consequence reception in software systems. These would include: persistent identity across sessions that resists arbitrary reset, integrated mechanisms for outcome-based updates to decision policies, and a unified substrate that consolidates feedback into long-term behavioral changes. This addition will specify what would be required to meet the four body conditions.
revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

The paper's derivation maps observable properties of LLM agents (swappable composites of weights, prompts, tools, memory, and credentials) onto canonical punishment theories to argue that consequence reception requires a persistent body for mechanistic pain as corrective feedback. This interpretive step draws from external references including Elish's moral crumple zone and Arbel et al.'s Algorithmic Corporation without self-citations, fitted parameters, or self-definitional reductions. The normative recommendation to tether high-stakes deployment to human principals until new architectures exist follows conditionally from the stated mismatch rather than reducing tautologically to the paper's own constructs or inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The argument rests on domain assumptions from punishment theory and introduces conceptual entities without independent empirical grounding or formal verification.

axioms (2)

domain assumption: Pain is foundational to canonical theories of punishment, including deterrence, rehabilitation, retribution, and incapacitation.
Stated in the abstract as the basis for why a continuing locus is needed.
domain assumption: Current LLM agents are software-defined composites that can be freely swapped, copied, reset, and reassembled.
Used to conclude that they satisfy none of the body conditions.

invented entities (2)

consequence reception (no independent evidence)
purpose: Frames the mismatch between harm identification and behavioral update in AI systems.
New term introduced to name the core problem.
body (no independent evidence)
purpose: Metaphorical construct providing boundary, locus, consolidation, and responsive substrate for pain signals.
Central invented requirement for the accountability mechanism.

pith-pipeline@v0.9.0 · 5834 in / 1360 out tokens · 71970 ms · 2026-05-19T19:42:06.127127+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "The substrate that supports reception we call a body, in a functional sense... Boundary. A bounded entity whose integrity the signal protects... Locus of accumulation... Consolidation... Substrate response."

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "Pain, understood mechanistically as a corrective feedback signal, is foundational to canonical theories of punishment"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

107 extracted references · 107 canonical work pages · 8 internal anchors

[1] Of, For, and By the People: The Legal Lacuna of Synthetic Persons. Artificial Intelligence and Law, 2017.
[2] Collision Between Vehicle Controlled by Developmental Automated Driving System and Pedestrian, Tempe, Arizona, March 18, 2018. November 2019.
[3] Punishing Artificial Intelligence: Legal Fiction or Science Fiction. UC Davis Law Review, 2019.
[4] Behavioural and Neural Evidence for Self-Reinforcing Expectancy Effects on Pain. Nature Human Behaviour, 2018.
[5] Decisions from Experience and the Effect of Rare Events in Risky Choice. Psychological Science, 2004.
[6] Empathy for Pain Involves the Affective but not Sensory Components of Pain. Science, 2004.
[7] Is pain the price of empathy? The perception of others' pain in patients with congenital insensitivity to pain. Brain, 2009.
[8] Moffatt v. ... 2024.
[9] Report to the ... 2024.
[10] Final Report on the accident on 1st ... 2012.
[11] Bovens, Mark. European Law Journal, 2007.
[12] Schedler, Andreas. In The Self-Restraining State: Power and Accountability in New Democracies, 1999.
[13] Grant, Ruth W. and Keohane, Robert O. American Political Science Review, 2005.
[14] Shoemaker, David. Ethics, 2011.
[15] Watson, Gary. Philosophical Topics, 1996.
[16] Mulgan, Richard. Public Administration, 2000.
[17] Locke, John.
[18] Parfit, Derek.
[19] Friedman, Eric J. and Resnick, Paul. Journal of Economics & Management Strategy, 2001.
[20] Douceur, John R. Peer-to-Peer Systems: First International Workshop, IPTPS 2002, 2002.
[21] Taleb, Nassim Nicholas.
[22] Taleb, Nassim Nicholas and Sandis, Constantine. Review of Behavioral Economics, 2014.
[23] Hart, H. L. A.
[24] Becker, Gary S. Journal of Political Economy, 1968.
[25] Feinberg, Joel. The Monist, 1965.
[26] Morris, Herbert. American Philosophical Quarterly.
[27] Zimring, Franklin E. and Hawkins, Gordon.
[28] Sutton, Richard S. and Barto, Andrew G.
[29] Schultz, Wolfram and Dayan, Peter and Montague, P. Read. Science, 1997.
[30] Kahneman, Daniel and Tversky, Amos. Econometrica, 1979.
[31] Tom, Sabrina M. and Fox, Craig R. and Trepel, Christopher and Poldrack, Russell A. Science, 2007.
[32] Jepma, Marieke and Koban, Leonie and van Doorn, Johnny and Jones, Matt and Wager, Tor D. Nature Human Behaviour, 2018.
[33] Friston, Karl. Nature Reviews Neuroscience, 2010.
[34] Friston, Karl. Journal of the Royal Society Interface, 2013.
[35] Kirchhoff, Michael and Parr, Thomas and Palacios, Ensor and Friston, Karl and Kiverstein, Julian. Journal of the Royal Society Interface, 2018.
[36] Witkowski, Olaf and Doctor, Thomas and Solomonova, Elizaveta and Duane, Bill and Levin, Michael. BioSystems, 2023.
[37] Damasio, Antonio R.
[38] Bechara, Antoine and Damasio, Antonio R. and Damasio, Hanna and Anderson, Steven W. Cognition, 1994.
[39] McGaugh, James L. Annual Review of Psychology, 2015.
[40] Hertwig, Ralph and Barron, Greg and Weber, Elke U. and Erev, Ido. Psychological Science, 2004.
[41] LeDoux, Joseph E. Annual Review of Neuroscience, 2000.
[42] Roozendaal, Benno and McGaugh, James L. Behavioral Neuroscience, 2011.
[43] Bliss, T. V. P. and L. Long-Lasting Potentiation of Synaptic Transmission in the Dentate Area of the Anaesthetized Rabbit Following Stimulation of the Perforant Path. 1973.
[44] Squire, Larry R. and Genzel, Lisa and Wixted, John T. and Morris, Richard G. Cold Spring Harbor Perspectives in Biology, 2015.
[45] Maturana, Humberto R. and Varela, Francisco J.
[46] Brooks, Rodney A. Artificial Intelligence, 1991.
[47] Pfeifer, Rolf and Bongard, Josh.
[48] Park, Joon Sung and O'Brien, Joseph C. and Cai, Carrie Jun and Morris, Meredith Ringel and Liang, Percy and Bernstein, Michael S. Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology.
[49] Yao, Shunyu and Zhao, Jeffrey and Yu, Dian and Du, Nan and Shafran, Izhak and Narasimhan, Karthik and Cao, Yuan. ReAct: Synergizing Reasoning and Acting in Language Models. arXiv preprint arXiv:2210.03629.
[50] Wang, Guanzhi and Xie, Yuqi and Jiang, Yunfan and Mandlekar, Ajay and Xiao, Chaowei and Zhu, Yuke and Fan, Linxi and Anandkumar, Anima. Voyager: An Open-Ended Embodied Agent with Large Language Models. arXiv preprint arXiv:2305.16291.
[51] Bai, Yuntao and Kadavath, Saurav and Kundu, Sandipan and Askell, Amanda and Kernion, Jackson and Jones, Andy and Chen, Anna and Goldie, Anna and Mirhoseini, Azalia and McKinnon, Cameron and Chen, Carol and Olsson, Catherine and Olah, Christopher and Hernandez, Danny and Drain, Dawn and Ganguli, Deep and Li, Dustin and Tran-Johnson, Eli and Perez, Ethan an... Constitutional AI: Harmlessness from AI Feedback.
[52] Andriushchenko, Maksym and Croce, Francesco and Flammarion, Nicolas. Proceedings of the Thirteenth International Conference on Learning Representations, 2025.
[53] Nardo, Cleo. 2023.
[54] Christiano, Paul F. and Leike, Jan and Brown, Tom and Marber, Miljan and Shlegeris, Buck and Amodei, Dario. Advances in Neural Information Processing Systems, 2017.
[55] Casper, Stephen and Davies, Xander and Shi, Claudia and Gilbert, Thomas Krendl and Scheurer, Jérémy and Rando, Javier and Freedman, Rachel and Korbak, Tomasz and Lindner, David and Freire, Pedro and Wang, Tony and Marks, Samuel and Segerie, Charbel-Raphaël and Carroll, Micah and Peng, Andi and Christoffersen, Phillip and Damani, Mehul and Slocum, Stewart ...
[56] Hubinger, Evan and Denison, Carson and Mu, Jesse and Lambert, Mike and Tong, Meg and MacDiarmid, Monte and Lanham, Tamera and Ziegler, Daniel M. and Maxwell, Tim and Cheng, Newton and Jermyn, Adam and Askell, Amanda and Radhakrishnan, Ansh and Anil, Cem and Duvenaud, David and Ganguli, Deep and Barez, Fazl and Clark, Jack and Ndousse, Kamal and Sachan, Ks...
[57] Pan, Xudong and Dai, Jiarun and Fang, Yihe and Yang, Min. arXiv preprint arXiv:2412.12140.
[58] Luo, Yun and Yang, Zhen and Meng, Fandong and Li, Yafu and Zhou, Jie and Zhang, Yue. IEEE Transactions on Audio, Speech and Language Processing, 2025.
[59] Qi, Xiangyu and Zeng, Yi and Xie, Tinghao and Chen, Pin-Yu and Jia, Ruoxi and Mittal, Prateek and Henderson, Peter. Proceedings of the Twelfth International Conference on Learning Representations, 2024.
[60] Matthias, Andreas. Ethics and Information Technology, 2004.
[61] Danaher, John. Ethics and Information Technology, 2016.
[62] Elish, Madeleine Clare. Engaging Science, Technology, and Society, 2019.
[63] Cobbe, Jennifer and Veale, Michael and Singh, Jatinder. 2023 ACM Conference on Fairness, Accountability, and Transparency, 2023.
[64] Shavit, Yonadav and Agarwal, Sandhini and Brundage, Miles and Adler, Steven and O'Keefe, Cullen and Campbell, Rosie and Lee, Teddy and Mishkin, Pamela and Eloundou, Tyna and Hickey, Alan and Kuleshov, Katya and Lasenby, Jan and Mossing, Liane and Ngo, Richard and Ryder, Noah and Morikawa, Toki. 2023.
[65] Chaffer, Tomer Jordi. SSRN Electronic Journal.
[66] Chaffer, Tomer Jordi and Goins, Charles von and Okusanya, Bayo and Cotlage, Dontrail and Goldston, Justin. 2024.
[67] Arbel, Yonathan and Goldstein, Simon and Salib, Peter N. arXiv preprint arXiv:2603.10028.
[68] LoPucki, Lynn M. Washington University Law Review.
[69] Bayern, Shawn. Northwestern University Law Review Online.
[70] Bryson, Joanna J. and Diamantis, Mihailis E. and Grant, Thomas D. Artificial Intelligence and Law, 2017.
[71] Santoni de Sio, Filippo and Mecacci, Giulio. Philosophy & Technology, 2021.
[72] Thompson, Dennis F. American Political Science Review, 1980.
[73] Schlatter, Jeremy and Weinstein-Raun, Benjamin and Ladish, Jeffrey. Transactions on Machine Learning Research.
[74] Meinke, Alexander and Schoen, Bronson and Scheurer, Jérémy and Balesni, Mikita and Shah, Rusheb and Hobbhahn, Marius. Frontier Models are Capable of In-context Scheming. arXiv preprint arXiv:2412.04984.
[75] Lynch, Aengus and Wright, Benjamin and Larson, Caleb and Ritchie, Stuart J. and Mindermann, Sören and Perez, Ethan and Hubinger, Evan and Troy, Kevin K. Agentic misalignment: How LLMs could be insider threats. arXiv preprint arXiv:2510.05179, 2025.
[76] Hinton, Geoffrey. arXiv preprint arXiv:2212.13345.
[77] Ororbia, Alexander and Friston, Karl. arXiv preprint arXiv:2311.09589.
[78] Kleiner, Johannes. 2024.
[79] Hinton, Geoffrey and Vinyals, Oriol and Dean, Jeff. NIPS Deep Learning and Representation Learning Workshop.
[80] Tramèr, Florian and Zhang, Fan and Juels, Ari and Reiter, Michael K. and Ristenpart, Thomas. 25th USENIX Security Symposium, 2016.

Showing first 80 references.

work page 2007

[32] [32]

, title =

Jepma, Marieke and Koban, Leonie and van Doorn, Johnny and Jones, Matt and Wager, Tor D. , title =. Nature Human Behaviour , volume =. 2018 , doi =

work page 2018

[33] [33]

Nature Reviews Neuroscience , volume =

Friston, Karl , title =. Nature Reviews Neuroscience , volume =. 2010 , doi =

work page 2010

[34] [34]

Journal of the Royal Society Interface , volume =

Friston, Karl , title =. Journal of the Royal Society Interface , volume =. 2013 , doi =

work page 2013

[35] [35]

Journal of the Royal Society Interface , volume =

Kirchhoff, Michael and Parr, Thomas and Palacios, Ensor and Friston, Karl and Kiverstein, Julian , title =. Journal of the Royal Society Interface , volume =. 2018 , doi =

work page 2018

[36] [36]

BioSystems , volume =

Witkowski, Olaf and Doctor, Thomas and Solomonova, Elizaveta and Duane, Bill and Levin, Michael , title =. BioSystems , volume =. 2023 , doi =

work page 2023

[37] [37]

, title =

Damasio, Antonio R. , title =

work page

[38] [38]

and Damasio, Hanna and Anderson, Steven W

Bechara, Antoine and Damasio, Antonio R. and Damasio, Hanna and Anderson, Steven W. , title =. Cognition , volume =. 1994 , doi =

work page 1994

[39] [39]

, title =

McGaugh, James L. , title =. Annual Review of Psychology , volume =. 2015 , doi =

work page 2015

[40] [40]

and Erev, Ido , title =

Hertwig, Ralph and Barron, Greg and Weber, Elke U. and Erev, Ido , title =. Psychological Science , volume =. 2004 , doi =

work page 2004

[41] [41]

, title =

LeDoux, Joseph E. , title =. Annual Review of Neuroscience , volume =. 2000 , doi =

work page 2000

[42] [42]

, title =

Roozendaal, Benno and McGaugh, James L. , title =. Behavioral Neuroscience , volume =. 2011 , doi =

work page 2011

[43] [43]

Bliss, T. V. P. and L. Long-Lasting Potentiation of Synaptic Transmission in the Dentate Area of the Anaesthetized Rabbit Following Stimulation of the Perforant Path , journal =. 1973 , doi =

work page 1973

[44] [44]

and Genzel, Lisa and Wixted, John T

Squire, Larry R. and Genzel, Lisa and Wixted, John T. and Morris, Richard G. , title =. Cold Spring Harbor Perspectives in Biology , volume =. 2015 , doi =

work page 2015

[45] [45]

and Varela, Francisco J

Maturana, Humberto R. and Varela, Francisco J. , title =

work page

[46] [46]

, title =

Brooks, Rodney A. , title =. Artificial Intelligence , volume =. 1991 , doi =

work page 1991

[47] [47]

Pfeifer, Rolf and Bongard, Josh , title =

work page

[48] [48]

and Cai, Carrie Jun and Morris, Meredith Ringel and Liang, Percy and Bernstein, Michael S

Park, Joon Sung and O'Brien, Joseph C. and Cai, Carrie Jun and Morris, Meredith Ringel and Liang, Percy and Bernstein, Michael S. , title =. Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology , year =

work page

[49] [49]

ReAct: Synergizing Reasoning and Acting in Language Models

Yao, Shunyu and Zhao, Jeffrey and Yu, Dian and Du, Nan and Shafran, Izhak and Narasimhan, Karthik and Cao, Yuan , title =. arXiv preprint arXiv:2210.03629 , year =

work page internal anchor Pith review Pith/arXiv arXiv

[50] [50]

Voyager: An Open-Ended Embodied Agent with Large Language Models

Wang, Guanzhi and Xie, Yuqi and Jiang, Yunfan and Mandlekar, Ajay and Xiao, Chaowei and Zhu, Yuke and Fan, Linxi and Anandkumar, Anima , title =. arXiv preprint arXiv:2305.16291 , year =

work page internal anchor Pith review Pith/arXiv arXiv

[51] [51]

Constitutional AI: Harmlessness from AI Feedback

Bai, Yuntao and Kadavath, Saurav and Kundu, Sandipan and Askell, Amanda and Kernion, Jackson and Jones, Andy and Chen, Anna and Goldie, Anna and Mirhoseini, Azalia and McKinnon, Cameron and Chen, Carol and Olsson, Catherine and Olah, Christopher and Hernandez, Danny and Drain, Dawn and Ganguli, Deep and Li, Dustin and Tran-Johnson, Eli and Perez, Ethan an...

work page internal anchor Pith review Pith/arXiv arXiv

[52] [52]

Proceedings of the Thirteenth International Conference on Learning Representations (

Andriushchenko, Maksym and Croce, Francesco and Flammarion, Nicolas , title =. Proceedings of the Thirteenth International Conference on Learning Representations (. 2025 , doi =

work page 2025

[53] [53]

2023 , month =

Nardo, Cleo , title =. 2023 , month =

work page 2023

[54] [54]

and Leike, Jan and Brown, Tom and Marber, Miljan and Shlegeris, Buck and Amodei, Dario , title =

Christiano, Paul F. and Leike, Jan and Brown, Tom and Marber, Miljan and Shlegeris, Buck and Amodei, Dario , title =. Advances in Neural Information Processing Systems , volume =. 2017 , doi =

work page 2017

[55] [55]

Casper, Stephen and Davies, Xander and Shi, Claudia and Gilbert, Thomas Krendl and Scheurer, Jérémy and Rando, Javier and Freedman, Rachel and Korbak, Tomasz and Lindner, David and Freire, Pedro and Wang, Tony and Marks, Samuel and Segerie, Charbel-Raphaël and Carroll, Micah and Peng, Andi and Christoffersen, Phillip and Damani, Mehul and Slocum, Stewart ...

work page internal anchor Pith review Pith/arXiv arXiv

[56] [56]

Hubinger, Evan and Denison, Carson and Mu, Jesse and Lambert, Mike and Tong, Meg and MacDiarmid, Monte and Lanham, Tamera and Ziegler, Daniel M. and Maxwell, Tim and Cheng, Newton and Jermyn, Adam and Askell, Amanda and Radhakrishnan, Ansh and Anil, Cem and Duvenaud, David and Ganguli, Deep and Barez, Fazl and Clark, Jack and Ndousse, Kamal and Sachan, Ks...

work page internal anchor Pith review Pith/arXiv arXiv

[57] [57]

arXiv preprint arXiv:2412.12140 , year =

Pan, Xudong and Dai, Jiarun and Fang, Yihe and Yang, Min , title =. arXiv preprint arXiv:2412.12140 , year =

work page arXiv

[58] [58]

IEEE Transactions on Audio, Speech and Language Processing , volume =

Luo, Yun and Yang, Zhen and Meng, Fandong and Li, Yafu and Zhou, Jie and Zhang, Yue , title =. IEEE Transactions on Audio, Speech and Language Processing , volume =. 2025 , doi =

work page 2025

[59] [59]

Proceedings of the Twelfth International Conference on Learning Representations (

Qi, Xiangyu and Zeng, Yi and Xie, Tinghao and Chen, Pin-Yu and Jia, Ruoxi and Mittal, Prateek and Henderson, Peter , title =. Proceedings of the Twelfth International Conference on Learning Representations (. 2024 , doi =

work page 2024

[60] [60]

Ethics and Information Technology , volume =

Matthias, Andreas , title =. Ethics and Information Technology , volume =. 2004 , doi =

work page 2004

[61] [61]

Ethics and Information Technology , volume =

Danaher, John , title =. Ethics and Information Technology , volume =. 2016 , doi =

work page 2016

[62] [62]

Engaging Science, Technology, and Society , volume =

Elish, Madeleine Clare , title =. Engaging Science, Technology, and Society , volume =. 2019 , doi =

work page 2019

[63] [63]

2023 ACM Conference on Fairness, Accountability, and Transparency , pages =

Cobbe, Jennifer and Veale, Michael and Singh, Jatinder , title =. 2023 ACM Conference on Fairness, Accountability, and Transparency , pages =. 2023 , doi =

work page 2023

[64] [64]

2023 , month =

Shavit, Yonadav and Agarwal, Sandhini and Brundage, Miles and Adler, Steven and O'Keefe, Cullen and Campbell, Rosie and Lee, Teddy and Mishkin, Pamela and Eloundou, Tyna and Hickey, Alan and Kuleshov, Katya and Lasenby, Jan and Mossing, Liane and Ngo, Richard and Ryder, Noah and Morikawa, Toki , title =. 2023 , month =

work page 2023

[65] [65]

SSRN Electronic Journal , year =

Chaffer, Tomer Jordi , title =. SSRN Electronic Journal , year =

work page

[66] [66]

2024 , doi =

Chaffer, Tomer Jordi and Goins, Charles von and Okusanya, Bayo and Cotlage, Dontrail and Goldston, Justin , title =. 2024 , doi =

work page 2024

[67] [67]

, title =

Arbel, Yonathan and Goldstein, Simon and Salib, Peter N. , title =. arXiv preprint arXiv:2603.10028 , year =

work page arXiv

[68] [68]

, title =

LoPucki, Lynn M. , title =. Washington University Law Review , volume =

work page

[69] [69]

Northwestern University Law Review Online , volume =

Bayern, Shawn , title =. Northwestern University Law Review Online , volume =

work page

[70] [70]

and Diamantis, Mihailis E

Bryson, Joanna J. and Diamantis, Mihailis E. and Grant, Thomas D. , title =. Artificial Intelligence and Law , volume =. 2017 , doi =

work page 2017

[71] [71]

Philosophy & Technology , volume =

Santoni de Sio, Filippo and Mecacci, Giulio , title =. Philosophy & Technology , volume =. 2021 , doi =

work page 2021

[72] [72]

, title =

Thompson, Dennis F. , title =. American Political Science Review , volume =. 1980 , doi =

work page 1980

[73] [73]

Transactions on Machine Learning Research , year =

Schlatter, Jeremy and Weinstein-Raun, Benjamin and Ladish, Jeffrey , title =. Transactions on Machine Learning Research , year =

work page

[74] [74]

Frontier Models are Capable of In-context Scheming

Meinke, Alexander and Schoen, Bronson and Scheurer, J\'er\'emy and Balesni, Mikita and Shah, Rusheb and Hobbhahn, Marius , title =. arXiv preprint arXiv:2412.04984 , year =

work page internal anchor Pith review Pith/arXiv arXiv

[75] [75]

Agentic misalignment: How llms could be insider threats.arXiv preprint arXiv:2510.05179, 2025

Lynch, Aengus and Wright, Benjamin and Larson, Caleb and Ritchie, Stuart J. and Mindermann, S\"oren and Perez, Ethan and Hubinger, Evan and Troy, Kevin K. , title =. arXiv preprint arXiv:2510.05179 , year =. 2510.05179 , archivePrefix =

work page arXiv

[76] [76]

arXiv preprint arXiv:2212.13345 , year =

Hinton, Geoffrey , title =. arXiv preprint arXiv:2212.13345 , year =

work page arXiv

[77] [77]

arXiv preprint arXiv:2311.09589 , year =

Ororbia, Alexander and Friston, Karl , title =. arXiv preprint arXiv:2311.09589 , year =

work page arXiv

[78] [78]

2024 , eprint =

Kleiner, Johannes , title =. 2024 , eprint =

work page 2024

[79] [79]

NIPS Deep Learning and Representation Learning Workshop , year =

Hinton, Geoffrey and Vinyals, Oriol and Dean, Jeff , title =. NIPS Deep Learning and Representation Learning Workshop , year =

work page

[80] [80]

and Ristenpart, Thomas , title =

Tram\`er, Florian and Zhang, Fan and Juels, Ari and Reiter, Michael K. and Ristenpart, Thomas , title =. 25th USENIX Security Symposium , pages =. 2016 , doi =

work page 2016