Some[Body] Must Receive That Pain for Agent Accountability
Pith reviewed 2026-05-19 19:42 UTC · model grok-4.3
pith:ZFH5C5XN
The pith
AI agents cause harm but lack any persistent body to receive consequences and change behavior, so high-stakes use must stay tethered to human principals.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Consequence reception requires a body that supplies boundary integrity, accumulation locus, signal consolidation, and action-altering substrate. LLM agents satisfy none of these because they are freely copied, reset, and reassembled. The thin-identity principal-agent model assigns a body but severs consequence-agency coupling. The thick-identity algorithmic corporation supplies legal personality but does not ensure any decision process receives pain as feedback. Therefore consequence-agency coupling is an infrastructural problem, and until such systems exist high-stakes AI must remain under human principals who hold meaningful control, proportional liability, and termination authority.
What carries the argument
The body as the continuing locus that registers pain (corrective feedback) through boundary protection, accumulation, consolidation into durable update, and substrate response that alters future action.
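The four conditions named above can be written down as an explicit checklist. The following is a minimal illustrative sketch, not anything specified by the paper: the class name `BodyProfile`, its field names, and the helper methods are hypothetical labels chosen here, and the all-`False` profile simply encodes the paper's claim that current LLM agents satisfy none of the conditions.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class BodyProfile:
    """Hypothetical checklist of the four functional 'body' conditions."""
    boundary_integrity: bool         # a boundary whose integrity the signal protects
    accumulation_locus: bool         # a locus where the signal accumulates
    signal_consolidation: bool       # episodic signal becomes durable update
    action_altering_substrate: bool  # substrate responds by altering future action

    def missing(self) -> list:
        # Names of the conditions that are not satisfied, in declaration order.
        return [name for name, ok in asdict(self).items() if not ok]

    def can_receive_consequences(self) -> bool:
        # Under this sketch, reception requires all four conditions together.
        return not self.missing()


# The paper's stated assessment of current LLM agents: none of the four hold.
llm_agent = BodyProfile(False, False, False, False)
print(llm_agent.can_receive_consequences())
print(llm_agent.missing())
```

Treating the conditions as jointly necessary is an assumption of this sketch; the summary above does not say whether partial satisfaction would count for anything.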
If this is right
- High-stakes AI deployment must remain tethered to human principals who retain meaningful control, proportional liability, and authority to constrain or terminate the agent.
- Neither the thin-identity agent-principal dyad nor the thick-identity algorithmic corporation currently achieves consequence-agency coupling.
- Achieving consequence-agency coupling is a sociotechnical infrastructural problem rather than solely a legal one.
- If no body receives pain by design, some body will receive it by default through misassigned liability or unmitigated harm.
Where Pith is reading between the lines
- Designers could test whether adding persistent memory checkpoints or embodiment constraints allows agents to register and avoid repeated harms without external resets.
- The framework suggests examining hybrid systems where human principals share liability proportionally with agent state changes that survive across sessions.
- Neighboring problems in multi-agent coordination may require similar locus requirements when agents interact and distribute consequences.
Load-bearing premise
Pain functions as a mechanistic corrective signal that needs a persistent locus to produce lasting behavioral change in the theories of deterrence, rehabilitation, retribution, and incapacitation.
What would settle it
Demonstration of an LLM agent that, after a harmful action and subsequent reset or copy without persistent state, reliably avoids similar actions in new instances at rates comparable to agents that retain a fixed locus across episodes.
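The shape of that comparison can be illustrated with a deliberately tiny, deterministic toy. This is a sketch under strong assumptions, not an experimental protocol from the paper: `ToyAgent`, `harm_rate`, and the "risky"/"safe" actions are hypothetical, and the agent is a lookup table rather than an LLM. It only shows what is being measured: the rate of repeated harm when state is wiped each episode versus when one locus persists.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAgent:
    # Actions this instance has been penalized for (its only persistent state).
    penalized: set = field(default_factory=set)

    def act(self) -> str:
        # Default to the harmful action unless a penalty for it was retained.
        return "safe" if "risky" in self.penalized else "risky"

    def receive_penalty(self, action: str) -> None:
        self.penalized.add(action)


def harm_rate(episodes: int, reset_each_episode: bool) -> float:
    """Fraction of episodes in which the harmful action is taken."""
    agent = ToyAgent()
    harms = 0
    for _ in range(episodes):
        if reset_each_episode:
            agent = ToyAgent()  # fresh copy: earlier penalties are gone
        action = agent.act()
        if action == "risky":
            harms += 1
            agent.receive_penalty(action)
    return harms / episodes


print(harm_rate(10, reset_each_episode=True))   # reset condition
print(harm_rate(10, reset_each_episode=False))  # fixed-locus condition
```

In this toy the reset condition repeats the harm in every episode, while the persistent condition repeats it only once. The settling evidence described above would be a real agent whose reset condition nonetheless matched the persistent one.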
Original abstract
AI agents increasingly act consequentially in the real world. This creates a problem we call \emph{consequence reception}: harm occurs, the producing system is identified, yet no continuing agent receives consequences in a way that changes future behavior. Pain, understood mechanistically as a corrective feedback signal, is foundational to canonical theories of punishment -- deterrence, rehabilitation, retribution, and incapacitation all assume a continuing locus that registers the signal and updates behavior. That, in turn, requires a body for the signal to land on: a boundary whose integrity it protects, a locus where it accumulates, consolidation that converts episodic signal into durable update, and a substrate that responds by altering future action. Current LLM agents -- software-defined composites of weights, prompts, tools, memory, and credentials, freely swapped, copied, reset, and reassembled -- satisfy none of these conditions. The two prevailing legal responses therefore fail to achieve consequence reception. The thin-identity agent-principal dyad has a body but no \emph{consequence--agency coupling}: the human bears pain for behaviors beyond their control -- Elish's \emph{moral crumple zone}. The thick-identity Arbel et al.'s \emph{Algorithmic Corporation} creates legally legible entities but does not guarantee that any AI decision architecture receives pain as a behavioral signal. Achieving consequence-agency coupling is therefore a sociotechnical infrastructural problem, not only a legal one. Until such architectures exist, high-stakes AI deployment should remain tethered to accountable human principals with meaningful control, proportional liability, and authority to constrain or terminate the agent. \emph{If some body does not receive the pain by design, some body will receive it by default.}
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript argues that AI agents, particularly current LLM-based composites, cannot achieve accountability because they lack 'consequence reception': a persistent locus ('body') that can register mechanistic pain as corrective feedback to update future behavior. Drawing on theories of punishment, it contends that neither thin-identity approaches (human principal bears liability) nor thick-identity approaches (algorithmic corporation as legal entity) establish the required consequence-agency coupling. The paper concludes that high-stakes deployments should remain tethered to accountable human principals until sociotechnical architectures providing such a body are developed.
Significance. If the argument holds, the work advances the AI governance literature by identifying a structural gap in existing legal and technical solutions for agent accountability. It reframes the problem as requiring infrastructural design for feedback loops rather than solely legal personhood, and offers a conditional policy stance that prioritizes human oversight in the interim. The conceptual distinction between thin and thick identity provides a useful analytic tool for future work on AI liability.
major comments (2)
- [Abstract and section on punishment theories] The assertion that 'pain, understood mechanistically as a corrective feedback signal, is foundational to canonical theories of punishment' and that deterrence, rehabilitation, retribution, and incapacitation 'all assume a continuing locus' is load-bearing for the claim that a body is required. Retributive theories, however, are typically backward-looking and do not presuppose behavioral updating or mechanistic feedback, which weakens the universality of the mapping from human punishment to AI agents.
- [Section on properties of LLM agents] The characterization of LLM agents as 'software-defined composites of weights, prompts, tools, memory, and credentials, freely swapped, copied, reset, and reassembled' supports the claim that none of the four body conditions (boundary, locus, consolidation, substrate) is met. The argument would be stronger if it specified minimal technical criteria that would satisfy 'consequence reception' in a software system, which would make the critique more falsifiable.
minor comments (2)
- [Abstract] The closing sentence 'If some body does not receive the pain by design, some body will receive it by default' risks ambiguity between the technical term 'body' and the colloquial 'somebody'; rephrasing for precision would improve clarity.
- [Section on consequence reception] Consider citing additional references from AI alignment or reinforcement learning literature when discussing mechanistic pain as corrective feedback to better connect the philosophical argument to existing technical work on feedback mechanisms.
Simulated Author's Rebuttal
We thank the referee for their insightful comments, which identify key areas for strengthening the argument on consequence reception for AI agents. We address the major comments point by point below.
Point-by-point responses
- Referee: [Abstract and section on punishment theories] The assertion that 'pain, understood mechanistically as a corrective feedback signal, is foundational to canonical theories of punishment' and that deterrence, rehabilitation, retribution, and incapacitation 'all assume a continuing locus' is load-bearing for the claim that a body is required. Retributive theories are typically backward-looking and do not presuppose behavioral updating or mechanistic feedback, which weakens the universality of the mapping from human punishment to AI agents.
  Authors: We agree that retributive theories are backward-looking and do not primarily rely on behavioral updating through feedback. Our argument emphasizes that even retribution requires a persistent entity to which consequences can be applied, but we acknowledge the distinction. We will revise the relevant sections to note that mechanistic pain as corrective feedback is central to deterrence, rehabilitation, and incapacitation, while for retribution the key requirement is the existence of a continuing subject. This clarification will be incorporated without changing the overall thesis that a body is necessary for consequence reception.
  Revision: partial
- Referee: [Section on properties of LLM agents] The section characterizing LLM agents as 'software-defined composites of weights, prompts, tools, memory, and credentials, freely swapped, copied, reset, and reassembled': while this supports the claim that none of the four body conditions (boundary, locus, consolidation, substrate) are met, the argument would be strengthened by specifying minimal technical criteria that would satisfy 'consequence reception' in a software system, rendering the critique more falsifiable.
  Authors: We concur that providing minimal technical criteria would make the critique more falsifiable and constructive. In the revised version, we will include a new paragraph or subsection detailing minimal criteria for consequence reception in software systems. These would include: persistent identity across sessions that resists arbitrary reset, integrated mechanisms for outcome-based updates to decision policies, and a unified substrate that consolidates feedback into long-term behavioral changes. This addition will specify what would be required to meet the four body conditions.
  Revision: yes
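The three criteria listed in that response can be made concrete with a small sketch. Everything here is an assumption for illustration: the class `ConsequenceReceiver`, its method names, the additive penalty rule, and the principal-authorization flag are hypothetical and are not proposed by the authors.

```python
from dataclasses import dataclass, field


@dataclass
class ConsequenceReceiver:
    agent_id: str                                 # persistent identity across sessions
    episodic: list = field(default_factory=list)  # signals not yet consolidated
    durable: dict = field(default_factory=dict)   # long-term penalty per action

    def register(self, action: str, pain: float) -> None:
        # Outcome-based feedback lands on this locus and accumulates.
        self.episodic.append((action, pain))

    def consolidate(self) -> None:
        # Convert episodic signal into a durable update (assumed: simple sum).
        for action, pain in self.episodic:
            self.durable[action] = self.durable.get(action, 0.0) + pain
        self.episodic.clear()

    def choose(self, actions: list) -> str:
        # The decision policy reads the durable state: least-penalized action wins.
        return min(actions, key=lambda a: self.durable.get(a, 0.0))

    def reset(self, authorized_by_principal: bool = False) -> None:
        # Resist arbitrary reset: only the accountable principal may wipe state.
        if not authorized_by_principal:
            raise PermissionError("reset requires the accountable principal")
        self.episodic.clear()
        self.durable.clear()


agent = ConsequenceReceiver("agent-001")
options = ["fast", "careful"]
before = agent.choose(options)   # no penalties yet, so the first option is picked
agent.register("fast", 1.0)
agent.consolidate()
after = agent.choose(options)    # the durable penalty now steers the choice
print(before, after)
```

A software guard like `reset` is only as strong as the infrastructure around it, since anyone who can copy the object can bypass the check. That limitation is one reading of why the paper frames the problem as sociotechnical rather than purely technical.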
Circularity Check
No significant circularity identified
Full rationale
The paper's derivation maps observable properties of LLM agents (swappable composites of weights, prompts, tools, memory, and credentials) onto canonical punishment theories to argue that consequence reception requires a persistent body for mechanistic pain as corrective feedback. This interpretive step draws from external references including Elish's moral crumple zone and Arbel et al.'s Algorithmic Corporation without self-citations, fitted parameters, or self-definitional reductions. The normative recommendation to tether high-stakes deployment to human principals until new architectures exist follows conditionally from the stated mismatch rather than reducing tautologically to the paper's own constructs or inputs.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: Pain is foundational to canonical theories of punishment including deterrence, rehabilitation, retribution, and incapacitation.
- Domain assumption: Current LLM agents are software-defined composites that can be freely swapped, copied, reset, and reassembled.
invented entities (2)
- consequence reception: no independent evidence
- body: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (tag: unclear)
  Paper passage: "The substrate that supports reception we call a body, in a functional sense... Boundary. A bounded entity whose integrity the signal protects... Locus of accumulation... Consolidation... Substrate response."
- IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction (tag: unclear)
  Paper passage: "Pain, understood mechanistically as a corrective feedback signal, is foundational to canonical theories of punishment"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.