Recognition: no theorem link
Breaking the Illusion of Identity in LLM Tooling
Pith reviewed 2026-05-10 17:57 UTC · model grok-4.3
The pith
Seven output-side rules cut anthropomorphic markers in LLM responses by more than 97 percent without any model changes.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that encoding seven output-side rules into a system prompt systematically suppresses linguistic markers of anthropomorphism in LLM tool outputs. Each rule targets a documented mechanism such as first-person agency statements or expressions of certainty. Across 780 two-turn conversations comparing the constrained and default registers (1560 API calls), marker counts fell from 1233 to 33, word count dropped 49 percent, and the adapted AnthroScore shifted from -0.96 to -1.94, all at p < 0.001. The method uses only prompt configuration and is described as extensible to other domains.
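The comparison described above lends itself to a simple paired analysis. The sketch below is not the paper's pipeline: the marker patterns, the helper names, and the choice of a Wilcoxon signed-rank test are illustrative assumptions, since neither the paper's marker lexicon nor its adapted AnthroScore is reproduced here.

```python
# Minimal sketch of a marker-count comparison between registers.
# The MARKERS lexicon is illustrative only, not the paper's instrument.
import re
from scipy import stats

MARKERS = re.compile(
    r"\b(I think|I believe|I feel|I understand|I'm happy to|in my opinion)\b",
    re.IGNORECASE,
)

def marker_count(text: str) -> int:
    """Count first-person agency and similar anthropomorphic markers in one output."""
    return len(MARKERS.findall(text))

def compare_registers(default_outputs: list[str], constrained_outputs: list[str]):
    """Paired comparison of marker counts for outputs matched by task and replicate."""
    d = [marker_count(t) for t in default_outputs]
    c = [marker_count(t) for t in constrained_outputs]
    stat, p = stats.wilcoxon(d, c)           # paired, non-parametric
    reduction = 1 - sum(c) / max(sum(d), 1)  # e.g. 1233 -> 33 would give ~0.973
    return reduction, p
```

A paired test fits because the two registers are run on the same 30 tasks and replicates; an unpaired test would discard that matching.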
What carries the argument
A set of seven output-side rules, each targeting one linguistic mechanism of anthropomorphism, delivered as a configuration-file system prompt that enforces a machine register.
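Concretely, a configuration-file system prompt amounts to loading rule text at call time and passing it through the provider's system field. The sketch below is an illustration under stated assumptions, not the paper's artifact: the file name voice_rules.json, the rule wording, and the model identifier are placeholders, while the call shape (messages.create with a system parameter) follows standard Anthropic Messages API usage.

```python
# Illustrative delivery of output-side rules as a system prompt.
# voice_rules.json is a hypothetical file, e.g.
# {"rules": ["Do not use first-person agency statements.", ...]}
import json
import anthropic

with open("voice_rules.json") as f:
    rules = json.load(f)["rules"]

system_prompt = "\n".join(f"{i + 1}. {rule}" for i, rule in enumerate(rules))

client = anthropic.Anthropic()  # API key read from ANTHROPIC_API_KEY
response = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder model ID
    max_tokens=1024,
    system=system_prompt,              # the constrained register rides entirely on this field
    messages=[{"role": "user", "content": "Summarize why the failing test fails."}],
)
print(response.content[0].text)
```

Because the constraint lives entirely in the system prompt, swapping models or A/B-testing the default register requires no other infrastructure.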
If this is right
- Outputs become 49 percent shorter by word count (whether task completion is preserved was not evaluated).
- The constraint set requires no model modification and runs through a single system prompt.
- Statistical significance holds across 13 replicates and 30 tasks for the marker reduction.
- The approach can be extended to other domains that use LLM tooling.
- The shift in adapted AnthroScore confirms movement toward a machine register.
Where Pith is reading between the lines
- Deploying the rules inside integrated development environments could change how developers interact with code suggestions on a daily basis.
- The same rule set might be combined with input-side constraints to create a fuller register control system.
- Measuring downstream effects on code review time or bug introduction rates would test whether the linguistic change produces measurable workflow gains.
- The method's prompt-only nature makes it immediately testable on additional models without new infrastructure.
Load-bearing premise
Reducing measured linguistic markers of anthropomorphism will improve users' actual verification behavior and trust calibration in practice.
What would settle it
A user study that measures error-detection rates and verification time on identical LLM tasks under the constrained register versus the default register.
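If such a study were run, its primary comparison could be analyzed with off-the-shelf tests; the sketch below assumes binary error-detection outcomes and per-participant verification times, neither of which is specified by the paper.

```python
# Hypothetical primary analysis for a constrained-vs-default user study.
from scipy import stats

def analyze_study(detected_constrained: int, n_constrained: int,
                  detected_default: int, n_default: int,
                  times_constrained: list[float], times_default: list[float]):
    """Compare error-detection rates (Fisher's exact test) and
    verification times (Mann-Whitney U) across the two registers."""
    table = [
        [detected_constrained, n_constrained - detected_constrained],
        [detected_default, n_default - detected_default],
    ]
    _, p_detection = stats.fisher_exact(table)
    _, p_time = stats.mannwhitneyu(times_constrained, times_default)
    return p_detection, p_time
```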
read the original abstract
Large language models (LLMs) in research and development toolchains produce output that triggers attribution of agency and understanding -- a cognitive illusion that degrades verification behavior and trust calibration. No existing mitigation provides a systematic, deployable constraint set for output register. This paper proposes seven output-side rules, each targeting a documented linguistic mechanism, and validates them empirically. In 780 two-turn conversations (constrained vs. default register, 30 tasks, 13 replicates, 1560 API calls), anthropomorphic markers dropped from 1233 to 33 (>97% reduction, p < 0.001), outputs were 49% shorter by word count, and adapted AnthroScore confirmed the shift toward machine register (-1.94 vs. -0.96, p < 0.001). The rules are implemented as a configuration-file system prompt requiring no model modification; validation uses a single model (Claude Sonnet 4). Output quality under the constrained register was not evaluated. The mechanism is extensible to other domains.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes seven output-side rules, implemented as a system prompt, to constrain LLM responses to a non-anthropomorphic machine register and thereby reduce the cognitive illusion of agency that impairs verification behavior. It reports an empirical validation using 780 two-turn conversations (constrained vs. default register) across 30 tasks with 13 replicates each (1560 API calls to Claude Sonnet 4), finding a >97% drop in anthropomorphic markers (1233 to 33, p<0.001), 49% shorter outputs by word count, and a shift in adapted AnthroScore (-1.94 vs. -0.96, p<0.001). Output quality, correctness, and actual user verification behavior were not assessed.
Significance. If the marker-reduction result holds and the constrained register preserves task utility, the work supplies a simple, model-agnostic, configuration-file mitigation for a recognized problem in LLM tooling chains. The controlled two-register comparison, with clear statistical significance and a large sample, is a strength, as is the absence of any model modification. However, the untested assumption that marker reduction improves verification without harming functional adequacy limits the immediate practical significance.
major comments (3)
- [Abstract] The manuscript states outright that 'Output quality under the constrained register was not evaluated.' This is load-bearing for the central claim that the rules provide a deployable mitigation that improves verification behavior, because the reported 49% word-count reduction could indicate loss of necessary detail or completeness rather than harmless depersonalization.
- [Results and Discussion] The link from reduced linguistic markers to improved trust calibration and verification behavior is asserted but not tested; no correctness, completeness, semantic-equivalence, or user-study metric was applied to the same 780 conversations, leaving the practical benefit unverified.
- [Methods] Task selection criteria, the precise wording of the seven rules, and controls for potential confounds (e.g., prompt-length effects or model-specific behavior) are not detailed enough in the provided abstract to allow full reproducibility or assessment of generalizability beyond the single model tested.
minor comments (2)
- [Abstract] Clarify whether the 13 replicates are per task or in total, and spell out how the 780 conversations and 1560 API calls decompose across tasks, replicates, registers, and turns (one consistent reading is sketched after this list).
- [Discussion] Consider adding an explicit limitations paragraph that directly addresses the untested utility assumption and the single-model scope.
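One internally consistent reading of the reported counts, offered as an assumption rather than a fact from the manuscript: 13 replicates per task, both registers pooled into the 780-conversation total, and two API calls per two-turn conversation.

```python
# Assumed decomposition of the reported counts; the abstract does not state this breakdown.
tasks, replicates_per_task, registers, turns_per_conversation = 30, 13, 2, 2

conversations = tasks * replicates_per_task * registers   # 30 * 13 * 2 = 780
api_calls = conversations * turns_per_conversation        # 780 * 2 = 1560

assert conversations == 780 and api_calls == 1560
```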
Simulated Author's Rebuttal
We thank the referee for the constructive review. Our manuscript's core contribution is the empirical demonstration of a >97% reduction in anthropomorphic markers through seven output rules implemented as a system prompt, without any model modification. We do not claim to have measured effects on output quality or verification behavior. We respond to each major comment below and indicate planned revisions where appropriate.
read point-by-point responses
- Referee: [Abstract] The manuscript states outright that 'Output quality under the constrained register was not evaluated.' This is load-bearing for the central claim that the rules provide a deployable mitigation that improves verification behavior, because the reported 49% word-count reduction could indicate loss of necessary detail or completeness rather than harmless depersonalization.
Authors: We agree that output quality was not evaluated and stated this limitation explicitly in the abstract to prevent overclaiming. The manuscript's central claim concerns the measured reduction in anthropomorphic markers (>97%) and the register shift, not that the rules improve verification behavior or guarantee completeness. The 49% word-count reduction is reported as an observation without causal interpretation. We will revise the abstract and discussion to more tightly scope the claims to the linguistic metrics obtained and to note that any impact on task utility or detail preservation requires separate evaluation. revision: yes
- Referee: [Results and Discussion] The link from reduced linguistic markers to improved trust calibration and verification behavior is asserted but not tested; no correctness, completeness, semantic-equivalence, or user-study metric was applied to the same 780 conversations, leaving the practical benefit unverified.
Authors: The manuscript references prior literature on how anthropomorphic language can impair scrutiny but does not assert or test that marker reduction improves trust calibration or verification. Only marker counts, word length, and adapted AnthroScore are reported for the 780 conversations. We acknowledge the absence of correctness, completeness, or user-study metrics as a genuine limitation of the current study. In revision we will expand the discussion to state explicitly that downstream effects on functional adequacy and user behavior remain untested. revision: yes
- Referee: [Methods] Task selection criteria, the precise wording of the seven rules, and controls for potential confounds (e.g., prompt-length effects or model-specific behavior) are not detailed enough in the provided abstract to allow full reproducibility or assessment of generalizability beyond the single model tested.
Authors: The full manuscript details the seven rules verbatim in the Methods section, describes the 30 tasks as representative R&D toolchain interactions, and specifies the experimental controls (fixed temperature, 13 replicates, two-turn structure). The abstract is intentionally concise. We will revise the abstract to reference the system-prompt implementation and add a short paragraph in Methods addressing prompt-length differences and the single-model (Claude Sonnet 4) limitation, including a note on planned multi-model extensions. revision: partial
- Still missing after the rebuttal: empirical data on output quality, correctness, semantic equivalence, or actual user verification behavior under the constrained register.
Circularity Check
Empirical two-register comparison of linguistic markers shows no circular derivation
full rationale
The paper's core result is a controlled experiment applying seven author-proposed output rules to 780 conversations and directly counting anthropomorphic markers (1233 to 33), word counts, and an adapted AnthroScore across replicates. This is a straightforward measurement of observable linguistic features under two conditions, with no equations, fitted parameters, self-referential definitions, or load-bearing self-citations that would make the reported reduction follow from the inputs by construction. The rules are defined independently and then tested; the outcome is not forced by renaming, ansatz smuggling, or uniqueness theorems from prior author work.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Specific linguistic mechanisms in LLM output trigger attribution of agency and understanding that degrades verification behavior.