Recognition: no theorem link
Breaking the Illusion of Identity in LLM Tooling
Pith reviewed 2026-05-10 17:57 UTC · model grok-4.3
The pith
Seven output-side rules cut anthropomorphic markers in LLM responses by more than 97 percent without any model changes.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that encoding seven output-side rules into a system prompt systematically suppresses linguistic markers of anthropomorphism in LLM tool outputs. Each rule targets a documented mechanism such as first-person agency statements or expressions of certainty. Across 780 two-turn conversations comparing the constrained and default registers (1560 API calls), marker counts fell from 1233 to 33, word count dropped 49 percent, and the adapted AnthroScore shifted from -0.96 to -1.94, all at p < 0.001. The method uses only prompt configuration and is described as extensible to other domains.
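The comparison described above lends itself to a simple paired analysis. The sketch below is not the paper's pipeline: the marker patterns, the helper names, and the choice of a Wilcoxon signed-rank test are illustrative assumptions, since neither the paper's marker lexicon nor its adapted AnthroScore is reproduced here.

```python
# Minimal sketch of a marker-count comparison between registers.
# The MARKERS lexicon is illustrative only, not the paper's instrument.
import re
from scipy import stats

MARKERS = re.compile(
    r"\b(I think|I believe|I feel|I understand|I'm happy to|in my opinion)\b",
    re.IGNORECASE,
)

def marker_count(text: str) -> int:
    """Count first-person agency and similar anthropomorphic markers in one output."""
    return len(MARKERS.findall(text))

def compare_registers(default_outputs: list[str], constrained_outputs: list[str]):
    """Paired comparison of marker counts for outputs matched by task and replicate."""
    d = [marker_count(t) for t in default_outputs]
    c = [marker_count(t) for t in constrained_outputs]
    stat, p = stats.wilcoxon(d, c)           # paired, non-parametric
    reduction = 1 - sum(c) / max(sum(d), 1)  # e.g. 1233 -> 33 would give ~0.973
    return reduction, p
```

A paired test fits because the two registers are run on the same 30 tasks and replicates; an unpaired test would discard that matching.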
What carries the argument
A set of seven output-side rules, each targeting one linguistic mechanism of anthropomorphism, delivered as a configuration-file system prompt that enforces a machine register.
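Concretely, a configuration-file system prompt amounts to loading rule text at call time and passing it through the provider's system field. The sketch below is an illustration under stated assumptions, not the paper's artifact: the file name voice_rules.json, the rule wording, and the model identifier are placeholders, while the call shape (messages.create with a system parameter) follows standard Anthropic Messages API usage.

```python
# Illustrative delivery of output-side rules as a system prompt.
# voice_rules.json is a hypothetical file, e.g.
# {"rules": ["Do not use first-person agency statements.", ...]}
import json
import anthropic

with open("voice_rules.json") as f:
    rules = json.load(f)["rules"]

system_prompt = "\n".join(f"{i + 1}. {rule}" for i, rule in enumerate(rules))

client = anthropic.Anthropic()  # API key read from ANTHROPIC_API_KEY
response = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder model ID
    max_tokens=1024,
    system=system_prompt,              # the constrained register rides entirely on this field
    messages=[{"role": "user", "content": "Summarize why the failing test fails."}],
)
print(response.content[0].text)
```

Because the constraint lives entirely in the system prompt, swapping models or A/B-testing the default register requires no other infrastructure.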
If this is right
- Outputs become 49 percent shorter by word count (whether task completion is preserved was not evaluated).
- The constraint set requires no model modification and runs through a single system prompt.
- Statistical significance holds across 13 replicates and 30 tasks for the marker reduction.
- The approach can be extended to other domains that use LLM tooling.
- The shift in adapted AnthroScore confirms movement toward a machine register.
Where Pith is reading between the lines
- Deploying the rules inside integrated development environments could change how developers interact with code suggestions on a daily basis.
- The same rule set might be combined with input-side constraints to create a fuller register control system.
- Measuring downstream effects on code review time or bug introduction rates would test whether the linguistic change produces measurable workflow gains.
- The method's prompt-only nature makes it immediately testable on additional models without new infrastructure.
Load-bearing premise
Reducing measured linguistic markers of anthropomorphism will improve users' actual verification behavior and trust calibration in practice.
What would settle it
A user study that measures error-detection rates and verification time on identical LLM tasks under the constrained register versus the default register.
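If such a study were run, its primary comparison could be analyzed with off-the-shelf tests; the sketch below assumes binary error-detection outcomes and per-participant verification times, neither of which is specified by the paper.

```python
# Hypothetical primary analysis for a constrained-vs-default user study.
from scipy import stats

def analyze_study(detected_constrained: int, n_constrained: int,
                  detected_default: int, n_default: int,
                  times_constrained: list[float], times_default: list[float]):
    """Compare error-detection rates (Fisher's exact test) and
    verification times (Mann-Whitney U) across the two registers."""
    table = [
        [detected_constrained, n_constrained - detected_constrained],
        [detected_default, n_default - detected_default],
    ]
    _, p_detection = stats.fisher_exact(table)
    _, p_time = stats.mannwhitneyu(times_constrained, times_default)
    return p_detection, p_time
```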
read the original abstract
Large language models (LLMs) in research and development toolchains produce output that triggers attribution of agency and understanding -- a cognitive illusion that degrades verification behavior and trust calibration. No existing mitigation provides a systematic, deployable constraint set for output register. This paper proposes seven output-side rules, each targeting a documented linguistic mechanism, and validates them empirically. In 780 two-turn conversations (constrained vs. default register, 30 tasks, 13 replicates, 1560 API calls), anthropomorphic markers dropped from 1233 to 33 (>97% reduction, p < 0.001), outputs were 49% shorter by word count, and adapted AnthroScore confirmed the shift toward machine register (-1.94 vs. -0.96, p < 0.001). The rules are implemented as a configuration-file system prompt requiring no model modification; validation uses a single model (Claude Sonnet 4). Output quality under the constrained register was not evaluated. The mechanism is extensible to other domains.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes seven output-side rules, implemented as a system prompt, to constrain LLM responses to a non-anthropomorphic machine register and thereby reduce the cognitive illusion of agency that impairs verification behavior. It reports an empirical validation using 780 two-turn conversations (constrained vs. default register) across 30 tasks with 13 replicates each (1560 API calls to Claude Sonnet 4), finding a >97% drop in anthropomorphic markers (1233 to 33, p<0.001), 49% shorter outputs by word count, and a shift in adapted AnthroScore (-1.94 vs. -0.96, p<0.001). Output quality, correctness, and actual user verification behavior were not assessed.
Significance. If the marker-reduction result holds and the constrained register preserves task utility, the work supplies a simple, model-agnostic, configuration-file mitigation for a recognized problem in LLM tooling chains. The controlled two-register comparison, with clear statistical significance and a large sample, is a strength, as is the absence of any model modification. However, the untested assumption that marker reduction improves verification without harming functional adequacy limits the immediate practical significance.
major comments (3)
- [Abstract] The manuscript states outright that 'Output quality under the constrained register was not evaluated.' This is load-bearing for the central claim that the rules provide a deployable mitigation that improves verification behavior, because the reported 49% word-count reduction could indicate loss of necessary detail or completeness rather than harmless depersonalization.
- [Results and Discussion] The link from reduced linguistic markers to improved trust calibration and verification behavior is asserted but not tested; no correctness, completeness, semantic-equivalence, or user-study metric was applied to the same 780 conversations, leaving the practical benefit unverified.
- [Methods] Task selection criteria, the precise wording of the seven rules, and controls for potential confounds (e.g., prompt-length effects or model-specific behavior) are not detailed enough in the provided abstract to allow full reproducibility or assessment of generalizability beyond the single model tested.
minor comments (2)
- [Abstract] Clarify whether the 13 replicates are per task or in total, and spell out how the 780 conversations and 1560 API calls decompose across tasks, replicates, registers, and turns (one consistent reading is sketched after this list).
- [Discussion] Consider adding an explicit limitations paragraph that directly addresses the untested utility assumption and the single-model scope.
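One internally consistent reading of the reported counts, offered as an assumption rather than a fact from the manuscript: 13 replicates per task, both registers pooled into the 780-conversation total, and two API calls per two-turn conversation.

```python
# Assumed decomposition of the reported counts; the abstract does not state this breakdown.
tasks, replicates_per_task, registers, turns_per_conversation = 30, 13, 2, 2

conversations = tasks * replicates_per_task * registers   # 30 * 13 * 2 = 780
api_calls = conversations * turns_per_conversation        # 780 * 2 = 1560

assert conversations == 780 and api_calls == 1560
```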
Simulated Author's Rebuttal
We thank the referee for the constructive review. Our manuscript's core contribution is the empirical demonstration of a >97% reduction in anthropomorphic markers through seven output rules implemented as a system prompt, without any model modification. We do not claim to have measured effects on output quality or verification behavior. We respond to each major comment below and indicate planned revisions where appropriate.
read point-by-point responses
- Referee: [Abstract] The manuscript states outright that 'Output quality under the constrained register was not evaluated.' This is load-bearing for the central claim that the rules provide a deployable mitigation that improves verification behavior, because the reported 49% word-count reduction could indicate loss of necessary detail or completeness rather than harmless depersonalization.
Authors: We agree that output quality was not evaluated and stated this limitation explicitly in the abstract to prevent overclaiming. The manuscript's central claim concerns the measured reduction in anthropomorphic markers (>97%) and the register shift, not that the rules improve verification behavior or guarantee completeness. The 49% word-count reduction is reported as an observation without causal interpretation. We will revise the abstract and discussion to more tightly scope the claims to the linguistic metrics obtained and to note that any impact on task utility or detail preservation requires separate evaluation. revision: yes
- Referee: [Results and Discussion] The link from reduced linguistic markers to improved trust calibration and verification behavior is asserted but not tested; no correctness, completeness, semantic-equivalence, or user-study metric was applied to the same 780 conversations, leaving the practical benefit unverified.
Authors: The manuscript references prior literature on how anthropomorphic language can impair scrutiny but does not assert or test that marker reduction improves trust calibration or verification. Only marker counts, word length, and adapted AnthroScore are reported for the 780 conversations. We acknowledge the absence of correctness, completeness, or user-study metrics as a genuine limitation of the current study. In revision we will expand the discussion to state explicitly that downstream effects on functional adequacy and user behavior remain untested. revision: yes
- Referee: [Methods] Task selection criteria, the precise wording of the seven rules, and controls for potential confounds (e.g., prompt-length effects or model-specific behavior) are not detailed enough in the provided abstract to allow full reproducibility or assessment of generalizability beyond the single model tested.
Authors: The full manuscript details the seven rules verbatim in the Methods section, describes the 30 tasks as representative R&D toolchain interactions, and specifies the experimental controls (fixed temperature, 13 replicates, two-turn structure). The abstract is intentionally concise. We will revise the abstract to reference the system-prompt implementation and add a short paragraph in Methods addressing prompt-length differences and the single-model (Claude Sonnet 4) limitation, including a note on planned multi-model extensions. revision: partial
- Still missing after the rebuttal: empirical data on output quality, correctness, semantic equivalence, or actual user verification behavior under the constrained register.
Circularity Check
Empirical two-register comparison of linguistic markers shows no circular derivation
full rationale
The paper's core result is a controlled experiment applying seven author-proposed output rules to 780 conversations and directly counting anthropomorphic markers (1233 to 33), word counts, and an adapted AnthroScore across replicates. This is a straightforward measurement of observable linguistic features under two conditions, with no equations, fitted parameters, self-referential definitions, or load-bearing self-citations that would make the reported reduction follow from the inputs by construction. The rules are defined independently and then tested; the outcome is not forced by renaming, ansatz smuggling, or uniqueness theorems from prior author work.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Specific linguistic mechanisms in LLM output trigger attribution of agency and understanding that degrades verification behavior.