
arxiv: 2605.19838 · v1 · pith:FFWBOBR3new · submitted 2026-05-19 · 💻 cs.HC

From Role to Person: Trust Calibration Challenges in Twin Agents

Pith reviewed 2026-05-20 04:04 UTC · model grok-4.3

classification 💻 cs.HC
keywords twin agents · trust calibration · digital twins · AI agents · human-AI collaboration · professional settings · error attribution

The pith

Twin agents that represent specific people introduce trust problems current AI methods cannot solve.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes that AI agents will soon take on the role of standing in for individual people, acting as their digital representatives in professional settings. This creates a situation where a colleague receiving advice from such an agent cannot easily determine if a mistake stems from the person's own knowledge, missing information, or an error in how the model was built. Existing approaches to preventing overreliance on AI assume a clear separation between the AI tool and the human user, but twin agents remove that separation by design. As a result, new ways to calibrate trust are needed for these personal stand-ins.

Core claim

Twin agents dissolve the boundary between AI and the human decision-maker by representing an unavailable person's knowledge, perspective, and style, which means that doubts about their output cannot be attributed reliably to one of three failure modes: a schema gap, an epistemic gap, or a model artifact. This raises a class of trust calibration challenges that cognitive forcing functions and related frameworks were not designed to handle.

What carries the argument

The twin agent: a digital representation of an individual's knowledge, perspective, and communicative style that interacts with colleagues on that person's behalf when they are unavailable. By design, it blurs the line between the agent and the person it represents.

If this is right

  • Colleagues will face difficulty in attributing errors when interacting with a twin agent standing in for an absent person.
  • Standard methods for reducing overreliance on AI advice will prove insufficient in this context.
  • New research questions arise around designing trust calibration for agents that embody specific individuals.
  • Professional settings using such agents may require redesigned oversight mechanisms.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This approach could extend to personal life, affecting how families or friends delegate communications.
  • Organizations might need policies on the use of such agents to maintain accountability.
  • Empirical studies could test whether users develop new strategies for verifying twin agent outputs over time.

Load-bearing premise

The three failure modes have no reliable attribution path between them when a colleague interacts with a twin agent.

What would settle it

A user study in which participants review outputs from a twin agent and, for each error, identify which of the three failure modes caused it. The test is whether attribution accuracy exceeds what random guessing would achieve.
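
As a rough illustration of the chance comparison in such a study, the Python sketch below runs a one-sided exact binomial test of attribution accuracy against a 1/3 chance rate. Everything specific in it is an assumption for illustration: the function name binomial_tail, the 60 judgments, and the 27 correct attributions are hypothetical, not results from the paper, and the 1/3 baseline holds only if the three failure modes are equally frequent and the judgments are independent.

```python
from math import comb


def binomial_tail(successes: int, trials: int, chance: float) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, chance).

    This is the one-sided exact p-value for "accuracy is above chance".
    """
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical numbers for illustration only; not data from the paper.
N_MODES = 3            # schema gap, epistemic gap, model artifact
CHANCE = 1 / N_MODES   # accuracy expected from uniform random guessing
trials = 60            # assumed number of attribution judgments
correct = 27           # assumed number of correct attributions

p_value = binomial_tail(correct, trials, CHANCE)
print(f"accuracy={correct / trials:.2f}, chance={CHANCE:.2f}, p={p_value:.4f}")
```

A small p-value would suggest that colleagues can attribute errors better than chance, which would count against the load-bearing premise. A real study would also need to handle unequal base rates across failure modes and repeated judgments by the same participant.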

Figures

Figures reproduced from arXiv: 2605.19838 by Hugo Andersson, Niklas Elmqvist.

Figure 1: Agent behavior. A twin agent (left, represented as an avatar) sends a message to a human colleague, who reacts with doubt. The colleague cannot determine the source of that doubt: it may stem from a schema gap (the agent’s representation of the person is incomplete), an epistemic gap (the colleague simply does not know the person’s view), or a model artifact (an LLM failure such as hallucination, sycophanc…
Original abstract

Agentic AI has taken on the role of assistant, collaborator, and decision-support tool. We argue the next role on that list is more personal: you. These are digital twins of each individual -- twin agents -- representing their knowledge, perspective, and communicative style to colleagues when they are unavailable. Drawing on early design work in an ongoing project in which agents represent knowledge workers in a professional setting, we identify a trust calibration problem specific to this approach. When a human colleague doubts a twin agent's output, they face three failure modes (a schema gap, an epistemic gap, and a model artifact) with no reliable attribution path between them. Cognitive forcing functions and related frameworks address overreliance effectively in contexts where there is a clear boundary between the AI and the human decision-maker. However, twin agents dissolve that boundary, raising a class of trust calibration challenge these frameworks were not designed to handle. We introduce the concept, distinguish it from digital twins, and outline the research questions this new class of agent demands.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript introduces the concept of twin agents—AI systems that represent an individual's knowledge, perspective, and communicative style to colleagues when the person is unavailable. It argues that these agents create a distinct trust calibration problem: when output is doubted, the user encounters a schema gap, epistemic gap, or model artifact with no reliable attribution path among them. This dissolves the human-AI boundary that existing frameworks such as cognitive forcing functions presuppose, and the paper distinguishes twin agents from both conventional agents and digital twins while outlining resulting research questions based on early design work in a professional knowledge-worker setting.

Significance. If the boundary-dissolution claim and the attribution-path observation hold, the work identifies a timely gap in HCI research on trust as agentic AI shifts from generic tools to personalized representations. It could usefully direct empirical studies and design interventions toward this emerging class of system. The paper's explicit framing as problem identification rather than empirical validation is consistent with its scope.

major comments (1)
  1. [Abstract and failure-modes section] The central claim that the three failure modes lack a reliable attribution path (abstract and the section defining the failure modes) is load-bearing for the argument that cognitive forcing functions are inapplicable, yet it is advanced purely by logical distinction without a concrete interaction scenario or example from the mentioned ongoing project that demonstrates why attribution cannot be performed.
minor comments (2)
  1. [Introduction / concept definition] The distinction between twin agents and digital twins is stated but would benefit from a short explicit comparison table or bullet list early in the manuscript to prevent readers from conflating the two.
  2. [Introduction] The manuscript refers to 'early design work in an ongoing project' but provides no high-level description of that work (e.g., participant roles or observed interactions), which would help ground the identified gaps.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive feedback and for recognizing the timeliness of the trust calibration issues for twin agents. We address the single major comment below and have made revisions to strengthen the manuscript.

Point-by-point responses
  1. Referee: [Abstract and failure-modes section] The central claim that the three failure modes lack a reliable attribution path (abstract and the section defining the failure modes) is load-bearing for the argument that cognitive forcing functions are inapplicable, yet it is advanced purely by logical distinction without a concrete interaction scenario or example from the mentioned ongoing project that demonstrates why attribution cannot be performed.

    Authors: We agree that the load-bearing claim would be strengthened by a concrete illustration. The current argument relies on the logical observation that twin agents dissolve the human-AI boundary by design, so that a doubted output cannot be cleanly attributed to the represented person's schema, their epistemic state at the time of twin creation, or a model artifact. To address the referee's point directly, we have revised the failure-modes section to include a realistic interaction scenario drawn from our early design explorations in the professional knowledge-worker setting. The added scenario describes a colleague querying a twin agent about a project decision, receiving output that appears inconsistent with known facts, and then being unable to determine whether the discrepancy should be attributed to the absent colleague's perspective, incomplete knowledge transfer into the twin, or an AI generation error. This example shows why attribution mechanisms presupposed by cognitive forcing functions become unreliable. The revision preserves the paper's scope as problem identification rather than empirical validation.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity

Full rationale

The manuscript is a conceptual position paper that identifies trust calibration challenges for twin agents by distinguishing them from conventional agents and digital twins through logical argumentation. It draws on early design work in an ongoing project to outline three failure modes and contrasts them with existing frameworks such as cognitive forcing functions, but contains no equations, fitted parameters, predictions, or self-referential derivations that reduce to inputs by construction. The central claim frames the contribution as raising open research questions rather than asserting a derived or proven result, rendering the argument self-contained against external benchmarks with no load-bearing circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The argument depends on the domain assumption that twin agents create attribution ambiguity not addressable by boundary-preserving frameworks, with the new entity of twin agents introduced without independent empirical grounding in the abstract.

axioms (2)
  • domain assumption: Twin agents represent an individual's knowledge, perspective, and communicative style when the person is unavailable.
    This premise defines the core object of study and is invoked in the opening description of the approach.
  • domain assumption: Existing cognitive forcing functions and related frameworks require a clear boundary between AI and human decision-maker.
    This is used to argue that current solutions do not apply to twin agents.
invented entities (1)
  • twin agents (no independent evidence)
    purpose: Digital representations of specific individuals for professional collaboration when the person is unavailable.
    New postulated class of agent introduced to frame the trust calibration problem.



Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction
    Relation between the paper passage and the cited Recognition theorem: unclear
    Paper passage: When a colleague’s twin agent tells you something, you are not receiving a system recommendation. You are receiving what is framed as your colleague’s knowledge and perspective. ... three simultaneous explanations that cannot be disentangled: Schema gap, Epistemic gap, Model artifact

What do these tags mean?

  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

12 extracted references · 12 canonical work pages · 2 internal anchors

  1. [1]

    Gagan Bansal, Besmira Nushi, Ece Kamar, Walter S. Lasecki, Daniel S. Weld, and Eric Horvitz. 2019. Beyond Accuracy: The Role of Mental Models in Human-AI Team Performance. In Proceedings of the Seventh AAAI Conference on Human Computation and Crowdsourcing. AAAI Press. doi:10.1609/hcomp.v7i1.5285

  2. [2]

    Zana Buçinca, Maja Barbara Malaya, and Krzysztof Z. Gajos. 2021. To Trust or to Think: Cognitive Forcing Functions Can Reduce Overreliance on AI in AI-assisted Decision-making. In Proceedings of the ACM on Human-Computer Interaction. Association for Computing Machinery, New York, NY, USA. doi:10.1145/3449287

  3. [3]

    Yi Fei Cheng, Hirokazu Shirado, and Shunichi Kasahara. 2025. Conversational Agents on Your Behalf: Opportunities and Challenges of Shared Autonomy in Voice Communication for Multitasking. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA. doi:10.1145/3706598.3714017

  4. [4]

    Upol Ehsan and Mark O. Riedl. 2020. Human-centered Explainable AI: Towards a Reflective Sociotechnical Approach. In Proceedings of the AAAI Workshop on Artificial Intelligence Safety (SafeAI). AAAI Press. arXiv:2002.01092

  5. [5]

    Aisha S. Gani. 2025. When Customers Dial Klarna’s Hotline, An AI CEO Picks Up. Bloomberg Tech In Depth (newsletter). https://www.bloomberg.com/news/newsletters/2025-09-10/when-customers-dial-klarna-s-hotline-an-ai-ceo-picks-up

  6. [6]

    Qing Hu, Qing Xiao, Hancheng Cao, and Hong Shen. 2026. When Your Boss Is an AI Bot: Exploring Opportunities and Risks of Manager Clone Agents in the Future Workplace. In Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA. doi:10.48550/arXiv.2509.10993

  7. [7]

    Maurice Jakesch, Jeffrey T. Hancock, and Mor Naaman. 2023. Human Heuristics for AI-Generated Language Are Flawed. Proceedings of the National Academy of Sciences 120, 11 (2023), e2208839120. doi:10.1073/pnas.2208839120

  8. [8]

    Clifford Nass and Youngme Moon. 2000. Machines and Mindlessness: Social Responses to Computers. Journal of Social Issues 56, 1 (2000), 81–103. doi:10.1111/0022-4537.00153

  9. [9]

    Joon Sung Park, Joseph C. O’Brien, Carrie J. Cai, Meredith Ringel Morris, Percy Liang, and Michael S. Bernstein. 2023. Generative Agents: Interactive Simulacra of Human Behavior. In Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology. Association for Computing Machinery, New York, NY, USA. doi:10.1145/3586183.3606763

  10. [10]

    LLM Agents Grounded in Self-Reports Enable General-Purpose Simulation of Individuals

    Joon Sung Park, Carolyn Q. Zou, Aaron Shaw, Benjamin Mako Hill, Carrie Cai, Meredith Ringel Morris, Robb Willer, Percy Liang, and Michael S. Bernstein. 2024. Generative Agent Simulations of 1,000 People. doi:10.48550/arXiv.2411.10109

  11. [11]

    Omar Shaikh, Shardul Sapkota, Shan Rizvi, Eric Horvitz, Joon Sung Park, Diyi Yang, and Michael S. Bernstein. 2025. Creating General User Models from Computer Use. In Proceedings of the 38th Annual ACM Symposium on User Interface Software and Technology. Association for Computing Machinery, New York, NY, USA. doi:10.1145/3746059.3747722

  12. [12]

    The Simile Team. 2026. The Simulation Company. Simile Blog. https://simile.ai/blog/the-simulation-company