A Need for Trust in Conversational Interface Research
Pith reviewed 2026-05-25 09:56 UTC · model grok-4.3
The pith
Trust is critical yet inconsistently defined and measured across conversational interface research.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Across several branches of conversational interaction research, including interactions with social robots, embodied agents, and conversational assistants, users have identified trust as a critical part of those interactions. Nevertheless, there is little agreement on what trust means within these sorts of interactions or how it can be measured. The paper explores some of the dimensions of trust as understood in previous work and outlines some of the ways trust has been measured, in the hope of furthering discussion of the concept across the field.
What carries the argument
The review and comparison of trust dimensions and measurement techniques drawn from prior studies on robots, agents, and assistants.
If this is right
- Shared definitions would let researchers directly compare trust findings from robot studies with those from agent and assistant studies.
- Agreed-upon measures could improve how conversational systems are evaluated for their ability to build trust.
- Greater consensus on trust might guide the design of interfaces that more consistently earn user confidence.
Where Pith is reading between the lines
- If trust differs by interaction type, context-specific scales for robots versus assistants may work better than one universal framework.
- Future tests could check whether using a common trust measure alters which features designers prioritize in new interfaces.
- Linking this review to psychological models of trust might clarify whether conversational trust is a distinct type of relationship.
Load-bearing premise
The lack of agreement on trust definitions and measures is mainly a barrier to progress rather than a sign that trust means genuinely different things in each context.
What would settle it
An empirical study showing that trust in social robot interactions and trust in text-based conversational assistant interactions are unrelated constructs with no shared ability to predict user behavior or preferences.
Original abstract
Across several branches of conversational interaction research including interactions with social robots, embodied agents, and conversational assistants, users have identified trust as a critical part of those interactions. Nevertheless, there is little agreement on what trust means within these sort of interactions or how trust can be measured. In this paper, we explore some of the dimensions of trust as it has been understood in previous work and we outline some of the ways trust has been measured in the hopes of furthering discussion of the concept across the field.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript is a position paper claiming that, across conversational interaction research, users identify trust as a critical factor in interactions with social robots, embodied agents, and conversational assistants. It argues, however, that there is little agreement on what trust means in these contexts or on how to measure it. The paper reviews dimensions of trust from prior work and outlines measurement approaches with the goal of stimulating discussion in the field.
Significance. If the central observation of fragmented understanding of trust holds, the paper could play a useful role in prompting the conversational interfaces community to develop more shared definitions and metrics, potentially improving comparability of studies across sub-areas like robotics and virtual agents.
major comments (2)
- Abstract: The assertion that 'there is little agreement on what trust means within these sort of interactions or how trust can be measured' is presented without specific examples of conflicting definitions or measures from the literature, which is load-bearing for the paper's motivation to survey dimensions and methods.
- Introduction (or equivalent section): The paper does not address whether observed differences in trust conceptualizations reflect genuinely distinct phenomena across robot, agent, and assistant contexts rather than a lack of consensus that requires resolution.
minor comments (2)
- The abstract could benefit from one or two concrete citations illustrating divergent trust definitions to ground the claim of disagreement.
- The manuscript would be strengthened by a brief concluding section that proposes next steps for the community rather than ending solely on the invitation to discuss.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our position paper. We address each major comment below and indicate planned revisions where appropriate.
- Referee: Abstract: The assertion that 'there is little agreement on what trust means within these sort of interactions or how trust can be measured' is presented without specific examples of conflicting definitions or measures from the literature, which is load-bearing for the paper's motivation to survey dimensions and methods.
  Authors: We agree the abstract claim would be stronger with immediate grounding. The manuscript body reviews multiple dimensions and measurement approaches drawn from prior work across robots, agents, and assistants, which collectively illustrate the variation. In revision we will expand the introduction to include two or three brief, concrete examples of conflicting definitions and measures (e.g., differing emphasis on competence versus benevolence, or questionnaire versus behavioral metrics) so the motivation is explicitly supported before the survey sections.
  revision: yes
- Referee: Introduction (or equivalent section): The paper does not address whether observed differences in trust conceptualizations reflect genuinely distinct phenomena across robot, agent, and assistant contexts rather than a lack of consensus that requires resolution.
  Authors: The paper frames the observed fragmentation as motivation for cross-field discussion rather than asserting that all differences must be resolved into a single consensus. We acknowledge the referee's point that some differences may legitimately reflect context-specific phenomena. In the revised introduction we will add a short paragraph explicitly noting this alternative explanation and positioning the call for discussion as a means to determine whether shared metrics, context-tailored approaches, or both are warranted.
  revision: yes
Circularity Check
No significant circularity identified
The paper is a qualitative position statement surveying dimensions and measurements of trust across conversational interaction research. It advances no derivations, equations, predictions, or fitted quantities. The core claim of limited agreement on trust definitions is presented as an observation drawn from external literature rather than derived from any internal construction or self-citation chain. No load-bearing steps reduce to inputs by definition or renaming.
Reference graph
Works this paper leans on
- [1] Oya Celiktutan and Hatice Gunes. 2017. Automatic Prediction of Impressions in Time and across Varying Context: Personality, Attractiveness and Likeability. IEEE Transactions on Affective Computing 8, 1 (Jan. 2017), 29–42. https://doi.org/10.1109/TAFFC.2015.2513401
- [2] Leigh Clark, Phillip Doyle, Diego Garaialde, Emer Gilmartin, Stephan Schlögl, Jens Edlund, Matthew Aylett, João Cabral, Cosmin Munteanu, and Benjamin Cowan. 2018. The State of Speech in HCI: Trends, Themes and Challenges. arXiv:1810.06828 [cs] (Oct. 2018). http://arxiv.org/abs/1810.06828
- [3] Leigh Clark, Cosmin Munteanu, Vincent Wade, Benjamin R. Cowan, Nadia Pantidi, Orla Cooney, Philip Doyle, Diego Garaialde, Justin Edwards, Brendan Spillane, Emer Gilmartin, and Christine Murad. 2019. What Makes a Good Conversation?: Challenges in Designing Truly Conversational Agents. In Proceedings of the 2019 CHI Conference on Human Factors in Com...
- [4] Benjamin R Cowan, Nadia Pantidi, David Coyle, Kellie Morrissey, Peter Clarke, Sara Al-Shehri, David Earley, and Natasha Bandeira. 2017. What can i help you with?: infrequent users’ experiences of intelligent personal assistants. In Proceedings of the 19th International Conference on Human-Computer Interaction with Mobile Devices and Services. ACM, 43
- [5] Ewart J de Visser, Samuel S Monfort, Ryan McKendrick, Melissa AB Smith, Patrick E McKnight, Frank Krueger, and Raja Parasuraman. 2016. Almost human: Anthropomorphism increases trust resilience in cognitive agents. Journal of Experimental Psychology: Applied 22, 3 (2016), 331
- [6] Florian N Egger. 2000. Trust me, I’m an online vendor: towards a model of trust for e-commerce system design. In CHI’00 extended abstracts on Human factors in computing systems. ACM, 101–102
- [7] Andrew J. Flanagin and Miriam J. Metzger. 2007. The role of site features, user attributes, and information verification behaviors on the perceived credibility of web-based information. New Media & Society 9, 2 (April 2007), 319–342. https://doi.org/10.1177/1461444807075015
- [8] BJ Fogg and Hsiang Tseng. 1999. The elements of computer credibility. In Proceedings of the SIGCHI conference on Human Factors in Computing Systems. ACM, 80–87
- [9] Amos Freedy, Ewart DeVisser, Gershon Weltman, and Nicole Coeyman. 2007. Measurement of trust in human-robot collaboration. In 2007 International Symposium on Collaborative Technologies and Systems. IEEE, 106–114
- [10] Kerstin Sophie Haring, David Silvera-Tawil, Yoshio Matsumoto, Mari Velonaki, and Katsumi Watanabe. 2014. Perception of an android robot in Japan and Australia: A cross-cultural comparison. In International conference on social robotics. Springer, 166–175
- [11] Kevin Anthony Hoff and Masooda Bashir. 2015. Trust in automation: Integrating empirical evidence on factors that influence trust. Human Factors 57, 3 (2015), 407–434
- [12] Oliver P John, Sanjay Srivastava, and others. 1999. The Big Five trait taxonomy: History, measurement, and theoretical perspectives. Handbook of personality: Theory and research 2, 1999 (1999), 102–138
- [13] Spiro Kiousis. 2001. Public Trust or Mistrust? Perceptions of Media Credibility in the Information Age. Mass Communication and Society 4, 4 (Nov. 2001), 381–403. https://doi.org/10.1207/S15327825MCS0404_4
- [14] John D Lee and Katrina A See. 2004. Trust in automation: Designing for appropriate reliance. Human factors 46, 1 (2004), 50–80
- [15] Jin Joo Lee, Brad Knox, Jolie Baumann, Cynthia Breazeal, and David DeSteno. 2013. Computationally modeling interpersonal trust. Frontiers in psychology 4 (2013), 893
- [17] Ewa Luger and Abigail Sellen. 2016. "Like Having a Really Bad PA": The Gulf between User Expectation and Experience of Conversational Agents. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems - CHI ’16. ACM Press, Santa Clara, California, USA, 5286–5297. https://doi.org/10.1145/2858036.2858288
- [18] D Harrison McKnight, Vivek Choudhury, and Charles Kacmar. 2002. The impact of initial consumer trust on intentions to transact with a web site: a trust building model. The journal of strategic information systems 11, 3-4 (2002), 297–323.
A Need for Trust in Conversational Interface Research. CUI 2019, August 22–23, 2019, Dublin, Ireland
- [19] Panagiotis Mitkidis, John J McGraw, Andreas Roepstorff, and Sebastian Wallot. 2015. Building trust: Heart rate synchrony and arousal during joint action increased by public goods game. Physiology & behavior 149 (2015), 101–106
- [21] Christie Olson and Kelli Kemery. 2019. 2019 Voice report: Consumer adoption of voice technology and digital assistants. Technical Report. Microsoft
- [22] Jens Riegelsberger, M Angela Sasse, and John D McCarthy. 2003. Shiny happy people building trust?: photos on e-commerce websites and consumer trust. In Proceedings of the SIGCHI conference on Human factors in computing systems. ACM, 121–128
- [23] Denise M Rousseau, Sim B Sitkin, Ronald S Burt, and Colin Camerer. 1998. Not so different after all: A cross-discipline view of trust. Academy of management review 23, 3 (1998), 393–404
- [24] Maha Salem, Gabriella Lakatos, Farshid Amirabdollahian, and Kerstin Dautenhahn. 2015. Would you trust a (faulty) robot?: Effects of error, task type and personality on human-robot cooperation and trust. In Proceedings of the Tenth Annual ACM/IEEE International Conference on Human-Robot Interaction. ACM, 141–148
- [25] Lauren E. Scissors, Alastair J. Gill, Kathleen Geraghty, and Darren Gergle. 2009. In CMC we trust: the role of similarity. In Proceedings of the 27th international conference on Human factors in computing systems - CHI 09. ACM Press, Boston, MA, USA, 527. https://doi.org/10.1145/1518701.1518783
- [26] Elaine Short, Justin Hart, Michelle Vu, and Brian Scassellati. 2010. No fair an interaction with a cheating robot. In 2010 5th ACM/IEEE International Conference on Human-Robot Interaction (HRI). IEEE, 219–226
- [27] Ilaria Torre, Leigh Clark, and Benjamin R Cowan. 2018. Measuring and designing trust in Human-Agent Interaction
- [28] Ilaria Torre, Jeremy Goslin, Laurence White, and Debora Zanatto. 2018. Trust in artificial voices: A congruency effect of first impressions and behavioural experience. In Proceedings of the Technology, Mind, and Society. ACM, 40
- [29] Lin Wang, Pei-Luen Patrick Rau, Vanessa Evers, Benjamin Krisper Robinson, and Pamela Hinds. 2010. When in Rome: the role of culture & context in adherence to robot recommendations. In Proceedings of the 5th ACM/IEEE international conference on Human-robot interaction. IEEE Press, 359–366
- [30] James E Young, Richard Hawkins, Ehud Sharlin, and Takeo Igarashi. 2009. Toward acceptable domestic robots: Applying insights from social psychology. International Journal of Social Robotics 1, 1 (2009), 95