Comparing LLM-Based Conversational and Graphical Interfaces for Industrial Decision Tasks: An Exploratory Mixed-Methods Study

Alan Serrano; Daniele Mazzei; Roberto Figliè; Simone Caputo; Tommaso Turchi

arxiv: 2605.31224 · v1 · pith:GRQI6BLMnew · submitted 2026-05-29 · cs.CY · cs.AI · cs.HC

Roberto Figliè, Simone Caputo, Alan Serrano, Tommaso Turchi, Daniele Mazzei

Pith reviewed 2026-06-28 20:41 UTC · model grok-4.3

classification: cs.CY · cs.AI · cs.HC

keywords: conversational user interfaces, LLM, dashboards, industrial decision making, mixed-methods study, user interface comparison, generative AI, IoT data

The pith

LLM-based conversational interfaces can reduce interaction effort for industrial decision tasks compared to dashboards.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper tests whether LLM-powered chat interfaces deliver on the promise of easier data access in industry by comparing them directly to traditional dashboards. Twenty participants completed four simulated decision tasks using both systems, while the authors tracked mental workload, completion time, and accuracy and gathered interview feedback. The conversational agent allowed quicker, more direct access to specific information, cutting effort, while the dashboard helped users maintain an overview and verify details. The advantages depended on the task, and the authors call for larger studies to confirm them.

Core claim

The findings suggest that the conversational agent can reduce interactional effort by supporting more direct access to information, while the dashboard remains valuable for overview and verification. However, these benefits may vary across tasks and require validation through larger-scale studies.

What carries the argument

Mixed-methods evaluation of an LLM conversational agent versus a graphical dashboard in four simulated industrial decision tasks, combining quantitative measures of workload, time, and accuracy with qualitative thematic analysis of interviews.
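
As an illustration only: because every participant used both interfaces, the quantitative measures form paired observations, and a paired comparison is one natural way to analyze them. This page does not say which statistical tests the authors ran, so the sketch below is an assumption and not their method. It implements an exact sign-flip permutation test on paired differences in plain Python. The function name `paired_sign_flip_test` and the workload scores are hypothetical, invented purely to make the example runnable; they are not data from the study.

```python
from itertools import product
from statistics import mean


def paired_sign_flip_test(a, b):
    """Exact two-sided sign-flip permutation test on paired differences.

    Under the null hypothesis of no interface effect, each participant's
    difference is equally likely to carry either sign, so we enumerate
    every sign pattern and count how often the absolute mean difference
    is at least as large as the observed one.
    """
    if len(a) != len(b):
        raise ValueError("paired samples must have equal length")
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(mean(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = mean(s * d for s, d in zip(signs, diffs))
        # Small tolerance guards against floating-point ties.
        if abs(flipped) >= observed - 1e-12:
            extreme += 1
    return mean(diffs), extreme / total


# HYPOTHETICAL workload scores for 8 imaginary participants
# (same person at the same index in both lists). Not study data.
dashboard = [55, 60, 48, 70, 65, 52, 58, 62]
chat = [50, 52, 47, 60, 66, 45, 50, 55]

mean_diff, p_value = paired_sign_flip_test(dashboard, chat)
print(f"mean difference (dashboard - chat): {mean_diff}")
print(f"two-sided p-value: {p_value}")
```

The exact enumeration visits 2^n sign patterns, which is instant for the 8 made-up participants here but slow in pure Python for a sample of 20; in that case a random sample of sign flips is a common substitute.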

If this is right

Conversational agents may enable more efficient targeted information retrieval in data-intensive industrial settings.
Dashboards provide complementary value for broad monitoring and result confirmation.
Interface choice should account for task-specific demands rather than assuming one is universally superior.
Further validation with larger and more diverse participant groups is required before broad adoption.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Hybrid interfaces that combine chat queries with dashboard visuals could capture the strengths of both for complex decisions.
If the effort reduction holds, it could influence how IoT monitoring systems are designed in manufacturing.
The task-dependent results suggest testing the same comparison in adjacent fields like logistics or energy management.

Load-bearing premise

The four simulated tasks of varying complexity and the sample of 20 participants adequately represent real-world industrial decision-makers and production LLM-based conversational agents.

What would settle it

A follow-up study with a larger sample of actual industrial users working on live IoT data would challenge the reported benefits if it showed no consistent reduction in effort for the conversational agent, or no task-dependent variation.

Original abstract

The use of Generative AI Conversational User Interfaces (CUI) as a new way to access and analyze data is growing in all sectors, and the industrial one is no exception. There, large amounts of data produced by IoT devices are flowing through user interfaces and may require them a new adaptation to the new analyses needs of decision-makers. LLM-based CUIs are promising a new way to directly interact with those data through the directness of natural language and without the learning costs that every GUI design has. Moreover, the capabilities of LLMs and their agency open up the possibility to automate some tasks and help with the reasoning during decision-making activities. But are this promises well founded? We try to scope this general question with a mixed-approach study comparing a state-of-the-art dashboard with a conversational agent. A total of 20 participants used both interfaces to complete four simulated industrial decision tasks of varying complexity. We combined measures of mental workload, completion time, and decision accuracy with a post-study questionnaire and semi-structured interviews analyzed through thematic analysis. The findings suggest that the conversational agent can reduce interactional effort by supporting more direct access to information, while the dashboard remains valuable for overview and verification. However, these benefits may vary across tasks and require validation through larger-scale studies.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Exploratory study with 20 participants on simulated tasks gives early signals that conversational interfaces cut effort but dashboards aid overview, yet the limits on generalizability are the main story.

The paper runs a mixed-methods comparison of an LLM conversational interface against a dashboard on four simulated industrial decision tasks. Twenty participants completed the tasks while the authors tracked mental workload, completion time, accuracy, plus post-study questionnaires and thematic analysis of interviews. The directional finding is that the conversational version supports more direct access and lower interaction effort, while the dashboard stays useful for overview and verification, with some variation by task.

What stands out as new is extending the CUI versus GUI comparison into the specific setting of industrial IoT data access and decision tasks, with thematic coding on effort and verification preferences. The authors combine quantitative measures with qualitative data and they state clearly that the work is exploratory and needs larger validation.

The soft spot is the representativeness issue the stress test flags. The sample is small, the tasks are simulated rather than drawn from actual production decisions, and there is little detail on whether participants had relevant domain experience. That leaves the effort-reduction claim tentative and vulnerable to artifacts from the artificial setup. The paper itself notes the need for bigger studies, which is honest but also means the evidence does not yet support strong design recommendations.

This is the sort of paper that could interest HCI or industrial software researchers looking for initial data points on where conversational agents might fit in manufacturing tools. It is not strong enough on its own to guide practice, but it could serve as a prompt for follow-up work.

I would send it to peer review. The methods fit an exploratory study, the topic is timely, and referees could usefully push on sample and task design without the work being fundamentally flawed.

Referee Report

1 major / 2 minor

Summary. The manuscript reports an exploratory mixed-methods user study in which 20 participants completed four simulated industrial decision tasks of varying complexity using both an LLM-based conversational interface and a traditional dashboard. Quantitative measures included mental workload, task completion time, and decision accuracy; these were supplemented by a post-study questionnaire and semi-structured interviews subjected to thematic analysis. The central claim is that the conversational agent can reduce interactional effort by enabling more direct information access, while the dashboard remains useful for overview and verification, although benefits appear to vary by task and the authors explicitly call for larger-scale validation.

Significance. If confirmed through larger studies with domain-experienced participants and real-world tasks, the work would supply useful early evidence on the complementary roles of conversational and graphical interfaces for IoT-driven industrial decisions. The mixed-methods design is well-suited to an exploratory HCI study and allows both performance metrics and user perceptions to be captured, which strengthens the tentative conclusions offered.

major comments (1)

[Methods] Methods section (participant recruitment and task design): The central suggestion that the conversational interface reduces interactional effort depends on the 20-participant sample and four simulated tasks being representative of industrial decision-makers and actual IoT scenarios. No evidence is supplied that participants possessed relevant domain expertise or that the tasks were validated against real production decision contexts; this assumption is load-bearing for generalizing the directional finding even though the abstract already flags the need for larger validation.

minor comments (2)

[Abstract] Abstract: The summary paragraph is concise but would benefit from explicitly stating the participant count and number of tasks to improve immediate readability for readers scanning the paper.
[Discussion] Discussion: The integration between the quantitative results and the thematic-analysis themes could be strengthened by more explicit cross-references showing how interview excerpts align with or qualify the effort-reduction observations.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive review and the recommendation for minor revision. The feedback correctly identifies a key boundary condition of our exploratory design, which we address directly below while preserving the manuscript's stated scope.

Referee: [Methods] Methods section (participant recruitment and task design): The central suggestion that the conversational interface reduces interactional effort depends on the 20-participant sample and four simulated tasks being representative of industrial decision-makers and actual IoT scenarios. No evidence is supplied that participants possessed relevant domain expertise or that the tasks were validated against real production decision contexts; this assumption is load-bearing for generalizing the directional finding even though the abstract already flags the need for larger validation.

Authors: We agree that the study does not claim representativeness for domain-experienced industrial decision-makers or validated real-world tasks. The work is explicitly positioned as exploratory, with the abstract and discussion already stating that larger-scale validation with domain experts is required. Participants were drawn from a university community with mixed technical backgrounds; the four tasks were constructed from publicly described IoT decision scenarios in the industrial literature to produce controlled variation in complexity. To improve transparency we will expand the Methods section with additional detail on recruitment criteria, participant self-reported backgrounds, and the literature sources used to shape the task scenarios. These additions will not alter the directional findings but will more clearly bound their interpretation.

Revision: partial

Circularity Check

0 steps flagged

No circularity: empirical user study with direct measurements

This paper is an exploratory mixed-methods user study that collects and reports direct participant measurements (mental workload, completion time, decision accuracy, post-study questionnaire, semi-structured interviews with thematic analysis) from 20 users performing four simulated tasks. There are no equations, derivations, fitted parameters, predictions, or load-bearing self-citations that reduce any claim to its own inputs by construction. All findings stand on observed data independent of prior work, satisfying the criteria for a self-contained empirical report.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Empirical HCI study with no mathematical content; central claim rests on domain assumptions about task representativeness and participant behavior rather than free parameters or new entities.

axioms (1)

Domain assumption: The four simulated industrial decision tasks accurately model real industrial decision-making complexity and data access needs.
The interface comparison is built on these tasks standing in for actual factory scenarios.
