arxiv: 2605.02592 · v1 · submitted 2026-05-04 · 💻 cs.AI

Foundation-Model-Based Agents in Industrial Automation: Purposes, Capabilities, and Open Challenges

Vincent Henkel, Felix Gehlhoff, David Kube, Asaad Almutareb, Luis Cruz, Bernd Hellingrath, Philip Koch, Christoph Legat, Florian Mohr, Michael Oberle, Felix Ocker, Thorsten Schoeler, Mario Thron, Nico Andre Töpfer, Lucas Vogt, Yuchen Xia

Pith reviewed 2026-05-08 17:52 UTC · model grok-4.3

classification 💻 cs.AI

keywords foundation models · industrial agents · automation · systematic review · technology readiness · agent capabilities · limitations

The pith

Foundation-model agents for industrial tasks are mostly prototypes, stronger at human interaction and uncertainty than conventional agents but weaker at negotiation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper performs a systematic literature review to determine the current maturity and functional profile of foundation-model-based agents applied to industrial automation tasks such as decision support, monitoring, and process optimization. It establishes that these systems remain largely at early validation stages with limited deployment evidence, and it shows specific capability shifts relative to traditional industrial agent designs. The survey also catalogs recurring limitations that hinder practical use. A sympathetic reader would care because the findings clarify where development resources can yield the quickest gains and what obstacles must be cleared before these agents move into real production environments. The work further supplies a bridging definition to align agent theory, engineering standards, and foundation-model methods.

Core claim

Through PRISMA screening of 2,341 publications and structured coding of 88 selected works, the authors establish that reported foundation-model-based industrial agents sit predominantly at technology readiness levels 4-6 (75.0 percent), with deployment-oriented evidence at only 9.1 percent. Operational goals concentrate on user assistance, monitoring, and optimization rather than conventional planning and scheduling. Compared with an established baseline for industrial agents, the profile shows gains of 37 percent in human interaction and 35 percent in uncertainty handling but a 39 percent deficit in negotiation. The most frequently reported limitations are lack of generalization, hallucination and output instability, data scarcity, and inference latency.

What carries the argument

A PRISMA-guided systematic review combined with a structured coding scheme that extracts maturity levels, operational goals, capability differences, and limitations from the literature corpus.

If this is right

Development priorities should shift toward improving generalization and reducing hallucinations to advance systems past the prototype stage.
Agent architectures can exploit foundation models for assistance and monitoring roles while supplementing negotiation tasks with conventional methods.
Reducing inference latency would open real-time production-control applications that currently remain out of reach.
The proposed working definition can support consistent evaluation criteria across manufacturing and process industries.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Initial industrial deployments may succeed fastest when limited to monitoring and assistance tasks where the capability gains are clearest.
Hybrid designs that combine foundation models with established industrial control loops could address both the negotiation deficit and the latency limitation.
Collecting domain-specific industrial datasets could directly target the data-scarcity limitation identified in the review.
Longitudinal studies tracking the same systems over time would reveal whether the current maturity distribution improves as more field data becomes available.

Load-bearing premise

The 88 publications obtained through PRISMA screening form a representative and unbiased sample of the field and the coding scheme consistently measures capabilities and limitations across heterogeneous papers.

What would settle it

A follow-up survey that locates substantially more deployed foundation-model agent systems in live industrial settings or that reports a materially different capability profile would falsify the maturity assessment and the reported differences from conventional agents.

Original abstract

Foundation models, particularly large language models, are increasingly integrated into agent architectures for industrial tasks such as decision support, process monitoring, and engineering automation. Yet evidence on their purposes, capabilities, and limitations remains fragmented across domains. This work examines how mature foundation-model-based agent systems are in industrial contexts, how their functional profile differs from conventional agent systems, and which limitations persist. A systematic literature survey following the PRISMA 2020 guideline is presented, screening 2,341 publications and synthesising a corpus of 88 publications through a structured coding scheme. The results show that reported systems are predominantly at prototype and early validation stages (75.0% at TRL 4-6), with deployment-oriented evidence remaining rare (9.1%). Operational goals are most frequently positioned in user assistance, monitoring, and process optimisation, while conventional production-control purposes such as planning and scheduling are less prominent. Compared with an established baseline for industrial agent systems, the capability profile reveals substantial gains in human interaction (+37%) and dealing with uncertainty (+35%), but a pronounced deficit in negotiation (-39%). The most widely reported limitations concern lack of generalization, hallucination and output instability, data scarcity, and inference latency. A working definition of foundation-model-based industrial agents is also proposed, bridging conventional agent theory, automation-engineering standards, and the foundation-model paradigm.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

This review maps the state of foundation-model agents in industry well enough, but its capability comparisons need more on coding reliability.

The punchline is that this is a useful literature synthesis on foundation-model-based agents for industrial tasks, but the reported capability differences rest on coding that lacks visible validation. The paper screens 2341 publications down to 88 and applies a structured scheme to extract TRL levels, goals, capabilities, and limitations. It finds 75% at TRL 4-6, rare deployment, emphasis on assistance and monitoring, gains in interaction and uncertainty handling versus a conventional baseline, and shortfalls in negotiation. The limitations section on generalization, hallucinations, data issues, and latency matches what many in the field are seeing. Offering a working definition is a small but helpful addition to pin down the concept.

What it does well is follow PRISMA with explicit counts and provide a side-by-side capability view. That gives readers a consolidated picture without having to hunt through dozens of papers themselves.

The soft spot is the quantitative capability deltas. Those percentages come from subjective mapping of paper content to categories like human interaction or negotiation. Without reported inter-rater reliability or detailed coding rules in the methods, the +37% and -39% figures are harder to trust at face value. The baseline for comparison also matters a lot here. If the full paper includes more on how the coding was done consistently, that concern shrinks.

This paper is for applied researchers and practitioners in industrial automation looking for an overview of the state of the art and open issues. It won't change the field on its own, but it organizes the evidence in a way that makes the gaps visible. I would send this to peer review. The core synthesis holds up, and referees can help tighten the parts around the coding scheme and baseline.

Referee Report

2 major / 2 minor

Summary. This manuscript presents a PRISMA 2020-guided systematic literature review examining foundation-model-based agents in industrial automation. It screens 2,341 publications down to a corpus of 88 papers analyzed via a structured coding scheme. The central claims are that reported systems are predominantly at prototype/early validation stages (75% at TRL 4-6, only 9.1% deployment-oriented), operational goals emphasize user assistance/monitoring/optimization over traditional planning/scheduling, capability profiles show gains versus a conventional-agent baseline in human interaction (+37%) and uncertainty handling (+35%) but a deficit in negotiation (-39%), and the most common limitations are lack of generalization, hallucination/output instability, data scarcity, and inference latency. A bridging definition of such agents is also proposed.

Significance. If the coding scheme proves reliable and the corpus representative, the work would provide a timely, structured snapshot of an emerging subfield at the intersection of foundation models and industrial automation. The quantitative TRL distribution, explicit comparison of capability deltas against an established baseline, and enumeration of persistent limitations offer actionable guidance for prioritizing research directions. The proposed definition that integrates agent theory, automation standards, and the foundation-model paradigm is a constructive contribution that could help standardize terminology.

major comments (2)

[Methods] Methods section (description of the structured coding scheme): The reported capability-profile deltas (+37% human interaction, +35% uncertainty handling, -39% negotiation) and the TRL/limitation tallies are derived from applying the coding scheme to the final 88 papers. The manuscript provides no inter-rater reliability statistics, explicit coding guidelines, or multiple-coder protocol. Without these, the signed differences remain sensitive to individual interpretation and possible baseline mismatch, directly affecting the reliability of the central comparative claims.
[Abstract and Methods] Abstract and Methods section: Key PRISMA elements are omitted or insufficiently detailed, including the precise search strings, database selection criteria, and any inter-coder agreement metrics. These omissions hinder assessment of selection bias and reproducibility of the 88-paper corpus, which underpins all quantitative findings on maturity, goals, and limitations.

minor comments (2)

[Results] The PRISMA flow diagram (presumably Figure 1) would be clearer if exclusion reasons were quantified at each screening stage rather than summarized only in aggregate.
[Results] Tables reporting percentages (e.g., TRL distribution, capability frequencies) should include the corresponding absolute counts (n) alongside percentages to facilitate interpretation of small-sample effects.
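
The request for absolute counts can be illustrated with a short back-calculation. The sketch below is not taken from the paper: it assumes the headline percentages refer to the full corpus of 88 publications, rounds to the nearest whole paper, and adds a Wilson score interval as one common way to make small-sample uncertainty visible. The function names are hypothetical.

```python
import math
from typing import Tuple

CORPUS_SIZE = 88  # number of coded publications stated in the review


def implied_count(percent: float, n: int = CORPUS_SIZE) -> int:
    """Back-compute the nearest whole-paper count for a reported percentage.

    Assumption: the percentage was computed over all n papers.
    """
    return round(percent / 100 * n)


def wilson_interval(k: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for the proportion k/n (z = 1.96 for roughly 95%)."""
    p = k / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    # Percentages as reported in the abstract; counts are inferred, not reported.
    for label, pct in [("TRL 4-6", 75.0), ("deployment-oriented", 9.1)]:
        k = implied_count(pct)
        low, high = wilson_interval(k, CORPUS_SIZE)
        print(f"{label}: n = {k}/{CORPUS_SIZE} ({pct}%), "
              f"approx. 95% interval {low:.1%} to {high:.1%}")
```

Under these assumptions, 75.0% corresponds to 66 of 88 papers and 9.1% to 8 of 88, which is why reporting n next to each percentage helps readers judge how much weight the smaller categories can bear.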

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their thoughtful and constructive review of our systematic literature survey on foundation-model-based agents in industrial automation. The comments highlight key areas for improving methodological transparency, which we address point by point below. We indicate where revisions will be incorporated to strengthen the manuscript while maintaining the integrity of our findings.

Referee: [Methods] Methods section (description of the structured coding scheme): The reported capability-profile deltas (+37% human interaction, +35% uncertainty handling, -39% negotiation) and the TRL/limitation tallies are derived from applying the coding scheme to the final 88 papers. The manuscript provides no inter-rater reliability statistics, explicit coding guidelines, or multiple-coder protocol. Without these, the signed differences remain sensitive to individual interpretation and possible baseline mismatch, directly affecting the reliability of the central comparative claims.

Authors: We appreciate this observation on methodological rigor. The coding scheme was iteratively developed by the author team drawing from established agent taxonomies (e.g., Wooldridge, Russell & Norvig) and industrial automation standards (e.g., ISA-95, RAMI 4.0), with each category accompanied by explicit inclusion rules and examples to reduce ambiguity. Coding of the 88 papers was led by the first author, with co-authors performing independent spot-checks on approximately 20% of the corpus for consistency on TRL assignments, goals, capabilities, and limitations. We agree that full documentation is needed. In the revised manuscript, we will expand the Methods section to include a detailed description of the coding protocol, add an appendix with the complete coding guidelines and category definitions, and explicitly state the single-primary-coder approach with validation steps. We will also clarify how the baseline comparison to conventional agents was aligned with prior surveys to minimize mismatch. However, as the review was not designed with multiple independent coders from the outset, retrospective inter-rater reliability metrics (e.g., Cohen’s kappa) cannot be computed.

Revision: partial

Referee: [Abstract and Methods] Abstract and Methods section: Key PRISMA elements are omitted or insufficiently detailed, including the precise search strings, database selection criteria, and any inter-coder agreement metrics. These omissions hinder assessment of selection bias and reproducibility of the 88-paper corpus, which underpins all quantitative findings on maturity, goals, and limitations.

Authors: We acknowledge that greater detail on the PRISMA 2020 process would improve reproducibility. The current Methods section outlines the overall screening (2,341 publications to 88) and PRISMA adherence but omits the exact Boolean search strings and database list for brevity. We will revise the Methods section to provide the complete search strings (combinations of terms such as “foundation model” OR “large language model” AND “agent” AND (“industrial automation” OR “manufacturing” OR “process control”)), the databases queried (IEEE Xplore, ACM Digital Library, Scopus, Web of Science), the date range, and the full inclusion/exclusion criteria. A PRISMA flow diagram will be added or expanded. The abstract will be updated to reference these enhancements where space permits. Inter-coder aspects are addressed in the response to the first comment. These changes will directly support evaluation of selection bias and corpus representativeness.

Revision: yes
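
To make the operator grouping in the quoted search terms explicit, the following minimal sketch assembles a generic Boolean query from the three term groups named in the response. The parenthesisation of the first OR group is an assumption, since the response leaves it implicit, and the helper names are hypothetical. Field-specific syntax of individual databases (IEEE Xplore, Scopus, and so on) is not modelled.

```python
from typing import Iterable, List


def or_group(terms: Iterable[str]) -> str:
    """Quote each term and join the group with OR inside parentheses."""
    quoted = [f'"{term}"' for term in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*groups: List[str]) -> str:
    """Join parenthesised OR groups with AND so precedence is unambiguous."""
    return " AND ".join(or_group(group) for group in groups)


# Term groups as quoted in the simulated response; the grouping is assumed.
MODEL_TERMS = ["foundation model", "large language model"]
AGENT_TERMS = ["agent"]
DOMAIN_TERMS = ["industrial automation", "manufacturing", "process control"]

query = build_query(MODEL_TERMS, AGENT_TERMS, DOMAIN_TERMS)

if __name__ == "__main__":
    print(query)
```

Writing every group in parentheses avoids relying on a database's default precedence between AND and OR, which is one of the details a reproducible search protocol would need to state.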

standing simulated objections not resolved

Full inter-rater reliability statistics (e.g., Cohen’s kappa) cannot be provided, as the structured coding was performed primarily by a single researcher with co-author spot-checks rather than a multi-coder protocol.
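
For readers unfamiliar with the statistic at issue, the sketch below computes Cohen's kappa for two coders from first principles. The label lists are invented for illustration (hypothetical TRL bands for ten papers) and do not come from the paper or its spot-checks; the function name is likewise hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(coder_a: Sequence[Hashable], coder_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: chance-corrected agreement between two coders."""
    if len(coder_a) != len(coder_b) or len(coder_a) == 0:
        raise ValueError("Both coders must label the same non-empty set of items.")
    n = len(coder_a)

    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Expected agreement by chance, from each coder's marginal label frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[label] * freq_b[label] for label in labels) / n**2

    if expected == 1.0:
        # Both coders used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical TRL-band codes for ten papers (illustrative only).
CODER_A = ["4-6", "4-6", "1-3", "7-9", "4-6", "4-6", "1-3", "4-6", "4-6", "7-9"]
CODER_B = ["4-6", "4-6", "1-3", "4-6", "4-6", "4-6", "1-3", "4-6", "1-3", "7-9"]

if __name__ == "__main__":
    print(f"kappa = {cohens_kappa(CODER_A, CODER_B):.3f}")
```

The computation needs two complete, independent label sets for the same items, which is why a single-primary-coder design with spot-checks does not yield the statistic for the whole corpus.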

Circularity Check

0 steps flagged

No circularity: survey tallies drawn directly from external literature corpus

The paper performs a PRISMA-guided systematic review of 2341 publications, selects 88, and applies a structured coding scheme to extract TRL distributions, operational goals, capability deltas versus a cited baseline, and reported limitations. All quantitative results are explicit counts and comparisons from the screened external papers; no derivations, fitted parameters, predictions, or self-citations function as load-bearing premises that reduce to the paper's own inputs. The proposed working definition is presented as a synthesis bridging existing theories rather than a self-referential result. The methodology is self-contained against the external corpus and external standards (PRISMA 2020), yielding a normal non-finding of circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claims rest on the application of the PRISMA 2020 review protocol to a curated corpus of publications; no free parameters are fitted and no new entities are postulated.

axioms (1)

domain assumption: The PRISMA 2020 guideline provides an objective and reproducible method for literature screening and synthesis.
Invoked to justify the screening of 2341 publications down to 88 and the subsequent structured coding.

Reference graph

Works this paper leans on

121 extracted references · 92 canonical work pages · 2 internal anchors

[1]

John Wiley & Sons, Chichester (2002)

Wooldridge, M.: An Introduction to Multi Agent Systems. John Wiley & Sons, Chichester (2002)

2002
[2]

Jour- nal of Intelligent Manufacturing 36, 765–800 (2025) https://doi

Reinpold, L.M., Wagner, L.P., Gehlhoff, F., Ramonat, M., Kilthau, M., Gill, M.S., Reif, J.T., Henkel, V., Scholz, L., Fay, A.: Systematic comparison of software agents and digital twins: differ- ences, similarities, and synergies in industrial production. Jour- nal of Intelligent Manufacturing 36, 765–800 (2025) https://doi. org/10.1007/s10845-023-02278-y...

work page doi:10.1007/s10845-023-02278-y 2025
[3]

IEEE Transac- tions on ComputersC-29(12), 1104–1113 (1980) https://doi.org/ 10.1109/TC.1980.1675516

Smith, R.G.: The contract net protocol: High-level communica- tion and control in a distributed problem solver. IEEE Transac- tions on ComputersC-29(12), 1104–1113 (1980) https://doi.org/ 10.1109/TC.1980.1675516

work page doi:10.1109/tc.1980.1675516 1980
[4]

Science China Information Sciences , author =

Xi, Z., Chen, W., Guo, X., He, W., Ding, Y., Hong, B., Zhang, M., Wang, J., Jin, S., Zhou, E., Zheng, R., Fan, X., Wang, X., Xiong, L., Zhou, Y., Wang, W., Jiang, C., Zou, Y., Liu, X., Yin, Z., Dou, S., Weng, R., Qin, W., Zheng, Y., Qiu, X., Huang, X., Zhang, Q., Gui, T.: The rise and potential of large lan- guage model based agents: a survey. Science Chi...

work page doi:10.1007/s11432-024-4222-0 2025
[5]

Fron- tiers of Computer Science18, 186345 (2024) https://doi.org/10

Wang, L., Ma, C., Feng, X., Zhang, Z., Yang, H., Zhang, J., Chen, Z., Tang, J., Chen, X., Lin, Y., Zhao, W.X., Wei, Z., Wen, J.: A survey on large language model based autonomous agents. Fron- tiers of Computer Science18, 186345 (2024) https://doi.org/10. 1007/s11704-024-40231-1

2024
[6]

Jour- nal of Manufacturing Systems83, 126–133 (2025) https://doi.org/10

Ren, Y., Liu, Y., Ji, T., Xu, X.: Ai agents and agentic ai– navigating a plethora of concepts for future manufacturing. Jour- nal of Manufacturing Systems83, 126–133 (2025) https://doi.org/10. 1016/j.jmsy.2025.08.017

2025
[7]

Artificial Intelligence Review 59(11) (2026) https://doi.org/10

Ali, M.A., Dornaika, F., Charafed- dine, J.: Agentic ai: a compre- hensive survey of architectures, applications, and future direc- tions. Artificial Intelligence Review 59(11) (2026) https://doi.org/10. 1007/s10462-025-11422-4

2026
[8]

IAS-Forschungsberichte, vol

Xia, Y.: Integrating Large Lan- guage Model Agents with Digital Twins for Industrial Autonomous Systems. IAS-Forschungsberichte, vol. 2026,2. Dissertation of Univer- sity of Stuttgart, Shaker Verlag, (2026)

2026
[11]

Large Language Model based Multi-Agents: A Survey of Progress and Challenges

Guo, T., Chen, X., Wang, Y., Chang, R., Pei, S., Chawla, N.V., Wiest, O., Zhang, X.: Large lan- guage model based multi-agents: A survey of progress and chal- lenges. In: Proceedings of the Thirty-Third International Joint Conference on Artificial Intelli- gence (IJCAI-24), Survey Track, pp. 8048–8057 (2024). https://doi. org/10.48550/arXiv.2402.01680

work page internal anchor Pith review doi:10.48550/arxiv.2402.01680 2024
[12]

Cazorla, Cristofer Englund, Markus Tauber, George Nikolakopoulos, and Jose Luis Flores

P´ erez-Cerrolaza, J., Abella, J., Borg, M., Donzella, C., Cerquides, J., Cazorla, F.J., Englund, C., Tauber, M., Nikolakopoulos, G., Flores, J.L.: Artificial intelligence for safety-critical systems in indus- trial and transportation domains: A survey. ACM Computing Surveys 56(7), 176–117640 (2024) https:// doi.org/10.1145/3626314

work page doi:10.1145/3626314 2024
[13]

DIN Media GmbH, Berlin (2018)

VDI/VDE 2653-1 - Multi-agent systems in industrial automation: Fundamentals. DIN Media GmbH, Berlin (2018). https://www. dinmedia.de/de/technische-regel/ vdi-vde-2653-blatt-1/282864028

2018
[14]

From llms to llm- based agents for software engineering: A survey of current, challenges and future,

Jin, H., Huang, L., Cai, H., Yan, J., Li, B., Chen, H.: From LLMs to LLM-based agents for soft- ware engineering: A survey of cur- rent, challenges and future. arXiv preprint (2024) arXiv:2408.02479. Preprint

work page arXiv 2024
[15]

In: arXiv Preprint (2024)

Zhou, J., Lu, Q., Chen, J., Zhu, L., Xu, X., Xing, Z., Harrer, S.: A taxonomy of architecture options for foundation model-based agents: Analysis and decision model. In: arXiv Preprint (2024). https://doi. org/10.48550/arXiv.2408.02920 . Preprint

work page doi:10.48550/arxiv.2408.02920 2024
[16]

Journal of Artificial Intelligence Research84, 29 (2025) https://doi.org/10.1613/ jair.1.18675

Plaat, A., Duijn, M., Stein, N., Preuss, M., Putten, P., Baten- burg, K.J.: Agentic large lan- guage models, a survey. Journal of Artificial Intelligence Research84, 29 (2025) https://doi.org/10.1613/ jair.1.18675

2025
[17]

Robotics and Computer-Integrated Manufactur- ing92, 102883 (2025) https://doi

Zhang, C., Xu, Q., Yu, Y., Zhou, G., Zeng, K., Chang, F., Ding, K.: A survey on potentials, pathways and challenges of large language models in new-generation intelli- gent manufacturing. Robotics and Computer-Integrated Manufactur- ing92, 102883 (2025) https://doi. org/10.1016/j.rcim.2024.102883

work page doi:10.1016/j.rcim.2024.102883 2025
[18]

In: Proceedings of the 31st ACM SIGKDD Conference on Knowl- edge Discovery and Data Mining

Mohammadi, M., Li, Y., Lo, J., Yip, W.: Evaluation and bench- marking of LLM agents: A survey. In: Proceedings of the 31st ACM SIGKDD Conference on Knowl- edge Discovery and Data Mining. ACM, (2025). https://doi.org/10. 1145/3711896.3736570

work page arXiv 2025
[20]

In: Proceedings of the 32nd ACM International Confer- ence on Advances in Geographic Information Systems, pp

Kalantari, S., Wang, Y., Sun, S., Wang, X.: Fleetwiz: An intelli- gent platform for spatio-temporal 24 multi-resource truckload fleet dis- patching. In: Proceedings of the 32nd ACM International Confer- ence on Advances in Geographic Information Systems, pp. 665–668. ACM, (2024). https://doi.org/10. 1145/3678717.3691272

work page arXiv 2024
[21]

In: IFAC- PapersOnLine, vol

Greis, N.P., Cherukuri, H.P., Out- eiro, J.C.M.: Multi-agent systems for manufacturing digital twins: A perspective on agency and large language models. In: IFAC- PapersOnLine, vol. 59, pp. 1612– 1617 (2025). https://doi.org/10. 1016/j.ifacol.2025.09.271

2025
[22]

Advanced Engineer- ing Informatics65, 103888 (2026) https://doi.org/10.1016/j.aei.2025

Zhao, Z., Tang, D., Liu, C., Wang, L., Zhang, Z., Zhu, H., Chen, K., Nie, Q., Ji, Y.: A large language model-based multi-agent manu- facturing system for intelligent shopfloors. Advanced Engineer- ing Informatics65, 103888 (2026) https://doi.org/10.1016/j.aei.2025. 103888

work page doi:10.1016/j.aei.2025 2026
[23]

1080/22348972.2017.1348890

Xie, J., Liu, C.-C.: Multi-agent sys- tems and their applications7(1), 188–197 (2017) https://doi.org/10. 1080/22348972.2017.1348890

work page arXiv 2017
[24]

In: Silva, F.J.G., Pereira, A.B., Campilho, R.D.S.G

Huckert, J.L., Sidorenko, A., Wag- ner, A.: Analysis and assess- ment of multi-agent systems for production planning and control. In: Silva, F.J.G., Pereira, A.B., Campilho, R.D.S.G. (eds.) Flexible Automation and Intelligent Man- ufacturing: Establishing Bridges for More Sustainable Manufactur- ing Systems. Lecture Notes in Mechanical Engineering, pp. 687–
[25]

https://doi.org/10

Springer Nature Switzerland, Cham (2023). https://doi.org/10. 1007/978-3-031-38241-3 77

2023
[26]

PhD thesis, Helmut- Schmidt-Universit¨ at / Universit¨ at der Bundeswehr Hamburg (2023)

Gehlhoff, F.: Agent-based decen- tralised architecture for integrated process planning and schedul- ing of transport and production processes. PhD thesis, Helmut- Schmidt-Universit¨ at / Universit¨ at der Bundeswehr Hamburg (2023). https://doi.org/10.24405/15181

work page doi:10.24405/15181 2023
[27]

The International Journal of Advanced Manufac- turing Technology134, 529–544 (2024) https://doi.org/10.1007/ s00170-024-14112-7

Massouh, B., Danielsson, F., Lennartson, B., Ramasamy, S., Khabbazi, M.: Safe and reconfig- urable manufacturing: safety aware multi-agent control for plug & produce system. The International Journal of Advanced Manufac- turing Technology134, 529–544 (2024) https://doi.org/10.1007/ s00170-024-14112-7

2024
[28]

at – Automatisierungstechnik70(6), 580–598 (2022) https://doi.org/10

Cruz Salazar, L.A., Vogel-Heuser, B.: A CPPS-architecture and work- flow for bringing agent-based tech- nologies as a form of artifi- cial intelligence into practice. at – Automatisierungstechnik70(6), 580–598 (2022) https://doi.org/10. 1515/auto-2022-0008

2022
[29]

Pearson, Hoboken, NJ (2020)

Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach, 4th edn. Pearson, Hoboken, NJ (2020)

2020
[31]

Robotics and Computer-Integrated Manufactur- ing94, 102982 (2025) https://doi

Chen, C., Zhao, K., Leng, J., Liu, C., Fan, J., Zheng, P.: Integrat- ing large language model and dig- ital twins in the context of indus- try 5.0: Framework, challenges 25 and opportunities. Robotics and Computer-Integrated Manufactur- ing94, 102982 (2025) https://doi. org/10.1016/j.rcim.2025.102982

work page doi:10.1016/j.rcim.2025.102982 2025
[32]

In: DTCO and Com- putational Patterning IV, p

Jee, T.K.: Llm-based overlay issue classification and solution opti- mization in semiconductor manu- facturing. In: DTCO and Com- putational Patterning IV, p. 48. SPIE, (2025). https://doi.org/10. 1117/12.3050976

2025
[33]

at - Automatisierungstechnik69(1), 3– 13 (2021) https://doi.org/10.1515/ auto-2020-0131

M¨ uller, M., M¨ uller, T., Talkhestani, B.A., Marks, P., Jazdi, N., Weyrich, M.: Industrial autonomous sys- tems: a survey on definitions, characteristics and abilities. at - Automatisierungstechnik69(1), 3– 13 (2021) https://doi.org/10.1515/ auto-2020-0131

2021
[34]

Theoretical Issues in Ergonomics Science19(4), 406–430 (2018) https://doi.org/ 10.1080/1463922X.2017.1363314 https://doi.org/10.1080/1463922X.2017.1363314

Kaber, D.B.: A conceptual framework of autonomous and automated agents. Theoretical Issues in Ergonomics Science19(4), 406–430 (2018) https://doi.org/ 10.1080/1463922X.2017.1363314 https://doi.org/10.1080/1463922X.2017.1363314

work page doi:10.1080/1463922x.2017.1363314 2018
[35]

The PRISMA 2020 statement: an updated guideline for reporting systematic reviews

Page, M.J., McKenzie, J.E., Bossuyt, P.M., Boutron, I., Hoffmann, T.C., Mulrow, C.D.,et al.: The prisma 2020 statement: an updated guide- line for reporting systematic reviews. BMJ372, 71 (2021) https://doi.org/10.1136/bmj.n71

work page doi:10.1136/bmj.n71 2020
[36]

Technical report, NASA, Office of Space Access and Technology (April 1995)

Mankins, J.C.: Technology readi- ness levels: A white paper. Technical report, NASA, Office of Space Access and Technology (April 1995). Advanced Concepts Office

1995
[37]

Technical report, Plattform Industrie 4.0 (2018)

Bundesministerium f¨ ur Wirtschaft und Energie (BMWi): Fortschrei- bung der anwendungsszenarien der plattform Industrie 4.0. Technical report, Plattform Industrie 4.0 (2018). AG Forschung und Innova- tion. https://www.plattform-i40. de/IP/Redaktion/DE/Downloads/ Publikation/hm-2018-fb-landkarte. pdf

2018
[38]

In: 2025 IEEE International Symposium on Cir- cuits and Systems (ISCAS), pp

Zhang, W., Yao, T., Zhou, F., Jin, H., Liu, J., Wan, Z., Liu, C., Wang, Y., Chai, B., Chen, X.: A con- versational agent based on large language models for fault recovery planning generation. In: 2025 IEEE International Symposium on Cir- cuits and Systems (ISCAS), pp. 1–

2025
[39]

https://doi.org/ 10.1109/iscas56072.2025.11043523

IEEE, (2025). https://doi.org/ 10.1109/iscas56072.2025.11043523

work page doi:10.1109/iscas56072.2025.11043523 2025
[40]

IEEE Transactions on Smart Grid16(4), 3419–3431 (2025) https://doi.org/ 10.1109/tsg.2025.3568226

Yang, X., Lin, C., Liu, H., Wu, W.: Rl2: Reinforce large language model to assist safe reinforcement learning for energy management of active distribution networks. IEEE Transactions on Smart Grid16(4), 3419–3431 (2025) https://doi.org/ 10.1109/tsg.2025.3568226

work page doi:10.1109/tsg.2025.3568226 2025
[41]

In: 2024 Con- ference of Young Researchers in Electrical and Electronic Engineer- ing (ElCon), pp

Arnautov, K.V., Akimov, D.A.: Application of large language mod- els for optimization of electric power system states. In: 2024 Con- ference of Young Researchers in Electrical and Electronic Engineer- ing (ElCon), pp. 314–317. IEEE, (2024). https://doi.org/10.1109/ elcon61730.2024.10468377

work page arXiv 2024
[43]

Energy Engineering122(7), 2767–2800 (2025) https://doi.org/ 10.32604/ee.2025.065600

Gao, Q., Shen, L., Shi, J., Gu, X., Gu, S., Ge, Y., Xie, Y., Zhu, X., Zang, B., Zhang, M., Nazir, M.S., Ji, J.: Transformer-enhanced intelligent microgrid self-healing: Integrating large language mod- els and adaptive optimization for real-time fault detection and recov- ery. Energy Engineering122(7), 2767–2800 (2025) https://doi.org/ 10.32604/ee.2025.065600

work page doi:10.32604/ee.2025.065600 2025
[44]

Preprint (2025)

Badmus, E.O., Sang, P., Sta- moulis, D., Pandey, A.: Power- Chain: A Verifiable Agentic AI System for Automating Distri- bution Grid Analyses. Preprint (2025). https://doi.org/10.48550/ arXiv.2508.17094

work page arXiv 2025
[45]

In: 2024 IEEE 8th Conference on Energy Internet and Energy System Inte- gration (EI2), pp

Ou, P., Wang, Y., Lin, W., Wu, J.: An llm-based modeling and deci- sion optimization for user-centric electric vehicle charging. In: 2024 IEEE 8th Conference on Energy Internet and Energy System Inte- gration (EI2), pp. 4078–4083. IEEE, (2024). https://doi.org/10. 1109/ei264398.2024.10991378

work page arXiv 2024
[46]

Preprint (2024)

Mongaillard, T., Lasaulce, S., Hicheur, O., Zhang, C., Bariah, L., Varma, V.S., Zou, H., Zhao, Q., Debbah, M.: Large Language Models for Power Scheduling: A User-Centric Approach. Preprint (2024). https://doi.org/10.48550/ arxiv.2407.00476

work page arXiv 2024
[47]

Energies17(8), 1935 (2024) https: //doi.org/10.3390/en17081935

Matharaarachchi, A., Mendis, W., Randunu, K., Silva, D.D., Gamage, G., Moraliyage, H., Mills, N., Jennings, A.: Opti- mizing generative ai chatbots for net-zero emissions energy internet-of-things infrastructure. Energies17(8), 1935 (2024) https: //doi.org/10.3390/en17081935

work page doi:10.3390/en17081935 1935
[48]

In: 2022 15th Interna- tional Conference on Human Sys- tem Interaction (HSI), pp

Gamage, G., Mills, N., Rathnayaka, P., Jennings, A., Alahakoon, D.: Cooee: An artificial intelligence chatbot for complex energy envi- ronments. In: 2022 15th Interna- tional Conference on Human Sys- tem Interaction (HSI), pp. 1–5. IEEE, (2022). https://doi.org/10. 1109/hsi55341.2022.9869464

work page arXiv 2022
[49]

Wang, Z., Liu, Z., Zhang, Y., Zhong, A., Wang, J., Yin, F., Fan, L., Wu, L., Wen, Q.: Rcagent: Cloud root cause analysis by autonomous agents with tool-augmented large language models. In: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, pp. 4966–4974. ACM, (2024). https://doi.org/10.1145/3627673.3680016
[50]

Patel, D., Lin, S., Rayfield, J., Zhou, N., Vaculin, R., Martinez, N., O’donncha, F., Kalagnanam, J.: AssetOpsBench: Benchmarking AI Agents for Task Automation in Industrial Asset Operations and Maintenance. Preprint (2025). https://doi.org/10.48550/arXiv.2506.03828
[51]

Shi, B., Luo, Y., Wang, J., Zhao, Y., Zhang, S., Hao, B., Zhao, C., Sun, Y., Zhang, Z., Sun, R., Li, H., Song, W., Chen, X., Miao, J., Pei, D.: Flowxpert: Expertizing troubleshooting workflow orchestration with knowledge base and multi-agent coevolution. In: Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2, ...
[52]

Jeong, C., Sim, S., Cho, H., Kim, S., Shin, B.: E2E Process Automation Leveraging Generative AI and IDP-Based Automation Agent: A Case Study on Corporate Expense Processing. Preprint (2025). https://doi.org/10.48550/arXiv.2505.20733
[53]

Paulose, R., Neelanath, V., George, M.: Domain agnostic agentic ai: Enabling autonomous automation with smartgenie copilot. In: 2025 Emerging Technologies for Intelligent Systems (ETIS), pp. 1–6. IEEE, (2025). https://doi.org/10.1109/etis64005.2025.10961403
[54]

Sezgin, A.: Scenario-driven evaluation of autonomous agents: Integrating large language model for uav mission reliability. Drones 9(3), 213 (2025) https://doi.org/10.3390/drones9030213
[55]

Yu, J., Wang, Y., Ma, W.: Large language model-enhanced reinforcement learning for generic bus holding control strategies. Transportation Research Part E: Logistics and Transportation Review 200, 104142 (2025) https://doi.org/10.1016/j.tre.2025.104142
[56]

Xu, Z., Chen, T., Huang, Z., Xing, Y., Chen, S.: Personalizing driver agent using large language models for driving safety and smarter human–machine interactions. IEEE Intelligent Transportation Systems Magazine 17(4), 96–111 (2025) https://doi.org/10.1109/mits.2025.3551736
[57]

Yang, H., Siew, M., Joe-Wong, C.: An LLM-Based Digital Twin for Optimizing Human-in-the Loop Systems. Preprint (2024). https://doi.org/10.48550/arxiv.2403.16809
[58]

Sawada, T., Mizuno, M., Hasegawa, T., Yokoyama, K., Kono, M.: Office-in-the-loop: an investigation into agentic ai for advanced building hvac control systems. Data-Centric Engineering 6, 31 (2025) https://doi.org/10.1017/dce.2025.10010
[59]

Wang, Y., Afzal, M.M., Li, Z., Zhou, J., Feng, C., Guo, S., Quek, T.Q.S.: Large language model as a catalyst: A paradigm shift in base station siting optimization. IEEE Transactions on Cognitive Communications and Networking 11(6), 4313–4327 (2025) https://doi.org/10.1109/tccn.2025.3548615
[60]

Wang, D., Wang, Y., Jiang, X., Zhang, Y., Pang, Y., Zhang, M.: When large language models meet optical networks: Paving the way for automation. Electronics 13(13), 2529 (2024) https://doi.org/10.3390/electronics13132529
[61]

Wanna, S., Parra, F., Valner, R., Kruusamäe, K., Pryor, M.: Unlocking underrepresented use-cases for large language model-driven human-robot task planning. Advanced Robotics 38(18), 1335–1348 (2024) https://doi.org/10.1080/01691864.2024.2366974
[62]

Gamage, G., Mills, N., Silva, D.D., Manic, M., Moraliyage, H., Jennings, A., Alahakoon, D.: Multi-agent rag chatbot architecture for decision support in net-zero emission energy systems. In: 2024 IEEE International Conference on Industrial Technology (ICIT), pp. 1–6. IEEE, (2024). https://doi.org/10.1109/icit58233.2024.10540920
[63]

Chen, R., He, C.: Fostering collective intelligence in cpss: an llm-driven multi-agent cooperative tuning framework. Frontiers in Physics 13, 1613499 (2025) https://doi.org/10.3389/fphy.2025.1613499
[64]

Wang, S., Liang, C., Gao, Y., Liu, Y., Li, J., Wang, H.: Decoding urban industrial complexity: Enhancing knowledge-driven insights via industryscopegpt. In: Proceedings of the 32nd ACM International Conference on Multimedia, pp. 4757–4765. ACM, (2024). https://doi.org/10.1145/3664647.3681705
[65]

Su, J., Cardie, C., Nakov, P.: Adapting fake news detection to the era of large language models. In: Findings of the Association for Computational Linguistics: NAACL 2024, pp. 1473–1490. Association for Computational Linguistics, (2024). https://doi.org/10.18653/v1/2024.findings-naacl.95
[66]

Wu, Y., Wang, H., Zhang, Y., Li, X., Wu, H., Fan, M., Liu, T.: Business compliance detection of smart contracts in electricity and carbon trading scenarios. In: 2024 IEEE 35th International Symposium on Software Reliability Engineering Workshops (ISSREW), pp. 177–178. IEEE, (2024). https://doi.org/10.1109/issrew63542.2024.00074
[67]

Zhou, R., Yang, Y., Wen, M., Wen, Y., Wang, W., Xi, C., Xu, G., Yu, Y., Zhang, W.: Trad: Enhancing llm agents with step-wise thought retrieval and aligned decision. In: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 3–13. ACM, (2024). https://doi.org/10.1145/3626772.3657788
[68]

Mai, K., Ghate, N., Lee, J., Beuran, R.: Llm-based fine-grained abac policy generation. In: Proceedings of the 11th International Conference on Information Systems Security and Privacy, pp. 204–212. SCITEPRESS - Science and Technology Publications, (2025). https://doi.org/10.5220/0013225500003899
[69]

Kalian, A.D., Lee, J., Johannesson, S.P., Otte, L., Hogstrand, C., Guo, M.: Fine-Tuning and Prompt Engineering of LLMs, for the Creation of Multi-Agent AI for Addressing Sustainable Protein Production Challenges. Preprint (2025). https://doi.org/10.48550/arXiv.2506.20598
[70]

Chen, X., Zhang, L.: Revolutionizing Bridge Operation and Maintenance with LLM-based Agents: An Overview of Applications and Insights. Preprint (2024). https://doi.org/10.48550/arxiv.2407.10064
[71]

Kar, I., Ralte, Z., Shivakumara, M., Roy, R., Kumari, A.: Agents are all you need: Elevating trading dynamics with advanced generative ai-driven conversational llm agents and tools. In: 2024 IEEE 9th International Conference for Convergence in Technology (I2CT), pp. 1–. IEEE, (2024). https://doi.org/10.1109/i2ct61223.2024.10543356
[73]

He, Z., Wu, H., Zhang, X., Yao, X., Zheng, S., Zheng, H., Yu, B.: Chateda: A large language model powered autonomous agent for eda. In: 2023 ACM/IEEE 5th Workshop on Machine Learning for CAD (MLCAD), pp. 1–6. IEEE, (2023). https://doi.org/10.1109/mlcad58807.2023.10299852
[74]

Shen, J., Chen, Z., Zhuang, J., Huang, J., Yang, F., Shang, L., Bi, Z., Yan, C., Zhou, D., Zeng, X.: Atelier: An automated analog circuit design framework via multiple large language model-based agents. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 45(1), 31–44 (2026) https://doi.org/10.1109/tcad.2025.3573228
[75]

Chang, C.-C., Ho, C.-T., Li, Y., Chen, Y., Ren, H.: Drc-coder: Automated drc checker code generation using llm autonomous agent. In: Proceedings of the 2025 International Symposium on Physical Design, pp. 143–151. ACM, (2025). https://doi.org/10.1145/3698364.3705347
[76]

Liu, B., Zhang, H., Gao, X., Kong, Z., Tang, X., Lin, Y., Wang, R., Huang, R.: Layoutcopilot: An llm-powered multiagent collaborative framework for interactive analog layout design. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 44(8), 3126–3139 (2025) https://doi.org/10.1109/tcad.2025.3529805
[77]

Wu, H., Zheng, H., He, Z., Yu, B.: Divergent Thoughts toward One Goal: LLM-based Multi-Agent Collaboration System for Electronic Design Automation. Preprint (2025). https://doi.org/10.48550/arxiv.2502.10857
[78]

Lykov, A., Dronova, M., Naglov, N., Litvinov, M., Satsevich, S., Bazhenov, A., Berman, V., Shcherbak, A., Tsetserukou, D.: LLM-MARS: Large Language Model for Behavior Tree Generation and NLP-enhanced Dialogue in Multi-Agent Robot Systems. Preprint (2023). https://doi.org/10.48550/arxiv.2312.09348
[79]

Xia, Y., Dittler, D., Jazdi, N., Chen, H., Weyrich, M.: LLM experiments with simulation: Large Language Model Multi-Agent System for Simulation Model Parametrization in Digital Twins. Preprint (2024). https://doi.org/10.48550/arxiv.2405.18092
[80]

Kim, S., Yu, Y., Seo, H.: Artificial intelligence orchestration for text-based ultrasonic simulation via self-review by multi-large language model agents. Scientific Reports 15(1), 12474 (2025) https://doi.org/10.1038/s41598-025-97498-y
[81]

Dong, Z., Lu, Z., Yang, Y.: Fine-tuning a large language model for automating computational fluid dynamics simulations. Theoretical and Applied Mechanics Letters 15(3), 100594 (2025) https://doi.org/10.1016/j.taml.2025.100594
[82]

Jia, M., Cui, Z., Hug, G.: Enhancing llms for power system simulations: A feedback-driven multi-agent framework. IEEE Transactions on Smart Grid 16(6), 5556–5572 (2025) https://doi.org/10.1109/tsg.2025.3589114
[83]

Liu, J., Lin, F., Li, X., Lim, K.H., Zhao, S.: Physics-Informed Autonomous LLM Agents for Explainable Power Electronics Modulation Design. Preprint (2024). https://doi.org/10.48550/arxiv.2411.14214
[84]

Lin, J., Zhao, D., Lu, S., Li, R., Xu, X., Wang, Z., Li, W., Ji, Y., Zhang, C., Shi, L., Jin, X., Gao, H., Wang, G.: Conversational large-language-model artificial intelligence agent for accelerated synthesis of metal–organic frameworks catalysts in olefin hydrogenation. ACS Nano 19(26), 23840–23858 (2025) https://doi.org/10.1021/acsnano.5c04880
[85]

Ataei, M., Cheong, H., Grandi, D., Wang, Y., Morris, N., Tessier, A.: Elicitron: A framework for simulating design requirements elicitation using large language model agents. In: Volume 3B: 50th Design Automation Conference (DAC), pp. 03–03056. American Society of Mechanical Engineers, (2024). https://doi.org/10.1115/detc2024-143598
