A Technical Taxonomy of LLM Agent Communication Protocols
Pith reviewed 2026-06-26 18:32 UTC · model grok-4.3
The pith
A taxonomy of LLM agent communication protocols identifies five dimensions and shows hybrid payloads with session-state persistence in all sampled agent-to-agent cases.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Following an established iterative method, the authors ran five rounds on nine actively maintained open-source protocols and defined a taxonomy with five dimensions: counterparty, payload, interaction state, discovery mechanism, and schema flexibility. Classification shows that all sampled agent-to-agent protocols combine hybrid payloads with session-state persistence, that most protocols support multiple predefined schemas while two negotiate schemas at runtime, and that decentralized discovery remains rare. From these patterns the authors suggest short-term convergence toward unified agent-to-agent and agent-to-context communication, but long-term evolution toward a federated, layered protocol stack.
What carries the argument
The five taxonomy dimensions (counterparty, payload, interaction state, discovery mechanism, schema flexibility) that classify protocols by how they handle communication targets, data formats, state management, partner location, and schema handling.
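One way to picture the five dimensions is as a small record type with one field per dimension. The sketch below is illustrative only: the dimension names come from the paper, but several enum members (marked "assumed") and the protocol named `ExampleProtocol` are hypothetical, since the full list of characteristics per dimension is not given here.

```python
from dataclasses import dataclass, fields
from enum import Enum


class Counterparty(Enum):
    AGENT_TO_AGENT = "agent-to-agent"
    AGENT_TO_CONTEXT = "agent-to-context"  # tools and data


class Payload(Enum):
    STRUCTURED = "structured"              # assumed member
    NATURAL_LANGUAGE = "natural-language"  # assumed member
    HYBRID = "hybrid"                      # named in the paper; exact mix not specified here


class InteractionState(Enum):
    STATELESS = "stateless"                # assumed member
    SESSION_PERSISTENT = "session-state persistence"


class Discovery(Enum):
    CENTRALIZED = "centralized"            # assumed member
    DECENTRALIZED = "decentralized"


class SchemaFlexibility(Enum):
    SINGLE_FIXED = "single fixed schema"   # assumed member
    MULTIPLE_PREDEFINED = "multiple predefined schemas"
    RUNTIME_NEGOTIATED = "runtime-negotiated schemas"


@dataclass(frozen=True)
class ProtocolProfile:
    """One protocol classified along the five taxonomy dimensions."""
    name: str
    counterparty: Counterparty
    payload: Payload
    interaction_state: InteractionState
    discovery: Discovery
    schema_flexibility: SchemaFlexibility


def as_row(profile: ProtocolProfile) -> dict:
    """Flatten a profile into plain strings, e.g. for a comparison table."""
    row = {}
    for f in fields(profile):
        value = getattr(profile, f.name)
        row[f.name] = value.value if isinstance(value, Enum) else value
    return row


# Hypothetical protocol, not one of the nine sampled in the paper.
example = ProtocolProfile(
    name="ExampleProtocol",
    counterparty=Counterparty.AGENT_TO_AGENT,
    payload=Payload.HYBRID,
    interaction_state=InteractionState.SESSION_PERSISTENT,
    discovery=Discovery.CENTRALIZED,
    schema_flexibility=SchemaFlexibility.MULTIPLE_PREDEFINED,
)

print(as_row(example))
```

Using one enum per dimension keeps the characteristics within a dimension mutually exclusive, which is the usual expectation for a taxonomy of this kind.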
If this is right
- Short-term development will favor protocols that unify agent-to-agent and agent-to-context communication.
- No single protocol will simultaneously maximize versatility, efficiency, and portability.
- The field will evolve toward a federated, layered protocol stack rather than one dominant standard.
- Protocol selection can be guided by the five dimensions to match specific use cases.
- Research gaps remain in areas such as privacy and policy enforcement within these protocols.
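The idea that the five dimensions can guide protocol selection can be sketched as a simple filter over classified protocols. Everything in the catalog below is hypothetical (the names `proto-a`, `proto-b`, `proto-c` and their values are placeholders, not the paper's classifications), and `select_protocols` is an illustrative helper rather than anything the authors provide.

```python
from typing import Dict, List

DIMENSIONS = (
    "counterparty",
    "payload",
    "interaction_state",
    "discovery",
    "schema_flexibility",
)

# Hypothetical catalog; values are placeholders, not results from the paper.
CATALOG: List[Dict[str, str]] = [
    {"name": "proto-a", "counterparty": "agent-to-agent", "payload": "hybrid",
     "interaction_state": "session-persistent", "discovery": "centralized",
     "schema_flexibility": "multiple-predefined"},
    {"name": "proto-b", "counterparty": "agent-to-context", "payload": "structured",
     "interaction_state": "stateless", "discovery": "centralized",
     "schema_flexibility": "multiple-predefined"},
    {"name": "proto-c", "counterparty": "agent-to-agent", "payload": "hybrid",
     "interaction_state": "session-persistent", "discovery": "decentralized",
     "schema_flexibility": "runtime-negotiated"},
]


def select_protocols(catalog: List[Dict[str, str]], **requirements: str) -> List[str]:
    """Return names of protocols whose classification matches every requirement."""
    unknown = set(requirements) - set(DIMENSIONS)
    if unknown:
        raise ValueError(f"unknown taxonomy dimension(s): {sorted(unknown)}")
    return [
        entry["name"]
        for entry in catalog
        if all(entry[dim] == wanted for dim, wanted in requirements.items())
    ]


# Use case: agents that must find each other without a central registry.
print(select_protocols(CATALOG, counterparty="agent-to-agent", discovery="decentralized"))
```

An exact-match filter is the simplest possible reading of "guided by the five dimensions"; a real selection process would likely weigh dimensions against each other rather than require all of them.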
Where Pith is reading between the lines
- If the taxonomy dimensions prove stable, future protocol designers could use them as a checklist to ensure coverage of discovery and schema flexibility.
- The observed trend toward runtime schema negotiation may reduce the need for upfront standardization across different agent ecosystems.
- Extending the taxonomy to include closed-source or enterprise protocols could test whether the patterns of hybrid payloads and persistent sessions hold beyond open-source examples.
- The rarity of decentralized discovery suggests that current systems still rely on central registries, which could create single points of failure in large-scale agent networks.
Load-bearing premise
The nine actively maintained open-source protocols with demonstrable adoption sufficiently represent the broader space of LLM agent communication protocols.
What would settle it
Discovery of one or more widely adopted protocols that use only non-hybrid payloads or lack session-state persistence while still qualifying as agent-to-agent communication.
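The falsification condition above can be expressed as a search for counterexamples in a set of classified protocols. This is a minimal sketch under stated assumptions: the sample data, the `widely_adopted` flag, and the value labels are all hypothetical, and how "widely adopted" would be judged in practice is not specified here.

```python
from typing import Dict, List, Union

Profile = Dict[str, Union[str, bool]]


def find_counterexamples(profiles: List[Profile]) -> List[str]:
    """Names of widely adopted agent-to-agent protocols that break the pattern.

    The pattern is: hybrid payload AND session-state persistence.
    """
    return [
        str(p["name"])
        for p in profiles
        if p["counterparty"] == "agent-to-agent"
        and p["widely_adopted"]
        and not (
            p["payload"] == "hybrid"
            and p["interaction_state"] == "session-persistent"
        )
    ]


# Hypothetical sample; none of these entries come from the paper.
SAMPLE: List[Profile] = [
    {"name": "proto-a", "counterparty": "agent-to-agent", "payload": "hybrid",
     "interaction_state": "session-persistent", "widely_adopted": True},
    {"name": "proto-b", "counterparty": "agent-to-context", "payload": "structured",
     "interaction_state": "stateless", "widely_adopted": True},   # out of scope
    {"name": "proto-x", "counterparty": "agent-to-agent", "payload": "structured",
     "interaction_state": "session-persistent", "widely_adopted": True},
    {"name": "proto-y", "counterparty": "agent-to-agent", "payload": "hybrid",
     "interaction_state": "stateless", "widely_adopted": False},  # not widely adopted
]

print(find_counterexamples(SAMPLE))
```

A single non-empty result would be enough to weaken the "all sampled agent-to-agent protocols" regularity as a field-level claim.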
Original abstract
As large language models (LLMs) advance and multi-agent systems aim to overcome the limits of standalone agents, robust communication protocols are becoming essential infrastructure for distributed agent networks. Nonetheless, the fragmented protocol landscape presents a significant interoperability challenge. This study develops a technical taxonomy to classify and analyze LLM agent communication protocols. Following an established iterative method, we defined the taxonomy's purpose, meta-characteristic, and ending conditions, then performed five iterations, three empirical-to-conceptual and two conceptual-to-empirical, on nine actively maintained open-source protocols with demonstrable adoption. The taxonomy comprises five dimensions: counterparty, payload, interaction state, discovery mechanism, and schema flexibility. Classification reveals recurring architectural patterns: all sampled agent-to-agent protocols combine hybrid payloads with session-state persistence; most protocols support multiple predefined schemas, and two negotiate schemas at runtime, indicating a trend toward schema flexibility; decentralized discovery remains rare. Analysis suggests short-term convergence pressure toward protocols unifying agent-to-agent and agent-to-context (tool and data) communication. Long-term, however, no single protocol is likely to maximize versatility, efficiency, and portability simultaneously. The field will more likely evolve toward a federated, layered protocol stack. The framework guides protocol selection and highlights open research gaps such as privacy and policy enforcement.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper develops a technical taxonomy for LLM agent communication protocols by applying an established iterative taxonomy construction method (five iterations: three empirical-to-conceptual, two conceptual-to-empirical) to nine actively maintained open-source protocols with demonstrable adoption. The resulting taxonomy has five dimensions (counterparty, payload, interaction state, discovery mechanism, schema flexibility). Classification of the sample reveals that all agent-to-agent protocols use hybrid payloads with session-state persistence, most support multiple predefined schemas (with two enabling runtime negotiation), and decentralized discovery is rare. The analysis infers short-term convergence pressure toward unified agent-to-agent and agent-to-context protocols and long-term evolution toward a federated layered stack, while positioning the taxonomy as a guide for protocol selection and a source of research gaps (e.g., privacy, policy enforcement).
Significance. If the taxonomy dimensions and observed patterns are robust, the work supplies a grounded, bottom-up classification framework for a fragmented area of multi-agent systems research. The explicit use of an established iterative method and the absence of fitted parameters or circular derivations are strengths that increase the classification's credibility. The framework could usefully inform protocol design choices and surface under-explored issues such as privacy and policy enforcement.
major comments (2)
- [Methodology / protocol selection description] The section describing protocol selection states that the nine protocols were chosen as 'actively maintained open-source protocols with demonstrable adoption' but supplies no explicit search protocol, inclusion/exclusion criteria, or saturation argument. Because the central claims about recurring architectural patterns and field-level trends (hybrid payloads + session-state persistence in all agent-to-agent cases; trend toward schema flexibility; short-term convergence) rest on the representativeness of this sample, the lack of a documented selection procedure limits the strength of the generalizations.
- [Results / classification patterns] Results section on classification patterns: the claim that 'all sampled agent-to-agent protocols combine hybrid payloads with session-state persistence' is presented as a key regularity, yet the manuscript does not report how borderline cases were resolved across the five iterations or whether any protocol required reclassification after dimension refinement. This detail is load-bearing for the reliability of the pattern and the subsequent convergence inference.
minor comments (2)
- [Methods] The abstract and methods paragraph list the iteration counts and types but would benefit from a compact table summarizing the purpose, meta-characteristic, and ending conditions for each iteration to improve traceability.
- [Taxonomy dimensions] The term 'hybrid payloads' is used repeatedly in the results but receives only a brief definition in the taxonomy dimension section; an expanded example or diagram would aid readers unfamiliar with the protocols.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed review. The two major comments identify genuine opportunities to strengthen methodological transparency and reporting of the iterative classification process. We address each point below and will incorporate revisions in the next version of the manuscript.
Point-by-point responses
- Referee: The section describing protocol selection states that the nine protocols were chosen as 'actively maintained open-source protocols with demonstrable adoption' but supplies no explicit search protocol, inclusion/exclusion criteria, or saturation argument. Because the central claims about recurring architectural patterns and field-level trends (hybrid payloads + session-state persistence in all agent-to-agent cases; trend toward schema flexibility; short-term convergence) rest on the representativeness of this sample, the lack of a documented selection procedure limits the strength of the generalizations.
  Authors: We agree that the absence of an explicit search protocol and documented inclusion/exclusion criteria is a limitation that weakens the strength of the generalizations drawn from the sample. The protocols were identified through a combination of literature review and community knowledge of actively maintained projects with visible adoption (e.g., GitHub stars, integrations, and citations), but this process was not formalized in the manuscript. In revision we will add a dedicated subsection under Methodology that states the practical selection criteria used, lists the specific indicators of adoption considered, and explains how the five-iteration process itself served as an informal saturation check. This addition will not alter the sample but will make the selection rationale reproducible and will qualify the scope of the observed patterns.
  revision: yes
- Referee: Results section on classification patterns: the claim that 'all sampled agent-to-agent protocols combine hybrid payloads with session-state persistence' is presented as a key regularity, yet the manuscript does not report how borderline cases were resolved across the five iterations or whether any protocol required reclassification after dimension refinement. This detail is load-bearing for the reliability of the pattern and the subsequent convergence inference.
  Authors: We accept that the manuscript should have reported the handling of borderline cases and any reclassifications. In the actual execution of the five iterations, no protocol required reclassification after dimension refinement, and the agent-to-agent subset exhibited unambiguous hybrid payloads together with session-state persistence from the second iteration onward; no borderline cases arose that needed explicit resolution rules. We will revise the Results section to include a short paragraph summarizing the iteration outcomes, noting the stability of the key pattern and the absence of reclassifications. This addition will increase transparency without changing the reported classifications.
  revision: yes
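The reclassification question raised in the second exchange could be made auditable by keeping a per-iteration record of each protocol's classification and diffing consecutive iterations. The sketch below assumes such a record exists; the data structure, the function `find_reclassifications`, and the sample history (which deliberately contains one change, unlike the outcome the authors report) are hypothetical.

```python
from typing import Dict, List, Tuple

Classification = Dict[str, str]        # dimension -> assigned value
Iteration = Dict[str, Classification]  # protocol name -> classification
Change = Tuple[int, str, str, str, str]


def find_reclassifications(history: List[Iteration]) -> List[Change]:
    """List (iteration, protocol, dimension, old, new) for every changed value.

    Only dimensions present in both consecutive iterations are compared, so a
    dimension added during refinement does not count as a reclassification.
    """
    changes: List[Change] = []
    for idx in range(1, len(history)):
        previous, current = history[idx - 1], history[idx]
        for protocol in sorted(previous.keys() & current.keys()):
            before, after = previous[protocol], current[protocol]
            for dimension in sorted(before.keys() & after.keys()):
                if before[dimension] != after[dimension]:
                    changes.append(
                        (idx + 1, protocol, dimension, before[dimension], after[dimension])
                    )
    return changes


# Hypothetical three-iteration history for two placeholder protocols.
history: List[Iteration] = [
    {"proto-a": {"counterparty": "agent-to-agent", "payload": "structured"},
     "proto-b": {"counterparty": "agent-to-context", "payload": "structured"}},
    {"proto-a": {"counterparty": "agent-to-agent", "payload": "hybrid"},
     "proto-b": {"counterparty": "agent-to-context", "payload": "structured"}},
    {"proto-a": {"counterparty": "agent-to-agent", "payload": "hybrid",
                 "interaction_state": "session-persistent"},
     "proto-b": {"counterparty": "agent-to-context", "payload": "structured",
                 "interaction_state": "stateless"}},
]

print(find_reclassifications(history))
```

An empty result across all five iterations would correspond to the stability the authors describe; a non-empty one would point to the borderline cases the referee asks about.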
Circularity Check
No circularity: taxonomy constructed bottom-up from direct protocol inspection
Full rationale
The paper develops a taxonomy via an established iterative method applied to nine protocols, defining dimensions (counterparty, payload, interaction state, discovery mechanism, schema flexibility) and classifying observed patterns such as hybrid payloads with session-state persistence. No mathematical derivations, fitted parameters, predictions, or self-citations appear as load-bearing steps in the provided text. The classification reduces directly to the inspected protocols rather than to any prior fitted quantities or author-defined uniqueness theorems. The representativeness concern raised by the skeptic is a question of external validity, not a circular reduction within the derivation chain itself.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: An established iterative taxonomy development method (purpose, meta-characteristic, ending conditions, alternating empirical-conceptual iterations) is appropriate and sufficient for classifying communication protocols.