pith. machine review for the scientific record.

arxiv: 2309.07864 · v3 · submitted 2023-09-14 · 💻 cs.AI · cs.CL

Recognition: 3 theorem links

The Rise and Potential of Large Language Model Based Agents: A Survey

Authors on Pith: no claims yet

Pith reviewed 2026-05-11 10:42 UTC · model grok-4.3

classification 💻 cs.AI cs.CL
keywords LLM-based agents · AI agents · large language models · multi-agent systems · agent societies · artificial general intelligence · survey

The pith

Large language models provide a versatile foundation for building AI agents adaptable to diverse scenarios.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This survey traces the origins of the agent concept and argues that LLMs' broad capabilities make them suitable bases for general-purpose agents rather than narrow task-specific systems. It proposes a modular framework with three components that handle reasoning, sensing, and acting, which can be adjusted for different uses. The paper then reviews progress in single-agent applications, multi-agent interactions, human-agent teams, and the social behaviors that arise when many agents interact. A reader would care because the work consolidates early efforts into a map of how current models might scale toward more flexible intelligence without starting from scratch each time.

Core claim

The paper claims that LLMs can serve as foundations for general AI agents because of their demonstrated versatility in reasoning, language, and knowledge. It presents a general framework with brain, perception, and action components that can be tailored for different applications. The survey covers extensive uses in single-agent scenarios for tasks like planning and tool use, multi-agent scenarios for collaboration and competition, and human-agent cooperation. It further examines agent societies for emergent behaviors and personality traits, drawing parallels to human society, and identifies key open problems in the field.

What carries the argument

The brain-perception-action framework for LLM-based agents, which structures the model to handle decision-making, environment input processing, and output actions in a customizable way.
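The three-component split can be sketched as a minimal loop. Everything below is an illustrative stand-in, not an API from the paper: the class and method names are invented, and the `Brain` returns a stub decision where a real agent would call an LLM.

```python
# Minimal sketch of a brain-perception-action agent step.
# All names here are hypothetical; the survey defines the roles, not this code.
from dataclasses import dataclass, field


@dataclass
class Brain:
    """Decision-making core; stands in for an LLM call plus memory."""
    memory: list = field(default_factory=list)

    def decide(self, observation: str) -> str:
        self.memory.append(observation)          # keep a trace of what was seen
        return f"act-on:{observation}"           # stub for an LLM-generated plan


@dataclass
class Perception:
    """Converts raw environment input into text the brain can consume."""
    def observe(self, raw: str) -> str:
        return raw.strip().lower()


@dataclass
class Action:
    """Turns the brain's decision into an effect on the environment."""
    def execute(self, decision: str) -> str:
        return f"executed {decision}"


def agent_step(raw_input: str) -> str:
    # One pass through the loop: sense, decide, act.
    perception, brain, action = Perception(), Brain(), Action()
    obs = perception.observe(raw_input)
    decision = brain.decide(obs)
    return action.execute(decision)


result = agent_step("  Fetch Weather ")  # → "executed act-on:fetch weather"
```

Tailoring the framework for a different application, in this reading, means swapping the body of one component (e.g. a multimodal `Perception`, a tool-calling `Action`) while the loop itself stays fixed.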

If this is right

  • Single-agent systems gain the ability to plan, use tools, and maintain memory for complex individual tasks.
  • Multi-agent setups can simulate cooperation, debate, and competition to solve problems collectively.
  • Human-agent cooperation can boost performance in domains such as coding, decision-making, and creative work.
  • Agent societies may display emergent social phenomena that provide insights into human group dynamics.
  • Mapping open problems helps direct research toward filling gaps in reliability and scalability.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the framework proves robust, development efforts could shift from creating separate models for each task toward reusable agent templates, lowering engineering costs.
  • Simulations of agent societies could become tools for testing social theories or policy ideas before real-world trials.
  • Addressing gaps like consistency in long interactions may require combining the framework with external memory or verification modules not detailed in the survey.
  • Deployment in physical environments would likely need additional layers for safety and grounding that current text-based agents lack.
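The external-memory point above can be made concrete with a toy wrapper. This is a hypothetical sketch, not a mechanism from the survey: the class name is invented, and a simple dropped-turn counter stands in for the LLM-based summarization a real system would use.

```python
# Toy external memory for long interactions: keep the last k turns verbatim
# and note how many earlier turns were dropped (a stand-in for summarization).
from collections import deque


class WindowedMemory:
    """Hypothetical memory module; illustrative only."""

    def __init__(self, k: int = 3):
        self.recent = deque(maxlen=k)  # verbatim window of recent turns
        self.dropped = 0               # turns evicted from the window

    def add(self, turn: str) -> None:
        if len(self.recent) == self.recent.maxlen:
            self.dropped += 1          # this add will evict the oldest turn
        self.recent.append(turn)

    def context(self) -> str:
        # What the agent's brain would see at the next step.
        prefix = f"[{self.dropped} earlier turns summarized] " if self.dropped else ""
        return prefix + " | ".join(self.recent)


mem = WindowedMemory(k=2)
for turn in ["t1", "t2", "t3"]:
    mem.add(turn)
print(mem.context())  # → "[1 earlier turns summarized] t2 | t3"
```

The design choice the sketch illustrates: consistency over long interactions is handled outside the model, so the LLM itself stays unchanged while the wrapper decides what survives into the next context window.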

Load-bearing premise

The demonstrated capabilities of current LLMs are general and sufficient to serve as a reliable starting point for agents that adapt to many different real-world scenarios.

What would settle it

A controlled test in which LLM-based agents built with the proposed framework are asked to adapt to a new class of real-world tasks using only framework adjustments; systematic failure without major new model training or architectural changes would undercut the load-bearing premise.

Original abstract

For a long time, humanity has pursued artificial intelligence (AI) equivalent to or surpassing the human level, with AI agents considered a promising vehicle for this pursuit. AI agents are artificial entities that sense their environment, make decisions, and take actions. Many efforts have been made to develop intelligent agents, but they mainly focus on advancement in algorithms or training strategies to enhance specific capabilities or performance on particular tasks. Actually, what the community lacks is a general and powerful model to serve as a starting point for designing AI agents that can adapt to diverse scenarios. Due to the versatile capabilities they demonstrate, large language models (LLMs) are regarded as potential sparks for Artificial General Intelligence (AGI), offering hope for building general AI agents. Many researchers have leveraged LLMs as the foundation to build AI agents and have achieved significant progress. In this paper, we perform a comprehensive survey on LLM-based agents. We start by tracing the concept of agents from its philosophical origins to its development in AI, and explain why LLMs are suitable foundations for agents. Building upon this, we present a general framework for LLM-based agents, comprising three main components: brain, perception, and action, and the framework can be tailored for different applications. Subsequently, we explore the extensive applications of LLM-based agents in three aspects: single-agent scenarios, multi-agent scenarios, and human-agent cooperation. Following this, we delve into agent societies, exploring the behavior and personality of LLM-based agents, the social phenomena that emerge from an agent society, and the insights they offer for human society. Finally, we discuss several key topics and open problems within the field. A repository for the related papers at https://github.com/WooooDyy/LLM-Agent-Paper-List.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated author's rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript surveys LLM-based agents. It traces agent concepts from their philosophical origins to AI, motivates LLMs as suitable foundations given their versatile capabilities and potential as AGI sparks, and proposes a general three-component framework (brain, perception, action) that can be tailored to applications. It then catalogs applications across single-agent, multi-agent, and human-agent cooperation settings, examines behaviors, personalities, and emergent phenomena in agent societies along with insights for human society, and discusses key topics and open problems, supported by a linked GitHub repository of related papers.

Significance. This survey consolidates a rapidly expanding body of work on LLM-based agents into a coherent structure, providing a descriptive framework that can aid comparison and development of new systems. The coverage of applications and agent societies synthesizes practical progress and broader implications, while the repository offers a concrete, maintainable resource that enhances accessibility and reproducibility of the cited literature.

minor comments (3)
  1. Abstract and framework introduction: the claim that the three-component framework 'can be tailored for different applications' would benefit from an explicit cross-reference or table in the applications sections showing how brain/perception/action are instantiated in at least one single-agent and one multi-agent example.
  2. Agent societies section: the distinction between observed emergent behaviors in LLM agents and the insights claimed for human society should be stated more explicitly to avoid conflating simulation results with direct applicability.
  3. Repository link: the paper could briefly describe the curation criteria or update process for the GitHub list to clarify its scope and maintenance.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their detailed and positive summary of our manuscript, for highlighting its contributions in consolidating the rapidly growing literature on LLM-based agents, and for recommending acceptance. We are pleased that the three-component framework, the coverage of single- and multi-agent applications, agent societies, and the linked GitHub repository were viewed as useful resources for the community.

Circularity Check

0 steps flagged

No significant circularity: survey with no derivations or fitted predictions

full rationale

This manuscript is a literature survey. It traces historical concepts of agents, motivates LLMs via cited external capabilities, proposes a descriptive three-component framework (brain/perception/action) as an organizational lens, catalogs applications, and lists open problems. No equations, no parameter fitting, no predictions that reduce to inputs by construction, and no load-bearing self-citations that substitute for independent evidence. All substantive claims reference prior external work; the AGI-spark narrative is presented as community perspective rather than a derived result internal to the paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The survey rests on the domain assumption that current LLMs possess sufficiently versatile reasoning and generation capabilities to serve as general agent foundations; no free parameters or new invented entities are introduced.

axioms (1)
  • domain assumption Large language models possess versatile capabilities that make them suitable foundations for general AI agents.
    Stated explicitly in the abstract and introduction as the premise for building the survey.

pith-pipeline@v0.9.0 · 5722 in / 1191 out tokens · 44420 ms · 2026-05-11T10:42:46.035355+00:00 · methodology

discussion (0)


Forward citations

Cited by 54 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Revisable by Design: A Theory of Streaming LLM Agent Execution

    cs.LG 2026-04 unverdicted novelty 8.0

    LLM agents achieve greater flexibility during execution by classifying actions via a reversibility taxonomy and using an Earliest-Conflict Rollback algorithm that matches full-restart quality while wasting far less co...

  2. Multi-Dimensional Behavioral Evaluation of Agentic Stock Prediction Systems Using Large Language Model Judges with Closed-Loop Reinforcement Learning Feedback

    cs.LG 2026-05 unverdicted novelty 7.0

    A multi-dimensional behavioral scoring system using LLM judges evaluates agentic forecast processes and closes the loop with RL penalties, yielding an 11.5% MAPE reduction in offline backtests on 2017-2025 data.

  3. Multi-Dimensional Behavioral Evaluation of Agentic Stock Prediction Systems Using Large Language Model Judges with Closed-Loop Reinforcement Learning Feedback

    cs.LG 2026-05 unverdicted novelty 7.0

    LLM judges score agentic stock predictors on six behavioral dimensions; feeding deficiencies back as RL penalties yields 11.5% MAPE reduction, 3-point directional accuracy gain, and 18% Sharpe improvement in offline 2...

  4. Catching the Infection Before It Spreads: Foresight-Guided Defense in Multi-Agent Systems

    cs.AI 2026-05 unverdicted novelty 7.0

    A foresight-based local purification method using multi-persona simulations and recursive diagnosis reduces infectious jailbreak spread in multi-agent systems from over 95% to below 5.47% while matching benign perform...

  5. MAD-OPD: Breaking the Ceiling in On-Policy Distillation via Multi-Agent Debate

    cs.CL 2026-05 unverdicted novelty 7.0

    MAD-OPD recasts on-policy distillation teachers as a debating collective to supply better supervision, lifting agentic and code performance over single-teacher OPD across multiple model sizes.

  6. OCR-Memory: Optical Context Retrieval for Long-Horizon Agent Memory

    cs.CL 2026-04 unverdicted novelty 7.0

    OCR-Memory encodes agent trajectories as images with visual anchors and retrieves verbatim text via locate-and-transcribe, yielding gains on long-horizon benchmarks under strict context limits.

  7. A Systematic Survey of Security Threats and Defenses in LLM-Based AI Agents: A Layered Attack Surface Framework

    cs.CR 2026-04 unverdicted novelty 7.0

    A new 7x4 taxonomy organizes agentic AI security threats by architectural layer and persistence timescale, revealing under-explored upper layers and missing defenses after surveying 116 papers.

  8. Dr.Sai: An agentic AI for real-world physics analysis at BESIII

    hep-ex 2026-04 unverdicted novelty 7.0

    Dr.Sai autonomously executed full physics analysis pipelines on real BESIII data to re-measure ten J/psi decay branching fractions, matching established benchmarks without any manual coding.

  9. Feedback-Driven Execution for LLM-Based Binary Analysis

    cs.CR 2026-04 unverdicted novelty 7.0

    FORGE uses a reasoning-action-observation loop and Dynamic Forest of Agents to perform scalable LLM-based binary analysis, finding 1,274 vulnerabilities across 591 of 3,457 real-world firmware binaries at 72.3% precis...

  10. SAGE: A Service Agent Graph-guided Evaluation Benchmark

    cs.AI 2026-04 unverdicted novelty 7.0

    SAGE is a new multi-agent benchmark that formalizes service SOPs as dynamic dialogue graphs to measure LLM agents on logical compliance and path coverage, uncovering an execution gap and empathy resilience across 27 m...

  11. Model Context Protocol (MCP): Landscape, Security Threats, and Future Research Directions

    cs.CR 2025-03 unverdicted novelty 7.0

    MCP lifecycle is defined with four phases and 16 activities; a threat taxonomy of 16 scenarios is constructed, validated via case studies, and paired with phase-specific safeguards.

  12. CHAL: Council of Hierarchical Agentic Language

    cs.AI 2026-05 unverdicted novelty 6.0

    CHAL is a multi-agent dialectic system that performs structured belief optimization over defeasible domains using Bayesian-inspired graph representations and configurable meta-cognitive value system hyperparameters.

  13. From Controlled to the Wild: Evaluation of Pentesting Agents for the Real-World

    cs.AI 2026-05 unverdicted novelty 6.0

    A practical evaluation protocol for AI pentesting agents that uses validated vulnerability discovery, LLM semantic matching, and bipartite scoring to assess performance in realistic, complex targets.

  14. OPT-BENCH: Evaluating the Iterative Self-Optimization of LLM Agents in Large-Scale Search Spaces

    cs.AI 2026-05 unverdicted novelty 6.0

    OPT-BENCH and OPT-Agent evaluate LLM self-optimization in large search spaces, showing stronger models improve via feedback but stay constrained by base capacity and below human performance.

  15. Unsafe by Flow: Uncovering Bidirectional Data-Flow Risks in MCP Ecosystem

    cs.SE 2026-05 unverdicted novelty 6.0

    MCP-BiFlow detects 93.8% of known bidirectional data-flow vulnerabilities in MCP servers and identifies 118 confirmed issues across 87 real-world servers from a scan of 15,452 repositories.

  16. SOD: Step-wise On-policy Distillation for Small Language Model Agents

    cs.CL 2026-05 unverdicted novelty 6.0

    SOD reweights on-policy distillation strength step-by-step using divergence to stabilize tool use in small language model agents, yielding up to 20.86% gains and 26.13% on AIME 2025 for a 0.6B model.

  17. LoopTrap: Termination Poisoning Attacks on LLM Agents

    cs.CR 2026-05 unverdicted novelty 6.0

    LoopTrap is an automated red-teaming framework that crafts termination-poisoning prompts to amplify LLM agent steps by 3.57x on average (up to 25x) across 8 agents.

  18. Agentic Retrieval-Augmented Generation for Financial Document Question Answering

    cs.AI 2026-05 unverdicted novelty 6.0

    FinAgent-RAG achieves 76.81-78.46% execution accuracy on financial QA benchmarks by combining contrastive retrieval, program-of-thought code generation, and adaptive strategy routing, outperforming baselines by 5.62-9...

  19. Catching the Infection Before It Spreads: Foresight-Guided Defense in Multi-Agent Systems

    cs.AI 2026-05 unverdicted novelty 6.0

    A foresight-based local purification method simulates future agent interactions, detects infections via response diversity across personas, and applies targeted rollback or recursive diagnosis to cut maximum infection...

  20. Catching the Infection Before It Spreads: Foresight-Guided Defense in Multi-Agent Systems

    cs.AI 2026-05 unverdicted novelty 6.0

    FLP uses multi-persona foresight simulation to detect infections via response diversity and applies local purification to reduce maximum cumulative infection rates in multi-agent systems from over 95% to below 5.47%.

  21. From Skill Text to Skill Structure: The Scheduling-Structural-Logical Representation for Agent Skills

    cs.CL 2026-04 unverdicted novelty 6.0

    SSL representation disentangles skill scheduling, structure, and logic using an LLM normalizer, improving skill discovery MRR@50 from 0.649 to 0.729 and risk assessment macro F1 from 0.409 to 0.509 over text baselines.

  22. LLM-Steered Power Allocation for Parallel QPSK-AWGN Channels

    cs.IT 2026-04 unverdicted novelty 6.0

    LLM interprets natural-language policies to steer a projected-gradient power allocator in 8 parallel QPSK-AWGN channels, producing policy-dependent allocations and 60% lower mutual-information spread after abrupt chan...

  23. When Agents Go Quiet: Output Generation Capacity and Format-Cost Separation for LLM Document Synthesis

    cs.AI 2026-04 unverdicted novelty 6.0

    LLM agents avoid output stalling and reduce generation tokens by 48-72% via deferred template rendering guided by Output Generation Capacity and a Format-Cost Separation Theorem.

  24. Numerical Instability and Chaos: Quantifying the Unpredictability of Large Language Models

    cs.AI 2026-04 unverdicted novelty 6.0

    Large language models display three universal scale-dependent regimes of behavior—stable, chaotic, and signal-dominated—driven by floating-point rounding errors that produce an avalanche effect in early layers.

  25. Policy-Invisible Violations in LLM-Based Agents

    cs.AI 2026-04 unverdicted novelty 6.0

    LLM agents commit policy-invisible violations when policy facts are hidden from their context; a graph-simulation enforcer reaches 93% accuracy vs 68.8% for content-only baselines on a new 600-trace benchmark.

  26. Generative AI Agent Empowered Power Allocation for HAP Propulsion and Communication Systems

    cs.NI 2026-04 unverdicted novelty 6.0

    A generative AI agent creates a realistic HAP propulsion power model including aerodynamic interference and enables a Q3E beamforming algorithm that improves QoS and energy efficiency.

  27. Your Agent, Their Asset: A Real-World Safety Analysis of OpenClaw

    cs.CR 2026-04 conditional novelty 6.0

    Poisoning any single CIK dimension of an AI agent raises average attack success rate from 24.6% to 64-74% across models, and tested defenses leave substantial residual risk.

  28. Agentless: Demystifying LLM-based Software Engineering Agents

    cs.SE 2024-07 conditional novelty 6.0

    Agentless, a basic three-phase LLM pipeline for bug localization, repair, and validation, outperforms complex open-source agents on SWE-bench Lite with 32% success rate at $0.70 cost.

  29. LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code

    cs.SE 2024-03 unverdicted novelty 6.0

    LiveCodeBench collects 400 recent contest problems to create a contamination-free benchmark evaluating LLMs on code generation and related capabilities like self-repair and execution.

  30. Beyond Individual Intelligence: Surveying Collaboration, Failure Attribution, and Self-Evolution in LLM-based Multi-Agent Systems

    cs.AI 2026-05 conditional novelty 5.0

    The survey proposes the LIFE framework to unify fragmented research on collaboration, failure attribution, and self-evolution in LLM multi-agent systems into a progression toward self-organizing intelligence.

  31. The Semantic Training Gap: Ontology-Grounded Tool Architectures for Industrial AI Agent Systems

    cs.AI 2026-05 unverdicted novelty 5.0

    Ontology-grounded tool architectures eliminate hallucination of domain identifiers in industrial AI agents by enforcing semantic constraints through a typed relational configuration and three-operation interface.

  32. Grounding Multi-Hop Reasoning in Structural Causal Models via Group Relative Policy Optimization

    cs.AI 2026-05 unverdicted novelty 5.0

    SCM-GRPO grounds multi-hop fact verification in structural causal models and applies GRPO reinforcement learning to optimize reasoning chain length, outperforming baselines on HoVer and EX-FEVER.

  33. Think Before You Act -- A Neurocognitive Governance Model for Autonomous AI Agents

    cs.AI 2026-04 unverdicted novelty 5.0

    A neurocognitive governance model formalizes a Pre-Action Governance Reasoning Loop that consults global, workflow, agent, and situational rules before each action, yielding 95% compliance accuracy with zero false esc...

  34. KISS Sorcar: A Stupidly-Simple General-Purpose and Software Engineering AI Assistant

    cs.SE 2026-04 unverdicted novelty 5.0

    KISS Sorcar introduces a simple layered agent framework and VS Code IDE that reaches 62.2% pass rate on Terminal Bench 2.0 by combining ReAct execution, summarization-based continuation, parallel tools, persistent his...

  35. ARMove: Learning to Predict Human Mobility through Agentic Reasoning

    cs.MA 2026-04 unverdicted novelty 5.0

    ARMove is a transferable framework for human mobility prediction that combines agentic LLM reasoning, feature management, and large-small model synergy to outperform baselines on several metrics while improving interp...

  36. Safe and Policy-Compliant Multi-Agent Orchestration for Enterprise AI

    cs.AI 2026-04 unverdicted novelty 5.0

    CAMCO enforces policy constraints on multi-agent AI at deployment time via convex projection, risk-weighted Lagrangian shaping, and bounded-convergence negotiation, yielding zero violations and 92-97% utility in teste...

  37. Layered Mutability: Continuity and Governance in Persistent Self-Modifying Agents

    cs.AI 2026-04 unverdicted novelty 5.0

    Layered mutability framework claims governance difficulty in persistent self-modifying agents rises with rapid mutation, strong downstream coupling, weak reversibility, and low observability, producing compositional d...

  38. Layered Mutability: Continuity and Governance in Persistent Self-Modifying Agents

    cs.AI 2026-04 unverdicted novelty 5.0

    Persistent self-modifying AI agents exhibit compositional drift from mismatches across five mutability layers, with governance difficulty rising under rapid mutation, strong coupling, weak reversibility, and low obser...

  39. Agent Mentor: Framing Agent Knowledge through Semantic Trajectory Analysis

    cs.AI 2026-04 unverdicted novelty 5.0

    Agent Mentor analyzes semantic trajectories in agent logs to identify undesired behaviors and derives corrective prompt instructions, yielding measurable accuracy gains on benchmark tasks across three agent setups.

  40. Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering

    cs.SE 2026-04 accept novelty 5.0

    LLM agent progress depends on externalizing cognitive functions into memory, skills, protocols, and harness engineering that coordinates them reliably.

  41. AgentOpt v0.1 Technical Report: Client-Side Optimization for LLM-Based Agent

    cs.LG 2026-04 unverdicted novelty 5.0

    AgentOpt introduces a framework-agnostic package that uses algorithms like UCB-E to find cost-effective model assignments in multi-step LLM agent pipelines, cutting evaluation budgets by 62-76% while maintaining near-...

  42. LanG -- A Governance-Aware Agentic AI Platform for Unified Security Operations

    cs.CR 2026-04 unverdicted novelty 5.0

    LanG presents a governance-aware agentic AI platform for unified security operations that reports strong performance on incident correlation, rule generation, attack reconstruction, and AI safety guardrails in an open...

  43. AIVV: Neuro-Symbolic LLM Agent-Integrated Verification and Validation for Trustworthy Autonomous Systems

    cs.AI 2026-04 unverdicted novelty 5.0

    AIVV deploys LLM agents in a council to semantically validate anomalies in time-series data against natural-language requirements, automating human-in-the-loop verification for autonomous systems.

  44. EconAI: Dynamic Persona Evolution and Memory-Aware Agents in Evolving Economic Environments

    cs.MA 2026-05 unverdicted novelty 4.0

    EconAI adds memory weighting and economic sentiment indexing to LLM agents so they adapt short-term actions to long-term goals inside a single macro/micro simulation loop.

  45. A Multi-Agent Orchestration Framework for Venture Capital Due Diligence

    cs.MA 2026-05 unverdicted novelty 4.0

    A multi-agent orchestration framework automates VC due diligence using LLMs, web retrieval, and a programmatic pipeline to extract and parse official Greek business registry filings while flagging data gaps.

  46. From Storage to Experience: A Survey on the Evolution of LLM Agent Memory Mechanisms

    cs.AI 2026-05 unverdicted novelty 4.0

    LLM agent memory is organized into Storage (preserving trajectories), Reflection (refining them), and Experience (abstracting into reusable knowledge) stages driven by needs for long-range consistency, dynamic adaptat...

  47. Grounding Multi-Hop Reasoning in Structural Causal Models via Group Relative Policy Optimization

    cs.AI 2026-05 unverdicted novelty 4.0

    The SCM-GRPO framework models multi-hop fact verification as causal inference and applies reinforcement learning to optimize reasoning depth, reporting outperformance on HoVer and EX-FEVER.

  48. Vibe Medicine: Redefining Biomedical Research Through Human-AI Co-Work

    cs.AI 2026-04 unverdicted novelty 4.0

    Vibe Medicine proposes directing AI agents via natural language for end-to-end biomedical workflows using LLMs, agent frameworks, and a curated collection of over 1,000 medical skills.

  49. Governance-Aware Agent Telemetry for Closed-Loop Enforcement in Multi-Agent AI Systems

    cs.MA 2026-04 unverdicted novelty 4.0

    GAAT is a proposed architecture extending OpenTelemetry with governance schemas, OPA-based detection, graduated enforcement, and trusted provenance to close the observe-but-do-not-act gap in multi-agent systems.

  50. Multi-Agent Collaboration Mechanisms: A Survey of LLMs

    cs.AI 2025-01 unverdicted novelty 4.0

    The survey organizes LLM-based multi-agent collaboration mechanisms into a framework with dimensions of actors, types, structures, strategies, and coordination protocols, reviews applications across domains, and ident...

  51. A Survey on Large Language Models for Code Generation

    cs.CL 2024-06 unverdicted novelty 3.0

    A systematic literature review that organizes recent work on LLMs for code generation into a taxonomy covering data curation, model advances, evaluations, ethics, environmental impact, and applications, with benchmark...

  52. A Survey on Efficient Inference for Large Language Models

    cs.CL 2024-04 accept novelty 3.0

    The paper surveys techniques to speed up and reduce the resource needs of LLM inference, organized by data-level, model-level, and system-level changes, with comparative experiments on representative methods.

  53. A Survey on the Memory Mechanism of Large Language Model based Agents

    cs.AI 2024-04 accept novelty 3.0

    A systematic review of memory designs, evaluation methods, applications, limitations, and future directions for LLM-based agents.

  54. Large Language Models: A Survey

    cs.CL 2024-02 accept novelty 3.0

    The paper surveys key large language models, their training methods, datasets, evaluation benchmarks, and future research directions in the field.

Reference graph

Works this paper leans on

298 extracted references · 298 canonical work pages · cited by 49 Pith papers · 32 internal anchors

  1. [1]

    Russell, S. J. Artificial intelligence a modern approach. Pearson Education, Inc., 2010

  2. [2]

    Diderot’s early philosophical works

    Diderot, D. Diderot’s early philosophical works. 4. Open Court, 1911

  3. [3]

    Turing, A. M. Computing machinery and intelligence. Springer, 2009

  4. [4]

    Wooldridge, M. J., N. R. Jennings. Intelligent agents: theory and practice. Knowl. Eng. Rev., 10(2):115–152, 1995

  5. [5]

Schlosser, M. Agency. In E. N. Zalta, ed., The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, Winter 2019 edn., 2019

  6. [6]

    Agha, G. A. Actors: a Model of Concurrent Computation in Distributed Systems (Parallel Processing, Semantics, Open, Programming Languages, Artificial Intelligence). Ph.D. thesis, University of Michigan, USA, 1985

  7. [7]

Software agents: A review

    Green, S., L. Hurst, B. Nangle, et al. Software agents: A review. Department of Computer Science, Trinity College Dublin, Tech. Rep. TCS-CS-1997-06, 1997

  8. [8]

    Genesereth, M. R., S. P. Ketchpel. Software agents. Commun. ACM, 37(7):48–53, 1994

  9. [9]

    Formalizing properties of agents

    Goodwin, R. Formalizing properties of agents. J. Log. Comput., 5(6):763–781, 1995

  10. [10]

Developing intelligent agent systems: A practical guide

Padgham, L., M. Winikoff. Developing intelligent agent systems: A practical guide. John Wiley & Sons, 2005

  11. [11]

    Agent oriented programming

Shoham, Y. Agent oriented programming. In M. Masuch, L. Pólos, eds., Knowledge Representation and Reasoning Under Uncertainty, Logic at Work [International Conference Logic at Work, Amsterdam, The Netherlands, December 17-19, 1992], vol. 808 of Lecture Notes in Computer Science, pages 123–129. Springer, 1992

  12. [12]

    Universal artificial intelligence: Sequential decisions based on algorithmic probability

    Hutter, M. Universal artificial intelligence: Sequential decisions based on algorithmic probability. Springer Science & Business Media, 2004

  13. [13]

Fikes, R., N. J. Nilsson. STRIPS: A new approach to the application of theorem proving to problem solving. In D. C. Cooper, ed., Proceedings of the 2nd International Joint Conference on Artificial Intelligence. London, UK, September 1-3, 1971, pages 608–620. William Kaufmann, 1971

  14. [14]

Sacerdoti, E. D. Planning in a hierarchy of abstraction spaces. In N. J. Nilsson, ed., Proceedings of the 3rd International Joint Conference on Artificial Intelligence. Stanford, CA, USA, August 20-23, 1973, pages 412–422. William Kaufmann, 1973

  15. [15]

    Brooks, R. A. Intelligence without representation. Artificial intelligence, 47(1-3):139–159, 1991

  16. [16]

    Designing autonomous agents: Theory and practice from biology to engineering and back

    Maes, P. Designing autonomous agents: Theory and practice from biology to engineering and back. MIT press, 1990

  17. [17]

    Reinforcement learning agents

    Ribeiro, C. Reinforcement learning agents. Artificial intelligence review, 17:223–250, 2002

  18. [18]

    Kaelbling, L. P., M. L. Littman, A. W. Moore. Reinforcement learning: A survey. Journal of artificial intelligence research, 4:237–285, 1996

  19. [19]

Guha, R. V., D. B. Lenat. Enabling agents to work together. Communications of the ACM, 37(7):126–142, 1994

  20. [20]

An architecture for intelligent reactive systems

    Kaelbling, L. P., et al. An architecture for intelligent reactive systems. Reasoning about actions and plans, pages 395–410, 1987

  21. [21]

    Sutton, R. S., A. G. Barto. Reinforcement learning: An introduction. MIT Press, 2018

  22. [22]

    Park, J. S., J. C. O’Brien, C. J. Cai, et al. Generative agents: Interactive simulacra of human behavior. CoRR, abs/2304.03442, 2023

  23. [23]

    Wang, Z., G. Zhang, K. Yang, et al. Interactive natural language processing. CoRR, abs/2305.13246, 2023

  24. [24]

    Ouyang, L., J. Wu, X. Jiang, et al. Training language models to follow instructions with human feedback. In NeurIPS. 2022

  25. [25]

    OpenAI. GPT-4 technical report. CoRR, abs/2303.08774, 2023

  26. [26]

    Wei, J., Y. Tay, R. Bommasani, et al. Emergent abilities of large language models. Trans. Mach. Learn. Res., 2022

  27. [27]

    Liu, R., R. Yang, C. Jia, et al. Training socially aligned language models in simulated human society. CoRR, abs/2305.16960, 2023

  28. [28]

    Sumers, T. R., S. Yao, K. Narasimhan, et al. Cognitive architectures for language agents. CoRR, abs/2309.02427, 2023

  29. [29]

    Weng, L. LLM-powered autonomous agents. lilianweng.github.io, 2023

  30. [30]

    Bisk, Y., A. Holtzman, J. Thomason, et al. Experience grounds language. In B. Webber, T. Cohn, Y. He, Y. Liu, eds., Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, November 16-20, 2020, pages 8718–. Association for Computational Linguistics, 2020

  31. [31]

  32. [32]

    Bubeck, S., V. Chandrasekaran, R. Eldan, et al. Sparks of artificial general intelligence: Early experiments with GPT-4. CoRR, abs/2303.12712, 2023

  33. [33]

    Anscombe, G. E. M. Intention. Harvard University Press, 2000

  34. [34]

    Davidson, D. Actions, reasons, and causes. The Journal of Philosophy, 60(23):685–700, 1963

  35. [35]

    Davidson, D. Agency. In A. Marras, R. N. Bronaugh, R. W. Binkley, eds., Agent, Action, and Reason, pages 1–37. University of Toronto Press, 1971

  36. [36]

    Dennett, D. C. Précis of the intentional stance. Behavioral and brain sciences, 11(3):495–505, 1988

  37. [37]

    Barandiaran, X. E., E. Di Paolo, M. Rohde. Defining agency: Individuality, normativity, asymmetry, and spatio-temporality in action. Adaptive Behavior, 17(5):367–386, 2009

  38. [38]

    McCarthy, J. Ascribing mental qualities to machines. Stanford University, Computer Science Department, 1979

  39. [39]

    Rosenschein, S. J., L. P. Kaelbling. The synthesis of digital machines with provable epistemic properties. In Theoretical aspects of reasoning about knowledge, pages 83–98. Elsevier, 1986

  40. [40]

    Radford, A., K. Narasimhan, T. Salimans, et al. Improving language understanding by generative pre-training. OpenAI, 2018

  41. [41]

    Radford, A., J. Wu, R. Child, et al. Language models are unsupervised multitask learners. OpenAI blog, 1(8):9, 2019

  42. [42]

    Brown, T. B., B. Mann, N. Ryder, et al. Language models are few-shot learners. In H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, H. Lin, eds., Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual. 2020

  43. [43]

    Lin, C., A. Jaech, X. Li, et al. Limitations of autoregressive models and their alternatives. In K. Toutanova, A. Rumshisky, L. Zettlemoyer, D. Hakkani-Tür, I. Beltagy, S. Bethard, R. Cotterell, T. Chakraborty, Y. Zhou, eds., Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language ...

  44. [44]

    Tomasello, M. Constructing a language: A usage-based theory of language acquisition. Harvard University Press, 2005

  45. [45]

    Bloom, P. How children learn the meanings of words. MIT Press, 2002

  46. [46]

    Zwaan, R. A., C. J. Madden. Embodied sentence comprehension. Grounding cognition: The role of perception and action in memory, language, and thinking, 22, 2005

  47. [47]

    Andreas, J. Language models as agent models. In Y. Goldberg, Z. Kozareva, Y. Zhang, eds., Findings of the Association for Computational Linguistics: EMNLP 2022, Abu Dhabi, United Arab Emirates, December 7-11, 2022, pages 5769–5779. Association for Computational Linguistics, 2022

  48. [48]

    Wong, L., G. Grand, A. K. Lew, et al. From word models to world models: Translating from natural language to the probabilistic language of thought. CoRR, abs/2306.12672, 2023

  49. [49]

    Radford, A., R. Józefowicz, I. Sutskever. Learning to generate reviews and discovering sentiment. CoRR, abs/1704.01444, 2017

  50. [50]

    Li, B. Z., M. I. Nye, J. Andreas. Implicit representations of meaning in neural language models. In C. Zong, F. Xia, W. Li, R. Navigli, eds., Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL/IJCNLP 2021, (Volume 1: Long Papers), Virtual E...

  51. [51]

    Mukhopadhyay, U., L. M. Stephens, M. N. Huhns, et al. An intelligent system for document retrieval in distributed office environments. J. Am. Soc. Inf. Sci., 37(3):123–135, 1986

  52. [52]

    Maes, P. Situated agents can have goals. Robotics Auton. Syst., 6(1-2):49–70, 1990

  53. [53]

    Nilsson, N. J. Toward agent programs with circuit semantics. Tech. rep., 1992

  54. [54]

    Müller, J. P., M. Pischel. Modelling interacting agents in dynamic environments. In Proceedings of the 11th European Conference on Artificial Intelligence, pages 709–713. 1994

  55. [55]

    Brooks, R. A robust layered control system for a mobile robot. IEEE Journal on Robotics and Automation, 2(1):14–23, 1986

  56. [56]

    Brooks, R. A. Intelligence without reason. In The artificial life route to artificial intelligence, pages 25–81. Routledge, 2018

  57. [57]

    Newell, A., H. A. Simon. Computer science as empirical inquiry: Symbols and search. Commun. ACM, 19(3):113–126, 1976

  58. [58]

    Ginsberg, M. L. Essentials of Artificial Intelligence. Morgan Kaufmann, 1993

  59. [59]

    Wilkins, D. E. Practical planning - extending the classical AI planning paradigm. Morgan Kaufmann series in representation and reasoning. Morgan Kaufmann, 1988

  60. [60]

    Shardlow, N. Action and agency in cognitive science. Master's thesis, Department of Psychology, University of Manchester, 1990

  61. [61]

    Sacerdoti, E. D. The nonlinear nature of plans. In Advance Papers of the Fourth International Joint Conference on Artificial Intelligence, Tbilisi, Georgia, USSR, September 3-8, 1975, pages 206–214. 1975

  62. [62]

    Russell, S. J., E. Wefald. Do the right thing: studies in limited rationality. MIT Press, 1991

  63. [63]

    Schoppers, M. Universal plans for reactive robots in unpredictable environments. In J. P. McDermott, ed., Proceedings of the 10th International Joint Conference on Artificial Intelligence. Milan, Italy, August 23-28, 1987, pages 1039–1046. Morgan Kaufmann, 1987

  64. [64]

    Brooks, R. A. A robust layered control system for a mobile robot. IEEE J. Robotics Autom., 2(1):14–23, 1986

  65. [65]

    Minsky, M. Steps toward artificial intelligence. Proceedings of the IRE, 49(1):8–30, 1961

  66. [66]

    Isbell, C., C. R. Shelton, M. Kearns, et al. A social reinforcement learning agent. In Proceedings of the fifth international conference on Autonomous agents, pages 377–384. 2001

  67. [67]

    Watkins, C. J. C. H. Learning from delayed rewards. Ph.D. thesis, King's College, Cambridge, 1989

  68. [68]

    Rummery, G. A., M. Niranjan. On-line Q-learning using connectionist systems, vol. 37. University of Cambridge, Department of Engineering, Cambridge, UK, 1994

  69. [69]

    Tesauro, G., et al. Temporal difference learning and TD-Gammon. Communications of the ACM, 38(3):58–68, 1995

  70. [70]

    Li, Y. Deep reinforcement learning: An overview. arXiv preprint arXiv:1701.07274, 2017

  71. [71]

    Silver, D., A. Huang, C. J. Maddison, et al. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587):484–489, 2016

  72. [72]

    Mnih, V., K. Kavukcuoglu, D. Silver, et al. Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602, 2013

  73. [73]

    Farebrother, J., M. C. Machado, M. Bowling. Generalization and regularization in DQN. CoRR, abs/1810.00123, 2018

  74. [74]

    Zhang, C., O. Vinyals, R. Munos, et al. A study on overfitting in deep reinforcement learning. CoRR, abs/1804.06893, 2018

  75. [75]

    Justesen, N., R. R. Torrado, P. Bontrager, et al. Illuminating generalization in deep reinforcement learning through procedural level generation. arXiv preprint arXiv:1806.10729, 2018

  76. [76]

    Dulac-Arnold, G., N. Levine, D. J. Mankowitz, et al. Challenges of real-world reinforcement learning: definitions, benchmarks and analysis. Mach. Learn., 110(9):2419–2468, 2021

  77. [77]

    Ghosh, D., J. Rahme, A. Kumar, et al. Why generalization in RL is difficult: Epistemic POMDPs and implicit partial observability. In M. Ranzato, A. Beygelzimer, Y. N. Dauphin, P. Liang, J. W. Vaughan, eds., Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 20...

  78. [78]

    Brys, T., A. Harutyunyan, M. E. Taylor, et al. Policy transfer using reward shaping. In G. Weiss, P. Yolum, R. H. Bordini, E. Elkind, eds., Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2015, Istanbul, Turkey, May 4-8, 2015, pages 181–188. ACM, 2015

  79. [79]

    Parisotto, E., J. L. Ba, R. Salakhutdinov. Actor-mimic: Deep multitask and transfer reinforcement learning. arXiv preprint arXiv:1511.06342, 2015

  80. [80]

    Zhu, Z., K. Lin, J. Zhou. Transfer learning in deep reinforcement learning: A survey. CoRR, abs/2009.07888, 2020

Showing first 80 references.