
arxiv: 2504.01990 · v2 · pith:KNVO3BIUnew · submitted 2025-03-31 · 💻 cs.AI

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

Pith reviewed 2026-05-22 21:37 UTC · model grok-4.3

classification 💻 cs.AI
keywords foundation agents · brain-inspired architectures · modular design · self-enhancement · multi-agent systems · AI safety · cognitive modules · continual learning

The pith

Foundation agents gain structure from modular architectures that map directly onto human brain functions for reasoning, adaptation, collaboration, and safety.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The survey frames advanced agents built on large language models as systems whose components can be organized by direct analogy to brain regions and processes. It divides the field into four linked areas: individual module design for memory, world modeling, goals and emotion; autonomous mechanisms that let agents improve themselves over time; interactions among multiple agents that produce collective behaviors; and built-in safeguards against misuse or failure. If this mapping holds, research can proceed by refining each module while preserving overall alignment with human-like cognition and social needs. Readers would care because the approach supplies a single scaffold for turning scattered advances into coordinated progress toward reliable, evolving agents.

Core claim

Intelligent agents are productively understood through modular, brain-inspired architectures that integrate cognitive science and computational principles, with core components including memory, world modeling, reward processing, goal setting, and emotion; these architectures support self-enhancement via automated optimization, collective intelligence through multi-agent interactions, and safety via intrinsic and extrinsic mitigation strategies.

What carries the argument

Modular, brain-inspired architectures that map cognitive, perceptual, and operational modules onto human brain functionalities.

If this is right

  • Agents can achieve continual learning by autonomously refining capabilities through automated optimization paradigms.
  • Collective intelligence emerges when multiple agents interact, cooperate, and form societal structures.
  • Security threats can be addressed by combining intrinsic mechanisms with extrinsic alignment and robustness techniques.
  • Research efforts can be directed toward harmonizing module-level advances with overall societal benefit.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same modular breakdown could be used to design new benchmarks that test each brain-like function separately rather than end-to-end performance alone.
  • Neuroscience findings on specific brain processes could be translated into concrete module upgrades without requiring full biological fidelity.
  • Real-world deployment pipelines might adopt the four-part structure to audit agents for safety before scaling.

Load-bearing premise

Mapping agent modules onto human brain functionalities supplies a productive organizing framework for future agent research.

What would settle it

A controlled comparison in which non-brain-mapped modular agent designs consistently outperform brain-mapped ones on the same tasks and benchmarks would falsify the framework's utility.

Figures

Figures reproduced from arXiv: 2504.01990 by Bang Liu, Boyan Li, Chenglin Wu, Chi Wang, Dekun Wu, Fengwei Teng, Glen Berseth, Haibo Jin, Haochen Shi, Haohan Wang, Hongzhang Liu, Huan Sun, Huan Zhang, Ian Foster, Jian Pei, Jianyun Nie, Jiaqi Chen, Jiawei Xu, Jiaxuan You, Jiayi Zhang, Jinlin Wang, Jinyu Xiang, Kaitao Song, Kunlun Zhu, Logan Ward, Mingchen Zhuge, Ollie Liu, Peiyan Zhang, Qiang Yang, Qingyun Wu, Shaokun Zhang, Sirui Hong, Suyuchen Wang, Tanjin He, Tianming Liu, Tongliang Liu, Xiangru Tang, Xiaojun Jia, Xiaoliang Qi, Xiaoqiang Wang, Xinbing Liang, Xinfeng Li, Yizhang Lin, Yu Gu, Yuheng Cheng, Yu Su, Yuyu Luo, Zhaoyang Yu.

Figure 1.1: Illustration of key human brain functionalities grouped by major brain regions, annotated…
Figure 1.2: An overview of our general framework for describing an intelligent agent loop and agent society.
Figure 2.1: A taxonomy of research on cognition covering different learning and reasoning paradigms.
Figure 2.2: Learning as optimisation under three competing forces. The mental state…
Figure 2.3: Comparison between full mental state learning and partial mental state learning in intelligent…
Figure 2.4: Three core learning objectives in intelligent agents. From left to right:…
Figure 2.5: Comparison of reasoning paradigms in LLM-based agents.
Figure 2.6: Reasoning as a scheduler-controlled sequence of atomic rewrites. From the mental state…
Figure 3.1: The hierarchical taxonomy of the human memory system.
Figure 3.2: Atkinson-Shiffrin three-stage model of human memory […]
Figure 3.3: Baddeley’s model of working memory [246]. … (verbal) and the visuospatial sketchpad (visual/spatial). A subsequent refinement introduced the episodic buffer to integrate material from these subsystems with long-term memory [247].
Figure 3.4: The Serial-Parallel Independent (SPI) model of human memory […]
Figure 3.5: An abstraction of the most important processes in the ACT-R model […]
Figure 3.6: A taxonomy of selected research works about different memory modules in intelligent agents.
Figure 3.7: Illustration of sensory memory formation in intelligent agents. Multimodal sensory inputs, such…
Figure 3.8: Illustration of short-term and working memory in action. The figure depicts an agent navigating…
Figure 3.9: Illustration of long-term memory in cognition-inspired intelligent agents. The figure depicts…
Figure 3.10: Illustration of the memory lifecycle. The memory retention process involves three sequential…
Figure 3.11: Illustration of the contrast between a retrieval-based memory and a generative neural memory.
Figure 4.1: Humans can use their brain’s model of the world to predict the consequences of their actions. For…
Figure 4.2: A two-dimensional layout of AI world-model methods. The horizontal axis indicates…
Figure 4.3: Four paradigms of world modeling: (a) implicit, (b) explicit, (c) simulator-based, and (d)…
Figure 5.1: A taxonomy of selected research works about reward systems.
Figure 5.2: The diagram of common reward pathways in the human brain. The structures include VTA…
Figure 5.3: The interaction loop between agent and environment in a Markov Decision Process. The…
Figure 5.4: Illustration of reward paradigms in AI agents, including extrinsic, intrinsic, hybrid, and hierarchical…
Figure 6.1: Visualization and examples of major emotion theory categories. (a) Categorical Theories: Ekman’s six basic emotions [541] showing discrete emotional states. (b) Dimensional Models: Russell’s Circumplex [542] representing emotions as coordinates in continuous space. (c) Hybrid/Componential Frameworks: Plutchik’s Wheel [543] combining intensity gradients with categorical emotions. (d) Neurocognitive Per…
Figure 7.1: A taxonomy of selected research works about perception models or systems.
Figure 7.2: Comparison of common perceptual types between human beings and AI agents.
Figure 7.3: Illustration of unimodal, cross-modal, and multimodal perception. Each model begins with raw…
Figure 7.4: Representative cross-modal and multimodal applications in intelligent agents. This figure…
Figure 8.1: Illustration of several concepts related to action and action execution. In this pipeline, the cognitive…
Figure 8.2: Illustrative taxonomy of human actions, showing both mental and physical facets.
Figure 8.3: A taxonomy of selected research works about action system, including action space and learning…
Figure 8.4: A taxonomy of tool systems in AI agents, including tool category and learning paradigm.
Figure 8.5: (a) Compare the brain from “outside-in” and “inside-out”. (b) Illustration of the schematic of the corollary discharge mechanism. A motor command (efferent signal) travels from motor areas to the eye muscles, while a corollary discharge (dashed arrow) is routed to a comparator in the sensory system. The comparator uses this internal signal to modulate or subtract external (exafferent) input. Additionally…
Figure 9.1: Illustration of the multi-layered optimization framework for intelligent agents.
Figure 9.2: The prompt optimization cycle consists of three core functions: Optimize, Execute, and Evaluate.
Figure 9.3: Framework for optimizing agentic workflows composed of LLM-invoking nodes and edges. The…
Figure 9.4: Tool optimization operates through two strategies: Tool Learning (optimizing existing tools via…
Figure 10.1: Three core iterative LLM-based optimization strategies: (Left) Random Search samples and…
Figure 11.1: An illustration of self-improvement under three different utilization scenarios, including Online,…
Figure 12.1: Schematic representation of agent intelligence and knowledge discovery. The agent’s intelligence, measured by the KL divergence $D_K$ between predictions and real-world probability distributions, evolves from fluid intelligence (zero-shot predictions for new problems) to crystallized intelligence (knowledge-augmented predictions after learning) as it accumulates data in its memory $M^{\mathrm{mem}}_t$ over time $t$. Given…
Figure 12.2: Closed-loop knowledge discovery for sustainable self-evolution of an AI scientist. The agent aims to iteratively enhance its intelligence $IQ^{\mathrm{agent}}_t$ through hypothesis generation and testing, as well as through data analysis and implication derivation. When interacting with the physical world $W$, the agent generates hypotheses as an explicitly or implicitly predicted distribution ($P_\theta$) of unknown informati…
Figure 12.3: A taxonomy of research works about LLM-based Multi-Agent Systems.
Figure 13.1: An overview of three major collaboration types in LLM-based MAS:…
Figure 13.2: An LLM-based robot doctor interacts with a robot patient, illustrating the simulation of clinical…
Figure 13.3: An LLM-based robot doctor interacts with a robot patient, illustrating the simulation of clinical…
Figure 13.4: Three LLM-based agents, each assigned a distinct role (programmer, architect, and project…
Figure 14.1: Illustration of collaborative and competitive multi-agent dynamics.
Figure 14.2: Three types of multi-agent teams: Homogeneous agents share identical capabilities and collaborate through uniform behavior; Heterogeneous agents bring complementary roles, perceptions, and actions to address complex tasks; and Emergent specialization arises when identical agents evolve distinct roles through repeated interaction and adaptation.
Figure 14.3: Different types of topological structures for multi-agent collaboration.
Figure 14.4: An overview of four agent-agent collaboration types in LLM-based MAS:…
Figure 14.5: Three paradigms of human-agent collaboration. (…
Figure 14.6: Two principal paradigms of decision-making in multi-agent systems. (…
Figure 14.7: Illustration of the two foundational communication protocols in multi-agent systems. The…
Figure 15.1: Collective intelligence emerges when multiple agents coordinate, complement, and refine each…
Figure 16.1: The Brain (LLM) faces safety threats like hallucination (§…
Figure 17.1: Agent Intrinsic Safety: Threats on LLM Brain.
Figure 17.2: Illustration of White-box and Black-box Jailbreak Methods: (1) White-box: The adversary has…
Figure 17.3: Illustration of Direct and Indirect Prompt Injection Methods: (1) Direct: The adversary directly…
Figure 17.4: Illustration of Knowledge-Conflict and Context-Conflict Hallucinations: (1) Knowledge-Conflict:…
Figure 17.5: Illustration of Goal-Misguided and Capability-Misused Misalignment: (1) Goal-Misguided…
Figure 17.6: Illustration of Model Poisoning and Data Poisoning: (1) Model Poisoning: The attacker injects a…
Figure 17.7: Illustration of Membership Inference and Data Extraction Attack Methods: (1) Membership…
Figure 17.8: Illustration of System and User Prompt Stealing Methods: (1) System Prompt Stealing: The…
Figure 18.1: Agent Intrinsic Safety: Threats on LLM Non-Brains.
Figure 19.1: Agent Extrinsic Safety: Threats on agent-memory, agent-environment, and agent-agent…
Figure 20.1: Performance and safety analysis of LLMs. (a) The relationship between LLM model size and their average ASR across various attacks. The data are sourced from experimental results of a study assessing the robustness of LLMs against adversarial attacks [355]. (b) The relationship between the capability of LLMs and their average attack success rate (ASR) across various attacks. The LLM capability data are d…
Original abstract

The advent of large language models (LLMs) has catalyzed a transformative shift in artificial intelligence, paving the way for advanced intelligent agents capable of sophisticated reasoning, robust perception, and versatile action across diverse domains. As these agents increasingly drive AI research and practical applications, their design, evaluation, and continuous improvement present intricate, multifaceted challenges. This book provides a comprehensive overview, framing intelligent agents within modular, brain-inspired architectures that integrate principles from cognitive science, neuroscience, and computational research. We structure our exploration into four interconnected parts. First, we systematically investigate the modular foundation of intelligent agents, systematically mapping their cognitive, perceptual, and operational modules onto analogous human brain functionalities and elucidating core components such as memory, world modeling, reward processing, goal, and emotion. Second, we discuss self-enhancement and adaptive evolution mechanisms, exploring how agents autonomously refine their capabilities, adapt to dynamic environments, and achieve continual learning through automated optimization paradigms. Third, we examine multi-agent systems, investigating the collective intelligence emerging from agent interactions, cooperation, and societal structures. Finally, we address the critical imperative of building safe and beneficial AI systems, emphasizing intrinsic and extrinsic security threats, ethical alignment, robustness, and practical mitigation strategies necessary for trustworthy real-world deployment. By synthesizing modular AI architectures with insights from different disciplines, this survey identifies key research challenges and opportunities, encouraging innovations that harmonize technological advancement with meaningful societal benefit.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 1 minor

Summary. The paper is a survey that provides a comprehensive overview of foundation agents by framing them within modular, brain-inspired architectures integrating cognitive science, neuroscience, and computational research. It is structured into four parts: (1) modular foundations mapping agent components such as memory, world modeling, reward processing, goal, and emotion onto human brain functionalities; (2) self-enhancement, adaptive evolution, and continual learning mechanisms; (3) multi-agent systems and collective intelligence from interactions; and (4) safety, ethical alignment, robustness, and mitigation strategies. The central claim is that this synthesis identifies key research challenges and opportunities to advance agents while ensuring societal benefit.

Significance. If the synthesis is balanced and representative, the survey offers a structured interdisciplinary lens that could help organize the rapidly growing literature on agent architectures. It explicitly highlights the value of modular designs and the imperative for safe systems. The brain-inspired mapping is used as an organizational scaffold rather than a falsifiable hypothesis, so the reader's noted concern about its productivity as a framework does not constitute a load-bearing flaw for the survey's contribution. No new theorems, empirical results, or parameter-free derivations are claimed, which is appropriate for this genre but limits the strength of the significance assessment.

minor comments (1)
  1. [Abstract] The abstract states 'This book provides a comprehensive overview', yet the work is presented as an arXiv paper (arXiv:2504.01990). Clarifying the intended format (survey paper vs. book chapter) would avoid reader confusion about scope and publication venue.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive and constructive summary of our survey. We appreciate the recommendation for minor revision and the recognition that the brain-inspired modular framing serves as an organizational scaffold rather than a falsifiable hypothesis, which aligns with the survey genre. Since no specific major comments were enumerated in the report, we have no point-by-point revisions to address at this stage but remain ready to incorporate any editorial suggestions.

Circularity Check

0 steps flagged

No significant circularity; survey is organizational synthesis only

full rationale

The paper is a survey that structures existing literature around a brain-inspired modular framework without asserting new derivations, equations, quantitative predictions, or theorems. The abstract and structure describe an organizational scaffold for reviewing cognitive modules, self-enhancement, multi-agent systems, and safety, with no load-bearing steps that reduce to fitted parameters, self-definitions, or self-citation chains. No equations or falsifiable claims are present that could exhibit the required reduction to inputs by construction. The mapping to brain functionalities functions as a descriptive lens rather than a premise whose validity is internally derived or assumed without external support.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

As an abstract-only survey, the ledger records the high-level domain assumptions the framing rests upon; no new free parameters or invented entities are introduced.

axioms (1)
  • domain assumption: Principles from cognitive science, neuroscience, and computational research can be integrated into modular architectures for intelligent agents.
    Invoked when the abstract states the exploration is framed within brain-inspired architectures that integrate these fields.

pith-pipeline@v0.9.0 · 5968 in / 1143 out tokens · 21973 ms · 2026-05-22T21:37:36.193441+00:00 · methodology

Forward citations

Cited by 31 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Beyond Individual Intelligence: Surveying Collaboration, Failure Attribution, and Self-Evolution in LLM-based Multi-Agent Systems

    cs.AI 2026-05 unverdicted novelty 7.0

    A survey that unifies prior work on multi-agent LLM systems via the LIFE framework, mapping dependencies across collaboration, failure attribution, and autonomous self-evolution while identifying cross-stage challenges.

  2. Harnessing Agentic Evolution

    cs.AI 2026-05 unverdicted novelty 7.0

    AEvo introduces a meta-agent that edits the evolution procedure or agent context based on accumulated state, outperforming baselines by 26% relative improvement on agentic benchmarks and achieving SOTA on open-ended tasks.

  3. AgentForesight: Online Auditing for Early Failure Prediction in Multi-Agent Systems

    cs.CL 2026-05 unverdicted novelty 7.0

    AgentForesight trains a 7B model to perform online auditing of multi-agent LLM trajectories, detecting early decisive errors and outperforming larger models on custom and external benchmarks.

  4. AgentForesight: Online Auditing for Early Failure Prediction in Multi-Agent Systems

    cs.CL 2026-05 unverdicted novelty 7.0

    AgentForesight introduces an online auditor model that predicts decisive errors in multi-agent trajectories at the earliest step using a coarse-to-fine reinforcement learning recipe on a new curated dataset AFTraj-2K.

  5. Post Reasoning: Improving the Performance of Non-Thinking Models at No Cost

    cs.AI 2026-05 conditional novelty 7.0

    Post-Reasoning boosts LLM accuracy by reversing the usual answer-after-reasoning order, delivering mean relative gains of 17.37% across 117 model-benchmark pairs with zero extra cost.

  6. ReCast: Recasting Learning Signals for Reinforcement Learning in Generative Recommendation

    cs.LG 2026-04 unverdicted novelty 7.0

    ReCast repairs all-zero groups and uses contrastive updates on strongest positives and hardest negatives to improve RL in generative recommendation, yielding up to 36.6% better Pass@1 with only 4.1% of baseline rollou...

  7. Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning

    cs.CV 2026-01 unverdicted novelty 7.0

    VideoDR is a new benchmark for open-web video deep research that tests multimodal models on cross-frame visual anchor extraction, interactive retrieval, and multi-hop reasoning over joint video-web evidence.

  8. FOREVER: Forgetting Curve-Inspired Memory Replay for Language Model Continual Learning

    cs.LG 2026-01 conditional novelty 7.0

    FOREVER aligns replay intervals in LLM continual learning with a model-centric time based on optimizer update magnitudes and an Ebbinghaus-inspired forgetting curve to reduce catastrophic forgetting.

  9. Mem-$\pi$: Adaptive Memory through Learning When and What to Generate

    cs.CL 2026-05 unverdicted novelty 6.0

    Mem-π is a framework using a dedicated model and decision-content decoupled RL to generate context-specific guidance on demand for LLM agents, outperforming retrieval baselines by over 30% on web navigation.

  10. Rewarding Beliefs, Not Actions: Consistency-Guided Credit Assignment for Long-Horizon Agents

    cs.CL 2026-05 unverdicted novelty 6.0

    ReBel uses belief-consistency supervision and belief-aware grouping to improve credit assignment in long-horizon RL for LLM agents, achieving up to 20.4 percentage points higher success and 2.1x better sample efficien...

  11. LPDS: Evaluating LLM Robustness Through Logic-Preserving Difficulty Scaling

    cs.LG 2026-05 conditional novelty 6.0

    LPDS quantifies difficulty of logic-preserving problem variations and searches for the hardest ones, producing up to 5x larger performance drops than random sampling and better robustness gains from fine-tuning on dif...

  12. FuzzAgent: Multi-Agent System for Evolutionary Library Fuzzing

    cs.SE 2026-05 conditional novelty 6.0

    FuzzAgent deploys specialized agents that collaborate on harness generation, execution, and crash triage to evolve fuzzing campaigns, delivering 45-191% more branch coverage than four baselines on 20 C/C++ libraries a...

  13. CHAL: Council of Hierarchical Agentic Language

    cs.AI 2026-05 unverdicted novelty 6.0

    CHAL is a multi-agent dialectic system that performs structured belief optimization over defeasible domains using Bayesian-inspired graph representations and configurable meta-cognitive value system hyperparameters.

  14. MemORAI: Memory Organization and Retrieval via Adaptive Graph Intelligence for LLM Conversational Agents

    cs.CL 2026-05 unverdicted novelty 6.0

    MemORAI combines selective filtering, provenance tracking in multi-relational graphs, and dynamic weighted PageRank retrieval to achieve state-of-the-art memory retrieval and personalized responses in LLM agents on LO...

  15. Learning to Communicate: Toward End-to-End Optimization of Multi-Agent Language Systems

    cs.AI 2026-04 unverdicted novelty 6.0

    DiffMAS jointly optimizes latent communication and reasoning in multi-agent LLM systems via parameter-efficient supervised training on trajectories, yielding consistent gains over baselines on math, science, and code ...

  16. Do LLMs Need to See Everything? A Benchmark and Study of Failures in LLM-driven Smartphone Automation using Screentext vs. Screenshots

    cs.HC 2026-04 unverdicted novelty 6.0

    A new benchmark shows LLM smartphone agents achieve comparable success with screen text alone as with screenshots, but both fail often due to UI accessibility and reasoning gaps.

  17. Agent-GWO: Collaborative Agents for Dynamic Prompt Optimization in Large Language Models

    cs.NE 2026-04 unverdicted novelty 6.0

    Agent-GWO uses collaborative grey-wolf-inspired agents to jointly optimize LLM prompts and decoding settings, yielding higher accuracy and stability than prior single-agent prompt optimization methods on math and hybr...

  18. ADAM: A Systematic Data Extraction Attack on Agent Memory via Adaptive Querying

    cs.CR 2026-04 unverdicted novelty 6.0

    ADAM extracts data from LLM agent memory with up to 100% attack success rate by estimating data distribution and selecting queries via entropy guidance.

  19. Memory in the Age of AI Agents

    cs.CL 2025-12 unverdicted novelty 6.0

    The paper maps agent memory research via three forms (token-level, parametric, latent), three functions (factual, experiential, working), and dynamics of formation/evolution/retrieval, plus benchmarks and future directions.

  20. The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

    cs.AI 2025-09 accept novelty 6.0

    Survey that defines agentic RL for LLMs via POMDPs, introduces a taxonomy of planning/tool-use/memory/reasoning capabilities and domains, and compiles open environments from over 500 papers.

  21. Scalable Environments Drive Generalizable Agents

    cs.AI 2026-05 unverdicted novelty 5.0

    Generalizable agents require environment scaling via diverse executable rule-sets, distinguished from trajectory and task scaling in a new taxonomy.

  22. Beyond Individual Intelligence: Surveying Collaboration, Failure Attribution, and Self-Evolution in LLM-based Multi-Agent Systems

    cs.AI 2026-05 conditional novelty 5.0

    The survey proposes the LIFE framework to unify fragmented research on collaboration, failure attribution, and self-evolution in LLM multi-agent systems into a progression toward self-organizing intelligence.

  23. SynthAgent: Adapting Web Agents with Synthetic Supervision

    cs.LG 2025-11 unverdicted novelty 5.0

    SynthAgent uses dual refinement of synthetic tasks and trajectories to produce higher-quality training data that improves web agent adaptation to target environments.

  24. UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

    cs.AI 2025-09 conditional novelty 5.0

    UI-TARS-2 reaches 88.2 on Online-Mind2Web, 47.5 on OSWorld, 50.6 on WindowsAgentArena, and 73.3 on AndroidWorld while attaining 59.8 mean normalized score on a 15-game suite through multi-turn RL and scalable data generation.

  25. A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

    cs.AI 2025-08 unverdicted novelty 5.0

    A comprehensive review of self-evolving AI agents that improve themselves over time, organized via a framework of inputs, agent system, environment, and optimizers, with domain-specific and safety discussions.

  26. Advancing Multi-Agent RAG Systems with Minimalist Reinforcement Learning

    cs.CL 2025-05 unverdicted novelty 5.0

    Mujica-MyGo decomposes multi-turn RAG interactions via multi-agent workflows and applies minimalist policy gradient optimization to improve performance on QA benchmarks while avoiding long-context problems.

  27. Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

    cs.CL 2025-03 accept novelty 5.0

    A survey organizing techniques to achieve efficient reasoning in LLMs by shortening chain-of-thought outputs.

  28. Multi-Agent Systems: From Classical Paradigms to Large Foundation Model-Enabled Futures

    cs.AI 2026-04 unverdicted novelty 4.0

    A survey comparing classical multi-agent systems with large foundation model-enabled multi-agent systems, showing how the latter enables semantic-level collaboration and greater adaptability.

  29. Perspective on Bias in Biomedical AI: Preventing Downstream Healthcare Disparities

    cs.AI 2026-04 conditional novelty 4.0

    Omics datasets show low ancestry reporting and strong European bias, which biomedical foundation models risk perpetuating into downstream healthcare disparities unless addressed through provenance, openness, and evalu...

  30. Agentic AI Security: Threats, Defenses, Evaluation, and Open Challenges

    cs.AI 2025-10 unverdicted novelty 4.0

    A survey that taxonomizes threats to agentic AI, reviews benchmarks and evaluation methods, discusses technical and governance defenses, and identifies open challenges.

  31. Bridging Brains and Machines: A Unified Frontier in Neuroscience, Artificial Intelligence, and Neuromorphic Systems

    q-bio.NC 2025-07 unverdicted novelty 2.0

    A position and survey paper that identifies convergence between neuroscience, AGI, and neuromorphic computing and outlines four key integration challenges.

Reference graph

Works this paper leans on

299 extracted references · 299 canonical work pages · cited by 29 Pith papers · 40 internal anchors

  1. [1]

    Computing machinery and intelligence

    Alan M Turing. Computing machinery and intelligence. Springer, 2009

  2. [2]

    Language models are few-shot learners. Advances in neural information processing systems, 33:1877–1901, 2020

    Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. Advances in neural information processing systems, 33:1877–1901, 2020

  3. [3]

    Artificial Intelligence: A Modern Approach

    Stuart J. Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, Englewood Cliffs, NJ, 1 edition, 1995. ISBN 0-13-103805-2

  4. [4]

    Gps, a program that simulates human thought. Rand Corporation Santa Monica, CA, 1961

    Allen Newell and Herbert Alexander Simon. Gps, a program that simulates human thought. Rand Corporation Santa Monica, CA, 1961

  5. [5]

    A robust layered control system for a mobile robot. IEEE journal on robotics and automation, 2(1):14–23, 1986

    Rodney Brooks. A robust layered control system for a mobile robot. IEEE journal on robotics and automation, 2(1):14–23, 1986

  6. [6]

    An introduction to multiagent systems

    Michael Wooldridge. An introduction to multiagent systems. John Wiley & Sons, 2009

  7. [7]

    Introducing chatgpt. https://openai.com/blog/chatgpt/, 2022

    OpenAI. Introducing chatgpt. https://openai.com/blog/chatgpt/, 2022

  8. [8]

    DeepSeek-V3 Technical Report

    Aixin Liu, Bei Feng, Bing Xue, Bingxuan Wang, Bochao Wu, Chengda Lu, Chenggang Zhao, Chengqi Deng, Chenyu Zhang, Chong Ruan, et al. Deepseek-v3 technical report. arXiv preprint arXiv:2412.19437, 2024

  9. [9]

    Claude: The next step in helpful ai. https://www.anthropic.com, 2023

    Anthropic. Claude: The next step in helpful ai. https://www.anthropic.com, 2023. Accessed: 2024-12-01

  10. [10]

    An Yang, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chengyuan Li, Dayiheng Liu, Fei Huang, Haoran Wei, et al. Qwen2.5 technical report. arXiv preprint arXiv:2412.15115, 2024

  11. [11]

    LLaMA: Open and Efficient Foundation Language Models

    Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, et al. Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971, 2023

  12. [12]

    Training a helpful and harmless assistant with rlhf. OpenAI Technical Report, 2022

    Yuntao Bai, Saurav Kadavath, Sandipan Kundu, Amanda Askell, et al. Training a helpful and harmless assistant with rlhf. OpenAI Technical Report, 2022

  13. [13]

    Principles of neural science, 2013

    Eric R Kandel, James H Schwartz, Thomas Jessell, Steven A Siegelbaum, and AJ Hudspeth. Principles of neural science, 2013

  14. [14]

    Neurosciences

    Dale Purves, George J Augustine, David Fitzpatrick, William Hall, Anthony-Samuel LaMantia, and Leonard White. Neurosciences. De Boeck Supérieur, 2019

  15. [15]

    Phineas Gage

    Wikipedia contributors. Phineas Gage. https://en.wikipedia.org/wiki/Phineas_Gage, 2025. Accessed: June 8, 2025

  16. [16]

    Organization of reward and movement signals in the basal ganglia and cerebellum. Nature Communications, 15(1):2119, 2024

    Noga Larry, Gil Zur, and Mati Joshua. Organization of reward and movement signals in the basal ganglia and cerebellum. Nature Communications, 15(1):2119, 2024

  17. [17]

    Attention is all you need

    Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Advances in Neural Information Processing Systems (NeurIPS), volume 30, pages 5998–6008. Curran Associates, Inc., 2017

  18. [18]

    Attention mechanisms in computer vision: A survey. Computational visual media, 8(3):331–368, 2022

    Meng-Hao Guo, Tian-Xing Xu, Jiang-Jiang Liu, Zheng-Ning Liu, Peng-Tao Jiang, Tai-Jiang Mu, Song-Hai Zhang, Ralph R Martin, Ming-Ming Cheng, and Shi-Min Hu. Attention mechanisms in computer vision: A survey. Computational visual media, 8(3):331–368, 2022

  19. [19]

    Control of goal-directed and stimulus-driven attention in the brain. Nature reviews neuroscience, 3(3):201–215, 2002

    Maurizio Corbetta and Gordon L Shulman. Control of goal-directed and stimulus-driven attention in the brain. Nature reviews neuroscience, 3(3):201–215, 2002

  20. [20]

    Deep learning. Nature, 521(7553):436–444, 2015

    Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436–444, 2015

  21. [21]

    Wernicke’s area. https://en.wikipedia.org/wiki/Wernicke%27s_area,

    Wikipedia contributors. Wernicke’s area. https://en.wikipedia.org/wiki/Wernicke%27s_area,

  22. [22]

    Accessed: June 8, 2025

  23. [23]

    Cocktail party effect. https://en.wikipedia.org/wiki/Cocktail_party_effect, 2025

    Wikipedia contributors. Cocktail party effect. https://en.wikipedia.org/wiki/Cocktail_party_effect, 2025. Accessed: June 8, 2025

  24. [24]

    Chapter 5: Cerebellum

    James Knierim. Chapter 5: Cerebellum. https://nba.uth.tmc.edu/neuroscience/m/s3/chapter05.html, 2020. Last review: October 20, 2020; Accessed: June 8, 2025

  25. [25]

    Torgeir Moberget and Richard B Ivry. Cerebellar contributions to motor control and language comprehension: searching for common computational principles. Annals of the New York Academy of Sciences, 1369(1):154–171, 2016

  26. [26]

    What are the computations of the cerebellum, the basal ganglia and the cerebral cortex? Neural networks, 12(7-8):961–974, 1999

    Kenji Doya. What are the computations of the cerebellum, the basal ganglia and the cerebral cortex? Neural networks, 12(7-8):961–974, 1999

  27. [27]

    Brainstem: What It Is, Function, Anatomy & Location

    Cleveland Clinic. Brainstem: What It Is, Function, Anatomy & Location. https://my.clevelandclinic.org/health/body/21598-brainstem, 2024. Last reviewed: June 12, 2024; Accessed: June 8, 2025

  28. [28]

    Thalamus: What It Is, Function & Disorders. https://my.clevelandclinic.org/health/body/22652-thalamus, 2022

    Cleveland Clinic. Thalamus: What It Is, Function & Disorders. https://my.clevelandclinic.org/health/body/22652-thalamus, 2022. Last reviewed: March 30, 2022; Accessed: June 8, 2025

  29. [29]

    Thalamic control of sensory selection in divided attention. Nature, 526(7575):705–709, 2015

    Ralf D Wimmer, L Ian Schmitt, Thomas J Davidson, Miho Nakajima, Karl Deisseroth, and Michael M Halassa. Thalamic control of sensory selection in divided attention. Nature, 526(7575):705–709, 2015

  30. [30]

    Limbic system. https://en.wikipedia.org/wiki/Limbic_system, 2025

    Wikipedia contributors. Limbic system. https://en.wikipedia.org/wiki/Limbic_system, 2025. Accessed: June 8, 2025

  31. [31]

    Hypothalamic Dysfunction

    Jose G. Sanchez Jimenez and Orlando De Jesus. Hypothalamic Dysfunction. StatPearls [Internet]. Treasure Island (FL): StatPearls Publishing; 2025 Jan–, 2023. URL https://www.ncbi.nlm.nih.gov/books/NBK560743/. Last update: August 23, 2023; Accessed: June 8, 2025

  32. [32]

    Superintelligent agents pose catastrophic risks: Can scientist ai offer a safer path? arXiv preprint arXiv:2502.15657, 2025

    Yoshua Bengio, Michael Cohen, Damiano Fornasiere, Joumana Ghosn, Pietro Greiner, Matt MacDermott, Sören Mindermann, Adam Oberman, Jesse Richardson, Oliver Richardson, et al. Superintelligent agents pose catastrophic risks: Can scientist ai offer a safer path? arXiv preprint arXiv:2502.15657, 2025

  33. [33]

    Chatgpt (gpt-4)

    OpenAI. Chatgpt (gpt-4). URL https://chat.openai.com/, 2023. Large language model

  34. [34]

    React: Synergizing reasoning and acting in language models

    Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao. React: Synergizing reasoning and acting in language models. In International Conference on Learning Representations (ICLR), 2023

  35. [35]

    Cognitive architectures for language agents. Transactions on Machine Learning Research, 2024

    Theodore Sumers, Shunyu Yao, Karthik Narasimhan, and Thomas Griffiths. Cognitive architectures for language agents. Transactions on Machine Learning Research, 2024

  36. [36]

    Generative agents: Interactive simulacra of human behavior

    Joon Sung Park, Joseph O’Brien, Carrie Jun Cai, Meredith Ringel Morris, Percy Liang, and Michael S Bernstein. Generative agents: Interactive simulacra of human behavior. In Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, pages 1–22, 2023

  37. [37]

    Overcoming catastrophic forgetting in neural networks. Proceedings of the national academy of sciences, 114(13):3521–3526, 2017

    James Kirkpatrick, Razvan Pascanu, Neil Rabinowitz, Joel Veness, Guillaume Desjardins, Andrei A Rusu, Kieran Milan, John Quan, Tiago Ramalho, Agnieszka Grabska-Barwinska, et al. Overcoming catastrophic forgetting in neural networks. Proceedings of the national academy of sciences, 114(13):3521–3526, 2017

  38. [38]

    Reinforcement learning: An introduction

    Richard S Sutton, Andrew G Barto, et al. Reinforcement learning: An introduction, volume 1. MIT press Cambridge, 1998

  39. [39]

    World Models

    David Ha and Jürgen Schmidhuber. World models. arXiv preprint arXiv:1803.10122, 2018

  40. [40]

    Artificial life meets entertainment: lifelike autonomous agents. Communications of the ACM, 38(11):108–114, 1995

    Pattie Maes. Artificial life meets entertainment: lifelike autonomous agents. Communications of the ACM, 38(11):108–114, 1995

  41. [41]

    Is it an agent, or just a program?: A taxonomy for autonomous agents

    Stan Franklin and Art Graesser. Is it an agent, or just a program?: A taxonomy for autonomous agents. In International workshop on agent theories, architectures, and languages, pages 21–35. Springer, 1997

  42. [42]

    Concrete Problems in AI Safety

    Dario Amodei, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, and Dan Mané. Concrete problems in ai safety. arXiv preprint arXiv:1606.06565, 2016

  43. [43]

    Superintelligence: Paths, dangers, strategies, 2014

    Nick Bostrom. Superintelligence: Paths, dangers, strategies, 2014

  44. [44]

    Continual lifelong learning with neural networks: A review. Neural networks, 113:54–71, 2019

    German I Parisi, Ronald Kemker, Jose L Part, Christopher Kanan, and Stefan Wermter. Continual lifelong learning with neural networks: A review. Neural networks, 113:54–71, 2019

  45. [45]

    Society of mind

    Marvin Minsky. Society of mind. Simon and Schuster, 1988

  46. [46]

    The brain from inside out

    Gyorgy Buzsaki. The brain from inside out. Oxford University Press, USA, 2019

  47. [47]

    Action and behavior: a free-energy formulation

    Karl J Friston, Jean Daunizeau, James Kilner, and Stefan J Kiebel. Action and behavior: a free-energy formulation. Biological cybernetics, 102:227–260, 2010

  48. [48]

    Position: Foundation agents as the paradigm shift for decision making. arXiv preprint arXiv:2405.17009, 2024

    Xiaoqian Liu, Xingzhou Lou, Jianbin Jiao, and Junge Zhang. Position: Foundation agents as the paradigm shift for decision making. arXiv preprint arXiv:2405.17009, 2024

  49. [49]

    Memory and the hippocampus: a synthesis from findings with rats, monkeys, and humans

    Larry R Squire. Memory and the hippocampus: a synthesis from findings with rats, monkeys, and humans. Psychological review, 99(2):195, 1992

  50. [50]

    Neuroscience: exploring the brain, enhanced edition

    Mark Bear, Barry Connors, and Michael A Paradiso. Neuroscience: exploring the brain, enhanced edition: exploring the brain. Jones & Bartlett Learning, 2020

  51. [51]

    Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nature neuroscience, 2(1):79–87, 1999

    Rajesh PN Rao and Dana H Ballard. Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nature neuroscience, 2(1):79–87, 1999

  52. [52]

    The emotional brain: The mysterious underpinnings of emotional life

    Joseph E LeDoux. The emotional brain: The mysterious underpinnings of emotional life. Simon and Schuster, 1998

  53. [53]

    Descartes’ Error: Emotion, Reason, and the Human Brain

    Antonio R. Damasio. Descartes’ Error: Emotion, Reason, and the Human Brain. Putnam, 1994

  54. [54]

    An integrative theory of prefrontal cortex function. Annual review of neuroscience, 24(1):167–202, 2001

    Earl K Miller and Jonathan D Cohen. An integrative theory of prefrontal cortex function. Annual review of neuroscience, 24(1):167–202, 2001

  55. [55]

    Cognitive control, hierarchy, and the rostro–caudal organization of the frontal lobes

    David Badre. Cognitive control, hierarchy, and the rostro–caudal organization of the frontal lobes. Trends in cognitive sciences, 12(5):193–200, 2008

  56. [56]

    A neural substrate of prediction and reward

    Wolfram Schultz, Peter Dayan, and P Read Montague. A neural substrate of prediction and reward. Science, 275(5306):1593–1599, 1997

  57. [57]

    The Prefrontal Cortex

    Joaquin M Fuster. The Prefrontal Cortex. Academic Press, 4th edition, 2008

  58. [58]

    The organisation of mind. Oxford Psychology Series, 32, 2011

    Tim Shallice and Richard P Cooper. The organisation of mind. Oxford Psychology Series, 32, 2011

  59. [59]

    Mindstorms in natural language-based societies of mind. arXiv preprint arXiv:2305.17066, 2023

    Mingchen Zhuge, Haozhe Liu, Francesco Faccio, Dylan R Ashley, Róbert Csordás, Anand Gopalakrishnan, Abdullah Hamdi, Hasan Abed Al Kader Hammoud, Vincent Herrmann, Kazuki Irie, et al. Mindstorms in natural language-based societies of mind. arXiv preprint arXiv:2305.17066, 2023

  60. [60]

    Agent AI: Surveying the Horizons of Multimodal Interaction

    Zane Durante, Qiuyuan Huang, Naoki Wake, Ran Gong, Jae Sung Park, Bidipta Sarkar, Rohan Taori, Yusuke Noda, Demetri Terzopoulos, Yejin Choi, Katsushi Ikeuchi, Hoi Vo, Li Fei-Fei, and Jianfeng Gao. Agent AI: Surveying the horizons of multimodal interaction. arXiv preprint arXiv:2401.03568, 2024

  61. [61]

    Position Paper: Agent AI Towards a Holistic Intelligence, 2024

    Qiuyuan Huang, Naoki Wake, Bidipta Sarkar, Zane Durante, Ran Gong, Rohan Taori, Yusuke Noda, Demetri Terzopoulos, Noboru Kuno, Ade Famoti, Ashley Llorens, John Langford, Hoi Vo, Li Fei-Fei, Katsu Ikeuchi, and Jianfeng Gao. Position Paper: Agent AI Towards a Holistic Intelligence, 2024. URL http://arxiv.org/abs/2403.00833

  62. [62]

    The rise and potential of large language model based agents: A survey, 2023

    Zhiheng Xi, Wenxiang Chen, Xin Guo, Wei He, Yiwen Ding, Boyang Hong, Ming Zhang, Junzhe Wang, Senjie Jin, Enyu Zhou, et al. The rise and potential of large language model based agents: A survey, 2023

  63. [63]

    A Survey on Large Language Model based Autonomous Agents

    Lei Wang, Chen Ma, Xueyang Feng, Zeyu Zhang, Hao Yang, Jingsen Zhang, Zhiyuan Chen, Jiakai Tang, Xu Chen, Yankai Lin, Wayne Xin Zhao, Zhewei Wei, and Ji-Rong Wen. A Survey on Large Language Model based Autonomous Agents, 2023. URL http://arxiv.org/abs/2308.11432

  64. [64]

    Language agents: Foundations, prospects, and risks

    Yu Su, Diyi Yang, Shunyu Yao, and Tao Yu. Language agents: Foundations, prospects, and risks. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts, pages 17–24, Miami, Florida, USA, November 2024. Association for Computational Linguistics. URL https://aclanthology.org/2024.emnlp-tutorials.3

  65. [65]

    The Landscape of Emerging AI Agent Architectures for Reasoning, Planning, and Tool Calling: A Survey

    Tula Masterman, Sandi Besen, Mason Sawtell, and Alex Chao. The landscape of emerging ai agent architectures for reasoning, planning, and tool calling: A survey. arXiv preprint arXiv:2404.11584, 2024

  66. [66]

    Large language model based multi-agents: A survey of progress and challenges

    Taicheng Guo, Xiuying Chen, Yaqi Wang, Ruidi Chang, Shichao Pei, Nitesh V. Chawla, Olaf Wiest, and Xiangliang Zhang. Large language model based multi-agents: A survey of progress and challenges. In Kate Larson, editor, Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, IJCAI-24, pages 8048–8057. International Joint Conferences on Artificial I...

  67. [67]

    A Survey on the Memory Mechanism of Large Language Model based Agents

    Zeyu Zhang, Xiaohe Bo, Chen Ma, Rui Li, Xu Chen, Quanyu Dai, Jieming Zhu, Zhenhua Dong, and Ji-Rong Wen. A survey on the memory mechanism of large language model based agents. arXiv preprint arXiv:2404.13501, 2024

  68. [68]

    A survey on trustworthy llm agents: Threats and countermeasures. arXiv preprint arXiv:2503.09648, 2025

    Miao Yu, Fanci Meng, Xinyun Zhou, Shilong Wang, Junyuan Mao, Linsey Pang, Tianlong Chen, Kun Wang, Xinfeng Li, Yongfeng Zhang, Bo An, and Qingsong Wen. A survey on trustworthy llm agents: Threats and countermeasures. arXiv preprint arXiv:2503.09648, 2025

  69. [69]

    Finetuned Language Models Are Zero-Shot Learners

    Jason Wei, Maarten Bosma, Vincent Y Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M Dai, and Quoc V Le. Finetuned language models are zero-shot learners. arXiv preprint arXiv:2109.01652, 2021

  70. [70]

    Parameter-efficient transfer learning for nlp

    Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin De Laroussilhe, Andrea Gesmundo, Mona Attariyan, and Sylvain Gelly. Parameter-efficient transfer learning for nlp. In International conference on machine learning, pages 2790–2799. PMLR, 2019

  71. [71]

    Training language models to follow instructions with human feedback. Advances in neural information processing systems, 35:27730–27744, 2022

    Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, et al. Training language models to follow instructions with human feedback. Advances in neural information processing systems, 35:27730–27744, 2022

  72. [72]

    Reft: Representation finetuning for language models

    Trung Quoc Luong, Xinbo Zhang, Zhanming Jie, Peng Sun, Xiaoran Jin, and Hang Li. Reft: Reasoning with reinforced fine-tuning. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, 2024. URL https://arxiv.org/abs/2404.03592

  73. [73]

    R1-searcher: Incentivizing the search capability in llms via reinforcement learning,

    Huatong Song, Jinhao Jiang, Yingqian Min, Jie Chen, Zhipeng Chen, Wayne Xin Zhao, Lei Fang, and Ji-Rong Wen. R1-searcher: Incentivizing the search capability in llms via reinforcement learning,

  74. [74]

    URL https://arxiv.org/abs/2503.05592

  75. [75]

    Chain-of-thought prompting elicits reasoning in large language models

    Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed H. Chi, Quoc V. Le, and Denny Zhou. Chain-of-thought prompting elicits reasoning in large language models. In Sanmi Koyejo, S. Mohamed, A. Agarwal, Danielle Belgrave, K. Cho, and A. Oh, editors, Advances in Neural Information Processing Systems 35: Annual Conference on Neura...

  76. [76]

    Voyager: An Open-Ended Embodied Agent with Large Language Models

    Guanzhi Wang, Yuqi Xie, Yunfan Jiang, Ajay Mandlekar, Chaowei Xiao, Yuke Zhu, Linxi Fan, and Anima Anandkumar. Voyager: An open-ended embodied agent with large language models. arXiv preprint arXiv:2305.16291, 2023

  77. [77]

    Reflexion: language agents with verbal reinforcement learning

    Noah Shinn, Federico Cassano, Beck Labash, Ashwin Gopinath, Karthik Narasimhan, and Shunyu Yao. Reflexion: language agents with verbal reinforcement learning. In Neural Information Processing Systems, 2023. URL https://api.semanticscholar.org/CorpusID:258833055

  78. [79]

    Learning transferable visual models from natural language supervision

    Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. Learning transferable visual models from natural language supervision. In ICML, volume 139 of Proceedings of Machine Learning Research, pages 8748–8763. PMLR, 2021

  79. [80]

    Visual instruction tuning

    Haotian Liu, Chunyuan Li, Qingyang Wu, and Yong Jae Lee. Visual instruction tuning. In NeurIPS, 2023

  80. [81]

    Cogvlm: Visual expert for pretrained language models. Advances in Neural Information Processing Systems, 37:121475–121499, 2025

    Weihan Wang, Qingsong Lv, Wenmeng Yu, Wenyi Hong, Ji Qi, Yan Wang, Junhui Ji, Zhuoyi Yang, Lei Zhao, Song XiXuan, et al. Cogvlm: Visual expert for pretrained language models. Advances in Neural Information Processing Systems, 37:121475–121499, 2025

Showing first 80 references.