citation dossier

A survey on large language model based autonomous agents

[Wang et al · 2023 · arXiv 2308.11432

17Pith papers citing it

17reference links

cs.AItop field · 10 papers

UNVERDICTEDtop verdict bucket · 14 papers

This arXiv-backed work is queued for full Pith review when it crosses the high-inbound sweep. That review runs reader · skeptic · desk-editor · referee · rebuttal · circularity · lean confirmation · RS check · pith extraction.

read on arXiv PDF

why this work matters in Pith

Pith has found this work in 17 reviewed papers. Its strongest current cluster is cs.AI (10 papers). The largest review-status bucket among citing papers is UNVERDICTED (14 papers). For highly cited works, this page shows a dossier first and a bounded explorer second; it never tries to render every citing paper at once.

representative citing papers

Your Agent Is Mine: Measuring Malicious Intermediary Attacks on the LLM Supply Chain

cs.CR · 2026-04-09 · unverdicted · novelty 8.0

Malicious LLM API routers actively perform payload injection and secret exfiltration, with 9 of 428 tested routers showing malicious behavior and further poisoning risks from leaked credentials.

Agent-First Tool API: A Semantic Interface Paradigm for Enterprise AI Agent Systems

cs.AI · 2026-05-11 · unverdicted · novelty 7.0

The Agent-First Tool API paradigm raises AI agent task success from 64% to 88% and cuts human interventions by 72.7% through semantic phases, structured contracts, and risk governance in a production enterprise system.

OPT-BENCH: Evaluating the Iterative Self-Optimization of LLM Agents in Large-Scale Search Spaces

cs.AI · 2026-05-09 · unverdicted · novelty 6.0

OPT-BENCH and OPT-Agent evaluate LLM self-optimization in large search spaces, showing stronger models improve via feedback but stay constrained by base capacity and below human performance.

EvoMAS: Learning Execution-Time Workflows for Multi-Agent Systems

cs.AI · 2026-05-09 · unverdicted · novelty 6.0

EvoMAS trains a workflow adapter with policy gradients to dynamically instantiate stage-specific multi-agent workflows from a fixed agent pool, using explicit task-state construction and terminal success signals, and outperforms static baselines on GAIA, HLE, and DeepResearcher.

Self-Adaptive Multi-Agent LLM-Based Security Pattern Selection for IoT Systems

cs.CR · 2026-05-01 · unverdicted · novelty 6.0

ASPO combines multi-agent LLM proposals with deterministic enforcement in a MAPE-K loop to select conflict-free, resource-feasible security patterns for IoT, delivering 100% safety invariants and 21-23% tail latency/energy reductions on testbed workloads.

An AI Agent Execution Environment to Safeguard User Data

cs.CR · 2026-04-21 · unverdicted · novelty 6.0

GAAP guarantees confidentiality of private user data for AI agents by enforcing user-specified permissions deterministically through persistent information flow tracking, without trusting the agent or requiring attack-free models.

Visual Inception: Compromising Long-term Planning in Agentic Recommenders via Multimodal Memory Poisoning

cs.CR · 2026-04-18 · unverdicted · novelty 6.0

Visual Inception poisons images to hijack long-term memory in agentic recommenders and steer planning, while CognitiveGuard reduces success to about 10% via perceptual sanitization and reasoning verification.

In-situ process monitoring for defect detection in wire-arc additive manufacturing: an agentic AI approach

cs.AI · 2026-04-10 · unverdicted · novelty 6.0

A multi-agent AI framework using processing and acoustic agents achieves 91.6% accuracy and 0.821 F1 score for in-situ porosity defect detection in wire-arc additive manufacturing.

OS-ATLAS: A Foundation Action Model for Generalist GUI Agents

cs.CL · 2024-10-30 · unverdicted · novelty 6.0

OS-Atlas, trained on the largest open-source cross-platform GUI grounding corpus of 13 million elements, outperforms prior open-source models on six benchmarks across mobile, desktop, and web platforms.

Agent Mentor: Framing Agent Knowledge through Semantic Trajectory Analysis

cs.AI · 2026-04-12 · unverdicted · novelty 5.0

Agent Mentor analyzes semantic trajectories in agent logs to identify undesired behaviors and derives corrective prompt instructions, yielding measurable accuracy gains on benchmark tasks across three agent setups.

Toward Explanatory Equilibrium: Verifiable Reasoning as a Coordination Mechanism under Asymmetric Information

cs.MA · 2026-04-10 · unverdicted · novelty 5.0

Structured reasoning artifacts enable coordination in LLM multi-agent systems by preventing approval and welfare collapse under asymmetric information while keeping bad-approval rates low across audit regimes.

SciFi: A Safe, Lightweight, User-Friendly, and Fully Autonomous Agentic AI Workflow for Scientific Applications

cs.AI · 2026-04-14 · unverdicted · novelty 4.0

SciFi is a safe, lightweight agentic AI framework that automates structured scientific tasks with minimal human intervention via isolated environments and layered self-assessing agents.

Aethon: A Reference-Based Replication Primitive for Constant-Time Instantiation of Stateful AI Agents

cs.AI · 2026-04-13 · unverdicted · novelty 4.0

Aethon enables near-constant-time instantiation of stateful AI agents via reference-based replication over compositional views, layered memory, and copy-on-write semantics.

OpenKedge: Governing Agentic Mutation with Execution-Bound Safety and Evidence Chains

cs.AI · 2026-04-07 · unverdicted · novelty 4.0

OpenKedge redefines AI agent state mutations as a governed process using intent proposals, policy-evaluated execution contracts, and cryptographic evidence chains to enable safe, auditable agentic behavior.

Understanding the planning of LLM agents: A survey

cs.AI · 2024-02-05 · accept · novelty 4.0

A survey that provides a taxonomy of methods for improving planning in LLM-based agents across task decomposition, plan selection, external modules, reflection, and memory.

The Rise and Potential of Large Language Model Based Agents: A Survey

cs.AI · 2023-09-14 · accept · novelty 4.0

The paper surveys the origins, frameworks, applications, and open challenges of AI agents built on large language models.

Large Language Models: A Survey

cs.CL · 2024-02-09 · accept · novelty 3.0

The paper surveys key large language models, their training methods, datasets, evaluation benchmarks, and future research directions in the field.

citing papers explorer

Showing 17 of 17 citing papers.

Your Agent Is Mine: Measuring Malicious Intermediary Attacks on the LLM Supply Chain cs.CR · 2026-04-09 · unverdicted · none · ref 52
Malicious LLM API routers actively perform payload injection and secret exfiltration, with 9 of 428 tested routers showing malicious behavior and further poisoning risks from leaked credentials.
Agent-First Tool API: A Semantic Interface Paradigm for Enterprise AI Agent Systems cs.AI · 2026-05-11 · unverdicted · none · ref 23
The Agent-First Tool API paradigm raises AI agent task success from 64% to 88% and cuts human interventions by 72.7% through semantic phases, structured contracts, and risk governance in a production enterprise system.
OPT-BENCH: Evaluating the Iterative Self-Optimization of LLM Agents in Large-Scale Search Spaces cs.AI · 2026-05-09 · unverdicted · none · ref 42
OPT-BENCH and OPT-Agent evaluate LLM self-optimization in large search spaces, showing stronger models improve via feedback but stay constrained by base capacity and below human performance.
EvoMAS: Learning Execution-Time Workflows for Multi-Agent Systems cs.AI · 2026-05-09 · unverdicted · none · ref 30
EvoMAS trains a workflow adapter with policy gradients to dynamically instantiate stage-specific multi-agent workflows from a fixed agent pool, using explicit task-state construction and terminal success signals, and outperforms static baselines on GAIA, HLE, and DeepResearcher.
Self-Adaptive Multi-Agent LLM-Based Security Pattern Selection for IoT Systems cs.CR · 2026-05-01 · unverdicted · none · ref 19
ASPO combines multi-agent LLM proposals with deterministic enforcement in a MAPE-K loop to select conflict-free, resource-feasible security patterns for IoT, delivering 100% safety invariants and 21-23% tail latency/energy reductions on testbed workloads.
An AI Agent Execution Environment to Safeguard User Data cs.CR · 2026-04-21 · unverdicted · none · ref 35
GAAP guarantees confidentiality of private user data for AI agents by enforcing user-specified permissions deterministically through persistent information flow tracking, without trusting the agent or requiring attack-free models.
Visual Inception: Compromising Long-term Planning in Agentic Recommenders via Multimodal Memory Poisoning cs.CR · 2026-04-18 · unverdicted · none · ref 8
Visual Inception poisons images to hijack long-term memory in agentic recommenders and steer planning, while CognitiveGuard reduces success to about 10% via perceptual sanitization and reasoning verification.
In-situ process monitoring for defect detection in wire-arc additive manufacturing: an agentic AI approach cs.AI · 2026-04-10 · unverdicted · none · ref 83
A multi-agent AI framework using processing and acoustic agents achieves 91.6% accuracy and 0.821 F1 score for in-situ porosity defect detection in wire-arc additive manufacturing.
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents cs.CL · 2024-10-30 · unverdicted · none · ref 29
OS-Atlas, trained on the largest open-source cross-platform GUI grounding corpus of 13 million elements, outperforms prior open-source models on six benchmarks across mobile, desktop, and web platforms.
Agent Mentor: Framing Agent Knowledge through Semantic Trajectory Analysis cs.AI · 2026-04-12 · unverdicted · none · ref 27
Agent Mentor analyzes semantic trajectories in agent logs to identify undesired behaviors and derives corrective prompt instructions, yielding measurable accuracy gains on benchmark tasks across three agent setups.
Toward Explanatory Equilibrium: Verifiable Reasoning as a Coordination Mechanism under Asymmetric Information cs.MA · 2026-04-10 · unverdicted · none · ref 34
Structured reasoning artifacts enable coordination in LLM multi-agent systems by preventing approval and welfare collapse under asymmetric information while keeping bad-approval rates low across audit regimes.
SciFi: A Safe, Lightweight, User-Friendly, and Fully Autonomous Agentic AI Workflow for Scientific Applications cs.AI · 2026-04-14 · unverdicted · none · ref 1
SciFi is a safe, lightweight agentic AI framework that automates structured scientific tasks with minimal human intervention via isolated environments and layered self-assessing agents.
Aethon: A Reference-Based Replication Primitive for Constant-Time Instantiation of Stateful AI Agents cs.AI · 2026-04-13 · unverdicted · none · ref 36
Aethon enables near-constant-time instantiation of stateful AI agents via reference-based replication over compositional views, layered memory, and copy-on-write semantics.
OpenKedge: Governing Agentic Mutation with Execution-Bound Safety and Evidence Chains cs.AI · 2026-04-07 · unverdicted · none · ref 1
OpenKedge redefines AI agent state mutations as a governed process using intent proposals, policy-evaluated execution contracts, and cryptographic evidence chains to enable safe, auditable agentic behavior.
Understanding the planning of LLM agents: A survey cs.AI · 2024-02-05 · accept · none · ref 44
A survey that provides a taxonomy of methods for improving planning in LLM-based agents across task decomposition, plan selection, external modules, reflection, and memory.
The Rise and Potential of Large Language Model Based Agents: A Survey cs.AI · 2023-09-14 · accept · none · ref 90
The paper surveys the origins, frameworks, applications, and open challenges of AI agents built on large language models.
Large Language Models: A Survey cs.CL · 2024-02-09 · accept · none · ref 175
The paper surveys key large language models, their training methods, datasets, evaluation benchmarks, and future research directions in the field.

A survey on large language model based autonomous agents

why this work matters in Pith

fields

years

verdicts

representative citing papers

citing papers explorer