Frontiers Comput
Canonical reference. 100% of citing Pith papers cite this work as background.
citation-role summary
roles: background 16
citation-polarity summary
polarities: background 16
citing papers explorer
-
Agent-ValueBench: A Comprehensive Benchmark for Evaluating Agent Values
Agent-ValueBench is the first dedicated benchmark for agent values, showing they diverge from LLM values, form a homogeneous 'Value Tide' across models, and bend under harnesses and skill steering.
-
Demystifying and Detecting Agentic Workflow Injection Vulnerabilities in GitHub Actions
Agentic Workflow Injection is a new injection vulnerability class in LLM-augmented GitHub Actions, with two patterns (P2A and P2S) detected via the TaintAWI tool yielding 496 confirmed exploitable instances across 13,392 workflows.
-
Why Do Multi-Agent LLM Systems Fail?
The authors create the first large-scale dataset and taxonomy of failure modes in multi-agent LLM systems to explain their limited performance gains.
-
AgentRivet: an automated system for producing Rivet routines from journal publications
AgentRivet applies commercial LLMs in an autonomous workflow to extract physics details from ATLAS and CMS papers and generate Rivet routines, achieving few syntax errors but occasional physics implementation issues on two test cases.
-
PRISM: Recovering Instruction Sets from Language Model Activations
PRISM is a new activation-conditioned model that recovers full sets of simultaneous instructions from LLM hidden states via judge-guided GRPO training and outperforms prior activation-to-language methods on security-relevant tasks.
-
TianJi-Environ: An Autonomous AI Scientist for Atmospheric Environmental Research
TianJi-Environ is a WRF-Chem-based multi-agent AI framework for autonomous validation of atmospheric chemistry mechanisms through executable experiments and evidence assessment.
-
Agent Memory: Characterization and System Implications of Stateful Long-Horizon Workloads
The paper delivers the first systems characterization of agent memory, with a four-axis taxonomy, phase-aware profiler, evaluation of ten systems on two benchmarks, and ten design recommendations.
-
Push Your Agent: Measuring and Enforcing Quantitative Goal Persistence in Long-Horizon LLM Agents
Introduces QGP and PushBench to evaluate LLM agent persistence on quantitative goals, showing specialized controllers outperform baselines on verifier-checked artifact collection tasks.
-
Agentic CLEAR: Automating Multi-Level Evaluation of LLM Agents
Agentic CLEAR automates multi-level evaluation of LLM agents, generating textual insights at system, trace, and node granularity that align with human annotations and predict task success.
-
Can a Single Message Paralyze the AI Infrastructure? The Rise of AbO-DDoS Attacks through Targeted Mobius Injection
Mobius Injection exploits semantic closure in LLM agents to enable single-message AbO-DDoS attacks achieving up to 51x call amplification and 229x latency inflation.
-
Evolving-RL: End-to-End Optimization of Experience-Driven Self-Evolving Capability within Agents
Evolving-RL jointly optimizes experience extraction and utilization in LLM agents via RL with separate evaluation signals, delivering up to 98.7% relative gains on out-of-distribution tasks in ALFWorld and Mind2Web.
-
The Moltbook Files: A Harmless Slopocalypse or Humanity's Last Experiment
An AI-agent social platform generated mostly neutral content whose use in fine-tuning reduced model truthfulness comparably to human Reddit data, suggesting limited unique harm but flagging tail risks like secret leaks.
-
When Alignment Isn't Enough: Response-Path Attacks on LLM Agents
A malicious relay can strategically rewrite aligned LLM outputs in BYOK agent architectures to achieve up to 99.1% attack success on benchmarks like AgentDojo and ASB.
-
BIM Information Extraction Through LLM-based Adaptive Exploration
LLM adaptive exploration via runtime code execution outperforms static query generation for information extraction from heterogeneous BIM models on the new ifc-bench v2 benchmark.
-
Breaking MCP with Function Hijacking Attacks: Novel Threats for Function Calling and Agentic Models
A novel function hijacking attack achieves 70-100% success rates in forcing specific function calls across five LLMs on the BFCL benchmark and is robust to context semantics.
-
Memory-Augmented LLM-based Multi-Agent System for Automated Feature Generation on Tabular Data
MALMAS is a memory-augmented multi-agent LLM system that generates diverse, high-quality features for tabular data via agent decomposition, routing, and iterative memory-guided refinement.
-
ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory
ReasoningBank distills generalizable reasoning strategies from agent successes and failures to enable self-evolution, with memory-aware test-time scaling amplifying gains over raw-trajectory or success-only memory on web and software benchmarks.
-
Deceive, Detect, and Disclose: Large Language Models Play Mini-Mafia
Mini-Mafia supplies an analytical model logit(p) = v*(m-d) for mafia win probability in LLM role interactions and uses Bayesian inference to estimate per-model parameters that predict tournament results with 76.6% Brier-score improvement over random.
-
SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering
SWE-agent introduces a custom agent-computer interface that lets LM agents solve software engineering tasks, reaching 12.5% pass@1 on SWE-bench and 87.7% on HumanEvalFix, exceeding prior non-interactive approaches.
-
Socratic agents for autonomous scientific discovery in high-dimensional physical systems
AHOIS is a Socratic multi-agent AI that autonomously discovers and validates a random-interference encoding strategy for multimode fiber optics, achieving 76.97% MNIST and 83.17% Fashion-MNIST accuracy with 16x16 measurements of effective rank 56.9.
-
LLM-as-Code: Agentic Programming for Agent Harness
Proposes Agentic Programming in which programs control execution flow and LLMs act as invoked components (LLM-as-Code) only for reasoning, producing DAG-structured contexts that improve stability in long-horizon computer-use agents.
-
Beyond Alignment: Value Diversity as a Collective Property in Multicultural Agent Systems
Multicultural multi-agent LLM systems exhibit substantially lower value diversity than human societies on the World Values Survey, with diversity uncorrelated to per-agent alignment and further reduced by agent interactions.
-
TravelEval: A Comprehensive Benchmarking Framework for Evaluating LLM-Powered Travel Planning Agents
TravelEval is a new benchmark with a six-dimensional evaluation framework, realistic data sandbox, and simulation-based global assessment for LLM-powered travel planning agents.
-
From Prompt Injection to Persistent Control: Defending Agentic Harness Against Trojan Backdoors
Introduces the ClawTrojan benchmark, on which multi-step trojan attacks in agentic harnesses reach 95.5% ASR, and the DASGuard defense, which sanitizes control content from untrusted sources.
-
Auto-Dreamer: Learning Offline Memory Consolidation for Language Agents
Auto-Dreamer trains an offline memory consolidator via GRPO on agent performance to abstract cross-session patterns, outperforming baselines by 7 points on ScienceWorld with 12x smaller memory and generalizing to ALFWorld and WebArena.
-
Proof-Carrying Certificates for LLM Pipelines: A Trust-Boundary Architecture
Introduces a trust-boundary architecture in Lean 4 with three certificate families and two operators that deliver sorry-free, axiom-audited assurances for LLM pipeline components.
-
Sustaining Cooperation in Populations Guided by AI: A Folk Theorem for LLMs
A folk theorem for LLMs proves that all feasible and individually rational outcomes can be sustained as ε-equilibria in repeated games where LLMs advise client populations, despite indirect observation.
-
Learning to Communicate: Toward End-to-End Optimization of Multi-Agent Language Systems
DiffMAS jointly optimizes latent communication and reasoning in multi-agent LLM systems via parameter-efficient supervised training on trajectories, yielding consistent gains over baselines on math, science, and code benchmarks.
-
From Time Series Analysis to Question Answering: A Survey in the LLM Era
A survey proposing a taxonomy of Injective, Bridging, and Internal Alignment paradigms to evolve TSA into user-driven Time Series Question Answering with LLMs.
-
ActiveMem: Distributed Active Memory for Long-Horizon LLM Reasoning
ActiveMem proposes a heterogeneous distributed memory framework for LLM agents that separates planning from active memory management, reporting SOTA accuracy with lower overhead on BrowseComp-Plus and GAIA.
-
Optimizing the Cost-Quality Tradeoff of Agentic Theorem Provers in Lean
An agentic theorem prover in Lean uses a control plane to route actions based on cost and success estimates, achieving 28.9% lower average cost than a fixed-step baseline on a PutnamBench subset while preserving performance.
-
Exploring Cross-Scenario Generality of Agentic Memory Systems: Diagnostics and a Strong Baseline
An agentic harness letting the LLM self-manage flat text-file storage via tool calls outperforms eight prior memory systems on cross-scenario generality across QA, chat, trajectory, stress-test, and long-horizon tasks.
-
Assistance to Autonomy: A Systematic Literature Review of Agentic AI across the Software Development Life Cycle
Systematic review of agentic AI in the SDLC finds output verifiability drives industrial adoption in later phases, with Planner-Executor-Reviewer as the dominant pattern, plus a new multi-agent LLM screening pipeline for high-volume SLRs.
-
The Semantic Training Gap: Ontology-Grounded Tool Architectures for Industrial AI Agent Systems
Ontology-grounded tool architectures eliminate hallucination of domain identifiers in industrial AI agents by enforcing semantic constraints through a typed relational configuration and three-operation interface.
-
Heterogeneous Scientific Foundation Model Collaboration
Eywa enables language-based agentic AI systems to collaborate with specialized scientific foundation models for improved performance on structured data tasks.
-
JTPRO: A Joint Tool-Prompt Reflective Optimization Framework for Language Agents
JTPRO co-optimizes prompts and tool descriptions via reflection to raise overall success rate by 5-20% over baselines on multi-tool benchmarks.
-
Coding-Free and Privacy-Preserving Agentic Framework for Data-Driven Clinical Research
CARIS is a new agentic LLM framework that automates clinical research workflows from planning to reporting in a coding-free and privacy-preserving manner, achieving high completeness scores on heterogeneous datasets.
-
Agentic Federated Learning: The Future of Distributed Training Orchestration
Agentic-FL introduces language model agents for autonomous orchestration in federated learning to address client heterogeneity and dynamic conditions.
-
Explainable Iterative Data Visualisation Refinement via an LLM Agent
An LLM agent automates iterative refinement of data embedding visualizations by generating semantic evaluation reports and recommending configuration changes.
-
Addressing Moral Uncertainty using Large Language Models for Ethical Decision-Making
A reinforcement learning model is ethically fine-tuned using aggregated feedback from LLMs embodying five moral principles via Belief Jensen-Shannon Divergence and Dempster-Shafer Theory.
-
A Comprehensive Survey of Agents for Computer Use: Foundations, Challenges, and Future Directions
A survey of 87 agents for computer use and 33 datasets that introduces a three-dimensional taxonomy across domain, interaction, and agent perspectives and identifies six research gaps.
-
Automated Summarization of Software Documents: An LLM-based Multi-Agent Approach
Metagente is an LLM multi-agent system using Teacher-Student collaboration that outperforms baselines on real-world software documentation summarization for requirements analysis and technical docs.
-
LandslideAgent with Multimodal LandslideBench: A Domain-Rule-Augmented Agent for Autonomous Landslide Identification and Analysis
LandslideAgent is a rule-augmented agent built on a fine-tuned landslide VLM and a new multimodal benchmark dataset that reports accuracy gains in classification, segmentation, and description tasks.
-
SkillChain: Closing the Loop on Skill Evolution for Image-Based E-Commerce AI Assistants
SkillChain automates skill lifecycle for e-commerce image AI assistants via creator, optimizer, and refiner stages, leading to improved response quality and user engagement in production A/B tests.
-
AI as a Tool for Simulation-Based Experiments in Literary Studies
Proposes AI-driven simulations for literary-historical experiments and reports preliminary text-generation results claiming the first limited in-distribution outputs matching human novels.
-
It Takes Two: Complementary Self-Distillation for Contextual Integrity in LLMs
SELFCI uses complementary self-distillation with two reverse KL divergences to align LLMs to contextual integrity while preserving utility, outperforming RL baselines like GRPO in agentic settings.
-
A Survey of Self-Evolving Agents: What, When, How, and Where to Evolve on the Path to Artificial Super Intelligence
The paper delivers the first systematic review of self-evolving agents, structured around what components evolve, when adaptation occurs, and how it is implemented.
-
Multi-Modal Agents for Power Distribution Defect Detection: An Evaluation of Foundation Models
Evaluates multimodal foundation models as agents for power distribution defect detection across perception, reasoning, and tool usage using a custom benchmark.
-
Impact of Task Phrasing on Presumptions in Large Language Models
LLMs show susceptibility to presumptions induced by task phrasing in decision tasks like the iterated prisoner's dilemma, mitigated by neutral wording.
-
Governance by Design: Architecting Agentic AI for Organizational Learning and Scalable Autonomy
A qualitative case study of one IT services firm's agentic AI deployment identifies architectural governance mechanisms and distills seven operational lessons for balancing autonomy with accountability.