MetaAgent: Toward Self-Evolving Agent via Tool Meta-Learning

Hongjin Qian; Zheng Liu

arxiv: 2508.00271 · v2 · pith:UFM7MDIAnew · submitted 2025-08-01 · 💻 cs.AI · cs.CL· cs.IR

MetaAgent: Toward Self-Evolving Agent via Tool Meta-Learning

Hongjin Qian , Zheng Liu This is my paper

classification 💻 cs.AI cs.CLcs.IR

keywords metaagentknowledgetoolagenticcontinualdiscoveryfurtherreasoning

0 comments

read the original abstract

In this work, we propose MetaAgent, an agentic paradigm inspired by the principle of learning-by-doing, where expertise is developed through hands-on practice and continual self-improvement. MetaAgent starts with a minimal workflow, equipped only with basic reasoning and adaptive help-seeking abilities. When a knowledge gap is encountered, MetaAgent generates natural language help requests, which are routed to the most suitable external tool by a dedicated tool router. As MetaAgent solves tasks, it continually conducts self-reflection and answer verification, distilling actionable experience into concise texts that are dynamically incorporated into future task contexts. Besides, MetaAgent autonomously builds in-house tools and a persistent knowledge base by organizing its tool-use history, further enhancing its ability to retrieve and integrate relevant information We term this continual, data-driven process as \textit{meta tool learning}, through which MetaAgent incrementally refines its reasoning and tool-use strategies, without changing model parameters or requiring further post-training. Evaluated on challenging knowledge discovery benchmarks, including GAIA, WebWalkerQA, and BrowseCamp, MetaAgent consistently outperforms workflow-based baselines and matches or exceeds end-to-end trained agents, demonstrating the promise of self-evolving agentic systems for robust, general-purpose knowledge discovery. We provide our source codes in https://github.com/qhjqhj00/MetaAgent.

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Escaping the Self-Confirmation Trap: An Execute-Distill-Verify Paradigm for Agentic Experience Learning
cs.CL 2026-06 unverdicted novelty 6.0

EDV decouples execution, distillation by a third-party agent, and consensus verification to filter erroneous trajectories in LLM agent experience learning, outperforming baselines on tau2-bench, Mind2Web, and MMTB.
MetaPS: Adaptive Programmatic Strategy Selection for Market Agents
cs.AI 2026-06 unverdicted novelty 6.0

MetaPS trains models via simulation rollouts to select from programmatic strategy libraries for market agents, yielding better performance than fixed or direct LLM baselines across model sizes.