MemEvolve jointly evolves agent experiential knowledge and memory architectures via a modular codebase, delivering up to 17% gains on agent benchmarks with cross-task and cross-model generalization.
hub
Alita: Generalist agent enabling scalable agentic reasoning with minimal predefinition and maximal self-evolution
11 Pith papers cite this work. Polarity classification is still indexing.
hub tools
citation-role summary
citation-polarity summary
roles
background 3polarities
background 3representative citing papers
FluxMem evolves memory as a heterogeneous graph via three refinement stages and reports consistent state-of-the-art results on LoCoMo, Mind2Web, and GAIA benchmarks.
BoundaryRouter routes queries to LLM or agent using early experience memory from a seed set, cutting inference time 60.6% versus always using agents and raising performance 28.6% versus always using direct LLM inference.
MGA is a memory-driven GUI agent that uses an observer for bias-free screen reading and structured memory for compact state transitions to enable efficient long-horizon automation.
SimpleMem proposes semantic structured compression, online synthesis, and intent-aware retrieval to create efficient lifelong memory for LLM agents, reporting 26.4% F1 gains and up to 30x lower token use on LoCoMo benchmarks.
UI-TARS-2 reaches 88.2 on Online-Mind2Web, 47.5 on OSWorld, 50.6 on WindowsAgentArena, and 73.3 on AndroidWorld while attaining 59.8 mean normalized score on a 15-game suite through multi-turn RL and scalable data generation.
A comprehensive review of self-evolving AI agents that improve themselves over time, organized via a framework of inputs, agent system, environment, and optimizers, with domain-specific and safety discussions.
Selecting preference pairs whose DPO implicit reward gap is small yields better LLM alignment than random or baseline selection while using only 10% of the data.
This survey frames foundation agents using brain-inspired modular architectures and reviews challenges in evolution, collaboration, and safety.
citing papers explorer
-
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning
UI-TARS-2 reaches 88.2 on Online-Mind2Web, 47.5 on OSWorld, 50.6 on WindowsAgentArena, and 73.3 on AndroidWorld while attaining 59.8 mean normalized score on a 15-game suite through multi-turn RL and scalable data generation.
-
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems
This survey frames foundation agents using brain-inspired modular architectures and reviews challenges in evolution, collaboration, and safety.
- A Comprehensive Survey on Agent Skills: Taxonomy, Techniques, and Applications