{"total":13,"items":[{"citing_arxiv_id":"2605.25746","ref_index":6,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Multi-Agent Coordination Adaptation via Structure-Guided Orchestration","primary_cat":"cs.MA","submitted_at":"2026-05-25T11:59:58+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"MACA frames multi-agent coordination as posterior inference, learns a structural prior to guide orchestration, and reports 8.42% higher performance with 43.19% fewer tokens than adaptive baselines on benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.12655","ref_index":48,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Robust Instruction Compliance in Cooperative Multi-Agent Reinforcement Learning","primary_cat":"cs.AI","submitted_at":"2026-05-12T19:01:16+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"MAVIC corrects Bellman backups at instruction boundaries by adjusting the incoming objective and restoring continuation value, enabling consistent estimation under stochastic instruction switching in cooperative MARL.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.18799","ref_index":33,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"ReCrit: Transition-Aware Reinforcement Learning for Scientific Critic Reasoning","primary_cat":"cs.LG","submitted_at":"2026-05-11T09:22:39+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"ReCrit frames critic interaction as a correctness-transition problem and uses quadrant-based RL rewards to improve LLM performance on scientific reasoning benchmarks by rewarding corrections and robustness while 
penalizing sycophancy.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.18133","ref_index":131,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Multi-Agent Systems: From Classical Paradigms to Large Foundation Model-Enabled Futures","primary_cat":"cs.AI","submitted_at":"2026-04-20T12:00:31+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"A survey comparing classical multi-agent systems with large foundation model-enabled multi-agent systems, showing how the latter enables semantic-level collaboration and greater adaptability.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"fest in their generalization capabilities at the task and scenario levels. In model-based CMASs, changes in system dynamics or task objectives typically require redesign and new analytical procedures. Learning-based methods in CMASs often rely on task-specific training distributions. They tend to overfit environmental details and show degraded performance when faced with scenario or distributional shifts [131], [15]. Large-scale pretraining injects broad world knowledge into the model. This enables rapid zero-shot or few-shot adaptation via in-context learning [54]. Additionally, LMASs translate task requirements into collaborative processes by role allocation and communication. 
They also use memory mechanisms to accumulate experience over time, enhancing understanding"},{"citing_arxiv_id":"2604.17191","ref_index":34,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Do LLM-derived graph priors improve multi-agent coordination?","primary_cat":"cs.LG","submitted_at":"2026-04-19T01:40:39+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"LLM-generated coordination graph priors improve multi-agent reinforcement learning performance on MPE benchmarks, with models as small as 1.5B parameters proving effective.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.15840","ref_index":2,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"CoEvolve: Training LLM Agents via Agent-Data Mutual Evolution","primary_cat":"cs.CL","submitted_at":"2026-04-17T08:41:26+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"CoEvolve improves LLM agent performance by 15-19% on AppWorld and BFCL benchmarks through mutual evolution of the agent and data distribution using feedback-driven task synthesis.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2603.12631","ref_index":26,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Joint Optimization of Multi-agent Memory System","primary_cat":"cs.MA","submitted_at":"2026-03-13T04:04:17+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"CoMAM jointly optimizes agents in multi-agent LLM memory systems via end-to-end RL and adaptive credit assignment to improve collaboration and 
performance.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2601.01885","ref_index":2,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Agentic Memory: Learning Unified Long-Term and Short-Term Memory Management for Large Language Model Agents","primary_cat":"cs.CL","submitted_at":"2026-01-05T08:24:16+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"AgeMem unifies long-term and short-term memory management in LLM agents by exposing memory operations as learnable tool actions trained via three-stage progressive reinforcement learning, outperforming baselines on long-horizon tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2510.14063","ref_index":27,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Adaptive Obstacle-Aware Task Assignment and Planning for Heterogeneous Robot Teaming","primary_cat":"cs.RO","submitted_at":"2025-10-15T20:04:40+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"OATH combines adaptive Halton sampling, obstacle-aware clustering with auctions, and LLM-based instruction interpretation to improve task assignment and planning for heterogeneous robot teams in obstacle-rich environments.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2507.21046","ref_index":125,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"A Survey of Self-Evolving Agents: What, When, How, and Where to Evolve on the Path to Artificial Super Intelligence","primary_cat":"cs.AI","submitted_at":"2025-07-28T17:59:05+00:00","verdict":"ACCEPT","verdict_confidence":"MODERATE","novelty_score":4.0,"formal_verification":"none","one_line_summary":"The paper delivers the 
first systematic review of self-evolving agents, structured around what components evolve, when adaptation occurs, and how it is implemented.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2507.02592","ref_index":18,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"WebSailor: Navigating Super-human Reasoning for Web Agent","primary_cat":"cs.CL","submitted_at":"2025-07-03T12:59:07+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"WebSailor trains open-source web agents to match proprietary performance on complex information-seeking tasks by generating high-uncertainty scenarios and using a new RL method called DUPO.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2501.06322","ref_index":118,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Multi-Agent Collaboration Mechanisms: A Survey of LLMs","primary_cat":"cs.AI","submitted_at":"2025-01-10T19:56:50+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"The survey organizes LLM-based multi-agent collaboration mechanisms into a framework with dimensions of actors, types, structures, strategies, and coordination protocols, reviews applications across domains, and identifies challenges for future research.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"This approach aligns with ongoing research in Multi-Agent Systems (MASs) and collaborative AI, which focus on enabling groups of intelligent agents to coordinate, share knowledge, and solve problems collectively. The convergence of these fields has given rise to LLM-based MASs, which harness the collective intelligence of multiple LLMs to tackle complex, multi-step challenges [118]. 
Inspiration for MASs extends beyond technological advancements and finds roots in human collective intelligence (e.g., society of mind [87], theory of mind [45]). Human societies excel in leveraging teamwork and specialization to achieve shared goals, from everyday tasks to scientific discoveries. Similarly, MASs are designed to emulate these principles, enabling AI agents to collaborate effectively by combining their individual strengths"},{"citing_arxiv_id":"2411.18279","ref_index":51,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Large Language Model-Brained GUI Agents: A Survey","primary_cat":"cs.AI","submitted_at":"2024-11-27T12:13:39+00:00","verdict":"UNVERDICTED","verdict_confidence":"MODERATE","novelty_score":4.0,"formal_verification":"none","one_line_summary":"A survey consolidating frameworks, data practices, large action models, benchmarks, applications, and research gaps in LLM-brained GUI agents.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"[46] A comprehensive survey of LLM-based agents. ✓ Wanget al.,. [47] A survey on LLM-based autonomous agents. ✓ Guoet al., [48] A survey of mult-agent LLM frameworks. ✓ Hanet al., [49] A survey on LLM multi-agent systems, with their challenges and open problems. ✓ Sunet al., [50] A survey on LLM-based multi-agent reinforcement learning. ✓ Huanget al., [51] A survey on planning in LLM agents. ✓ Aghzalet al., [52] A survey on automated planning in LLMs. ✓ Zhenget al., [53] Discuss the roadmap of lifelong learning in LLM agents. ✓ Zhanget al., [54] A survey on the memory of LLM-based agents. ✓ Shen [13] A survey of the tool usage in LLM agents. ✓ Changet al., [55] A survey on evaluation of LLMs. ✓ Liet al."}],"limit":50,"offset":0}