{"total":10,"items":[{"citing_arxiv_id":"2605.21622","ref_index":36,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"TO-Agents: A Multi-Agent AI Pipeline for Preference-Guided Topology Optimization","primary_cat":"cs.AI","submitted_at":"2026-05-20T18:32:56+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"A multi-agent pipeline iteratively refines topology optimization outputs to match natural language preferences for branched structures, achieving 60% success rate across replicates in cantilever and phone-stand tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.13172","ref_index":30,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"When Does Hierarchy Help? Benchmarking Agent Coordination in Event-Driven Industrial Scheduling","primary_cat":"cs.MA","submitted_at":"2026-05-13T08:33:28+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"DESBench reveals structural trade-offs among centralized, hierarchical, heterarchical, and holonic coordination in dynamic industrial scheduling that outcome metrics alone miss.","context_count":1,"top_context_role":"method","top_context_polarity":"use_method","context_text":"no-progress stopping. The remaining metrics quantify unresolved workload through unfinished jobs (Gdj), overdue debt from unfinished jobs (Gut), and remaining operations (Gdo). 4 Experimental Result and Discussion Experimental Setting. We evaluate four coordination paradigms, each implemented in two orchestration frameworks, LangGraph [29] and AgentScope [30], and paired with three representative LLMs: GPT-5.4, Gemini-3-Flash, and Qwen-3.5. The experiments adopt A5C12 topology, consisting of five Areas and twelve Cells, partitioned as 4-3-2-2-1. 
Three scenario profiles are tested, each designed to emphasize a different coordination difficulty: branch pressure (high task branching and routing decisions), strong cluster pull (tight coupling of resource usage across areas), and late"},{"citing_arxiv_id":"2605.02669","ref_index":20,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"An explainable hypothesis-driven approach to Drug-Induced Liver Injury with HADES","primary_cat":"cs.AI","submitted_at":"2026-05-04T14:50:28+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"HADES is an agentic AI system that generates mechanistic hypotheses for drug-induced liver injury using molecular, metabolite, and pathway evidence, outperforming prior binary classifiers on the new DILER benchmark while establishing a baseline for hypothesis alignment.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.16646","ref_index":27,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Agentic Frameworks for Reasoning Tasks: An Empirical Study","primary_cat":"cs.AI","submitted_at":"2026-04-17T19:02:54+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"An empirical evaluation of 22 agentic frameworks on BBH, GSM8K, and ARC benchmarks shows stable performance in 12 frameworks but highlights orchestration failures and weaker mathematical reasoning.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"integration and persistent memory, thereby constraining the development of scalable and context-aware intelligent agents [25, 26]. 
Agentic frameworks were introduced to address these limitations by providing reusable abstractions and built-in support for agent orchestration, memory management, tool integration, and multi-step execution, thereby simplifying the development and deployment of intelligent agent systems [27]. Since 2023, the development of agentic frameworks such as AutoGen [28], Camel [29], CrewAI [30], SuperAGI [31], TaskWeaver [32], MetaGPT [33], and ChatDev [34] has simplified the development of intelligent agent systems. These frameworks support complex functionalities, including state management, tool integration, and inter-agent communication [3]."},{"citing_arxiv_id":"2604.15657","ref_index":22,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Understanding Inference-Time Token Allocation and Coverage Limits in Agentic Hardware Verification","primary_cat":"cs.AR","submitted_at":"2026-04-17T03:15:09+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Domain-specialized LLM agents for hardware verification close 95-99% coverage using 4-13x fewer tokens and 2-4x faster convergence than general-purpose agents by reallocating tokens toward coverage-directed reasoning.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.05211","ref_index":56,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"A Review of Large Language Models for Stock Price Forecasting from a Hedge-Fund Perspective","primary_cat":"q-fin.PR","submitted_at":"2026-04-10T17:36:04+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":3.0,"formal_verification":"none","one_line_summary":"This review synthesizes LLM uses in stock forecasting and catalogs key practical pitfalls from a hedge-fund viewpoint.","context_count":1,"top_context_role":"method","top_context_polarity":"use_method","context_text":"Popular paradigms 
that can be plugged into agentic trading loops include ReACT for interleaving reasoning with tool use [52], Chain-of-Thought (CoT) for stepwise deduction [53], Tree-of-Thought (ToT) for exploring alternate reasoning [54], and Reflexion for self-critique and iterative improvement [55]. Finally, engineering frameworks such as LangChain or LangGraph [56], AutoGen [57], and CrewAI [58] are useful for building agentic trading systems. They provide tools and libraries for developers and hedge funds to deploy agentic AI. III. ISSUES AND CHALLENGES FROM A HEDGE-FUND PERSPECTIVE The following section examines the issues and challenges of applying LLMs to stock price forecasting under real-world trading constraints."},{"citing_arxiv_id":"2604.06296","ref_index":24,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"AgentOpt v0.1 Technical Report: Client-Side Optimization for LLM-Based Agent","primary_cat":"cs.LG","submitted_at":"2026-04-07T17:13:47+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"AgentOpt introduces a framework-agnostic package that uses algorithms like UCB-E to find cost-effective model assignments in multi-step LLM agent pipelines, cutting evaluation budgets by 62-76% while maintaining near-optimal accuracy on benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.13848","ref_index":12,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"GraphBit: A Graph-based Agentic Framework for Non-Linear Agent Orchestration","primary_cat":"cs.AI","submitted_at":"2026-03-08T18:32:28+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"GraphBit is a DAG-based engine-orchestrated framework for agentic LLMs that achieves 67.6% accuracy with zero hallucinations on GAIA 
benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2603.04474","ref_index":45,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"From Spark to Fire: Modeling and Mitigating Error Cascades in LLM-Based Multi-Agent Collaboration","primary_cat":"cs.MA","submitted_at":"2026-03-04T11:45:27+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"A graph-based propagation model for error cascades in LLM multi-agent systems plus a genealogy-graph governance plugin that prevents final infection in at least 89% of runs across tested frameworks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2602.15219","ref_index":18,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Multi-Agent Home Energy Management Assistant","primary_cat":"cs.HC","submitted_at":"2026-02-16T21:55:42+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"HEMA is a multi-agent LLM system with analysis, knowledge, and control agents plus a self-consistency router that enables conversational home energy tasks, evaluated via LLM-simulated users on 23 metrics.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}