{"total":12,"items":[{"citing_arxiv_id":"2605.28774","ref_index":10,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Agent Explorative Policy Optimization for Multimodal Agentic Reasoning","primary_cat":"cs.CL","submitted_at":"2026-05-27T17:36:39+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"AXPO addresses the Thinking-Acting Gap in agentic RL training by targeted resampling of tool calls in all-wrong subgroups, delivering +1.8pp gains over GRPO on nine multimodal benchmarks with an 8B model beating a 32B baseline on Pass@4.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.17561","ref_index":57,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Automated Root-Cause Subclassification and No-Code Fix Generation for Invalid Bug Reports","primary_cat":"cs.SE","submitted_at":"2026-05-17T17:45:13+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.14930","ref_index":11,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"IE as Cache: Information Extraction Enhanced Agentic Reasoning","primary_cat":"cs.CL","submitted_at":"2026-04-16T12:18:27+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"IE-as-Cache framework repurposes information extraction as a dynamic cognitive cache to improve agentic reasoning accuracy in LLMs on challenging benchmarks.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"that can be continuously consumed and updated by downstream cognitive processes [10]. 
In this work, we posit that strategically extracted and maintained information can directly scaffold large language model (LLM) reasoning, especially as LLMs increasingly operate in an agentic manner where they must iteratively read, decide, and act over complex inputs [11]. This perspective is crucial for processing noise-rich, long-form content [12], where LLMs struggle with irrelevant distractors [13] and information decay in middle contexts [14]. In such settings, simply providing raw context is often insufficient; instead, models benefit from a compact intermediate representation that preserves salient evidence while suppressing"},{"citing_arxiv_id":"2604.07927","ref_index":28,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"EigentSearch-Q+: Enhancing Deep Research Agents with Structured Reasoning Tools","primary_cat":"cs.AI","submitted_at":"2026-04-09T07:47:31+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"Structured query and evidence tools added to an AI research agent improve benchmark accuracy by 0.6 to 3.8 percentage points.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.07720","ref_index":38,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Towards Knowledgeable Deep Research: Framework and Benchmark","primary_cat":"cs.AI","submitted_at":"2026-04-09T02:06:27+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"The paper introduces the KDR task, HKA multi-agent framework, and KDR-Bench to enable LLM agents to integrate structured knowledge into deep research reports, with experiments showing outperformance over prior agents.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"[36] Junde Wu, Jiayuan Zhu, Yuyuan Liu, Min Xu, and 
Yueming Jin. 2025. Agentic Reasoning: A Streamlined Framework for Enhancing LLM Reasoning with Agentic Tools. arXiv:2502.04644 [cs.AI] https://arxiv.org/abs/2502.04644 [37] xAI. 2025. Grok 4 Model Card. https://data.x.ai/2025-08-20-grok-4-model-card.pdf. Official Grok 4 model card, August 2025. [38] Renjun Xu and Jingwen Peng. 2025. A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications. arXiv:2506.12594 [cs.AI] https://arxiv.org/abs/2506.12594 [39] An Yang, Anfeng Li, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Gao, Chengen Huang, Chenxu Lv, Chujie Zheng, Dayiheng Liu, Fan Zhou, Fei Huang, Feng Hu, Hao Ge, Haoran Wei, Huan Lin, Jialong"},{"citing_arxiv_id":"2603.06194","ref_index":43,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"MICA: Multi-granularity Intertemporal Credit Assignment for Long-Horizon Emotional Support Dialogue","primary_cat":"cs.CL","submitted_at":"2026-03-06T12:06:57+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"MICA combines incremental per-turn distance rewards and Monte Carlo returns from a shared potential function over user support states to create a mixed advantage signal that enables stable multi-turn RL optimization for emotional support dialogues.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2602.09725","ref_index":67,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Efficient Remote KV Cache Reuse with GPU-native Video Codec","primary_cat":"cs.DC","submitted_at":"2026-02-10T12:29:02+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"KVCodec uses GPU-native video codecs and pipelined fetching to compress and transmit KV caches, delivering up to 3.51x faster TTFT than prior methods while preserving 
accuracy.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2601.19827","ref_index":29,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"When Iterative RAG Beats Ideal Evidence: A Diagnostic Study in Scientific Multi-hop Question Answering","primary_cat":"cs.CL","submitted_at":"2026-01-27T17:35:05+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2601.12538","ref_index":224,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Agentic Reasoning for Large Language Models","primary_cat":"cs.AI","submitted_at":"2026-01-18T18:58:23+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"The survey structures agentic reasoning for LLMs into foundational, self-evolving, and collective multi-agent layers while distinguishing in-context orchestration from post-training optimization and reviewing applications across domains.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"access up-to-date information, execute precise symbolic or numerical computations, and decompose complex tasks into grounded, tool-assisted reasoning steps [218, 219, 9, 220, 221]. With tools as intermediaries, models are enriched and augmented by external capabilities, enabling the generation of more accurate and generalizable agentic reasoning trajectories [222, 215, 223]. Bootstrapping of Tool Use via SFT. Early works on tool-integration [5, 6, 203, 204, 224, 225, 226, 227, 228] primarily apply supervised fine-tuning (SFT) over curated tool-use reasoning steps, where models were trained to imitate demonstrations of search queries, code executions, or API calls. 
The SFT stage provided an initial competency in invoking tools, interpreting tool outputs, and integrating the results into coherent reasoning chains [225, 14]. For example, Toolformer [6] introduces a self-supervised framework in which"},{"citing_arxiv_id":"2506.00886","ref_index":6,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Position: Agent Should Invoke External Tools ONLY When Epistemically Necessary","primary_cat":"cs.AI","submitted_at":"2025-06-01T07:52:16+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"Agents should invoke external tools only when epistemically necessary, per the introduced Theory of Agent framework that frames tool use as a decision under uncertainty.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2506.11060","ref_index":37,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Code Researcher: Deep Research Agent for Large Systems Code and Commit History","primary_cat":"cs.SE","submitted_at":"2025-05-27T04:57:00+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Code Researcher retrieves global context via multi-step reasoning on code semantics, patterns, and commit history to fix Linux kernel crashes, reaching 48% crash-resolution rate versus 31% for baselines.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2504.19678","ref_index":137,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"From LLM Reasoning to Autonomous AI Agents: A Comprehensive Review","primary_cat":"cs.AI","submitted_at":"2025-04-28T11:08:22+00:00","verdict":"ACCEPT","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"A survey consolidating benchmarks, agent frameworks, real-world 
applications, and protocols for LLM-based autonomous agents into a proposed taxonomy with recommendations for future research.","context_count":1,"top_context_role":"method","top_context_polarity":"background","context_text":"tonomous decision-making and dynamic multi-step reasoning. The frameworks discussed include LangChain [131], LlamaIndex [132], CrewAI [133], and Swarm [134], which abstract complex functionalities into reusable components that enable context management, tool integration, and iterative refinement of outputs. Additionally, pioneering efforts in GUI control [135] and agentic reasoning [136], [137] demonstrate the increasing capabilities of these systems to interact with external environments and tools in real-time. In parallel, this section presents a diverse range of AI agent applications that span materials science, biomedical research, academic ideation, software engineering, synthetic data generation, and chemical reasoning. Systems such as"}],"limit":50,"offset":0}