APEX-Searcher: Refining Credit Assignment with Subgoaling for Agentic Retrieval-Augmented Generation

2026 · cs.CL · arXiv 2603.13853

1 Pith paper cites this work.



abstract

Retrieval-augmented generation (RAG) connects large language models (LLMs) to external knowledge, but single-round retrieval is often insufficient for complex multi-hop questions. To enhance search capabilities for complex tasks, most existing works integrate multi-round iterative retrieval with reasoning processes via end-to-end training. While these approaches improve problem-solving performance, they still face challenges in task reasoning and model training, especially ambiguous retrieval execution paths and sparse rewards in end-to-end reinforcement learning (RL), which can lead to inaccurate retrieval results and lower performance. We attribute these failures to hierarchical credit entanglement: a single final reward updates planning and execution together, so the model cannot clearly separate plan errors from retrieval errors. We propose APEX-Searcher, which uses a Refining Credit Assignment paradigm: planning is optimized by RL with a plan-level reward, while execution is learned by SFT. Extensive experiments show consistent gains in both multi-hop RAG and task planning across benchmarks.
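The credit entanglement described in the abstract can be illustrated with a toy sketch. This is not the paper's reward design, which the abstract does not specify. The `Episode` class, its `plan_ok` and `retrieval_ok` flags, and both reward functions are hypothetical names introduced only for illustration. The sketch assumes a final answer is correct only when both the plan and the retrieval succeed.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    # Hypothetical labels, assumed known here purely for illustration.
    plan_ok: bool        # the plan decomposed the question correctly
    retrieval_ok: bool   # execution retrieved the right evidence


def final_reward(ep: Episode) -> float:
    """Single end-of-episode reward: 1 only if the whole pipeline succeeded."""
    return 1.0 if ep.plan_ok and ep.retrieval_ok else 0.0


def plan_level_reward(ep: Episode) -> float:
    """Reward that scores the plan alone, independent of execution."""
    return 1.0 if ep.plan_ok else 0.0


# All four combinations of plan and retrieval outcomes.
episodes = [
    Episode(plan_ok=True, retrieval_ok=True),
    Episode(plan_ok=True, retrieval_ok=False),
    Episode(plan_ok=False, retrieval_ok=True),
    Episode(plan_ok=False, retrieval_ok=False),
]

entangled = [final_reward(ep) for ep in episodes]
decoupled = [plan_level_reward(ep) for ep in episodes]

print(entangled)  # [1.0, 0.0, 0.0, 0.0]
print(decoupled)  # [1.0, 1.0, 0.0, 0.0]
```

Under the single final reward, a good plan with a failed retrieval (second episode) scores the same as a bad plan, so the planner receives no signal separating the two cases. A plan-level reward restores that distinction in this toy setting.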

representative citing papers

LLM-Guided Planning for Multi-hop Reasoning over Multimodal Nuclear Regulatory Documents

cs.AI · 2026-06-28 · unverdicted · novelty 4.0

LLM planning agent with dynamic KG state achieves 81.5% accuracy on 200 multi-hop questions from NuScale FSAR documents, outperforming non-planning RAG baselines by up to 38pp.

