ISBN 979-8-89176-251-0
3 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CL 3
verdicts
UNVERDICTED 3
representative citing papers
citing papers explorer
- HiPRAG: Hierarchical Process Rewards for Efficient Agentic Retrieval Augmented Generation
HiPRAG adds hierarchical process rewards to RL training for agentic RAG, reducing over-search to 2.3% and achieving 65.4-67.2% accuracy on seven QA benchmarks across 3B and 7B models.
- When Confidence Takes the Wrong Path: Diagnosing Retrieval-State Lock-In in RAG
Retrieval-state lock-in causes zero-dispersion errors in 42% of KG-RAG and 59% of dense-retrieval failures; a three-object check rule reaches 91.9% pooled precision at 7.7% coverage.
- MASH: Modeling Abstention via Selective Help-Seeking
MASH uses RL with a pay-per-search reward to make LLMs seek external help only when needed, improving multi-hop QA accuracy by 7.6% and enabling competitive abstention without pre-defined knowledge boundaries.