MASH: Modeling Abstention via Selective Help-Seeking
LLMs cannot reliably recognize their parametric knowledge boundaries and often hallucinate answers to out-of-boundary questions. In this paper, we introduce MASH (Modeling Abstention via Selective Help-seeking), a training framework that readily extracts abstentions from LLMs. Our key idea is that any external help-seeking by an LLM, i.e., search tool use, can serve as a proxy for abstention if the external help (search) is appropriately penalized while answer accuracy is rewarded. MASH operationalizes this idea using reinforcement learning with a pay-per-search reward. We run experiments on three knowledge-intensive QA datasets. Our results show that MASH substantially improves upon the selective help-seeking performance of prior efficient search approaches; on multi-hop datasets, it improves answer accuracy by 7.6%. Furthermore, MASH demonstrates strong off-the-shelf abstention performance, showcasing behavior competitive with prior abstention methods that additionally require predetermining model knowledge boundaries to construct training data. Overall, we show MASH training effectively aligns search tool use with parametric knowledge, which can be successfully leveraged for making abstention decisions and efficient search tool use.
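The pay-per-search idea can be illustrated with a minimal reward sketch. This is a hypothetical illustration, not the paper's implementation: the function name, the per-search cost, and the accuracy reward value below are assumptions chosen only to show the trade-off the abstract describes.

```python
# Hypothetical sketch of a pay-per-search reward in the spirit of MASH.
# The constants (search_cost=0.2, accuracy_reward=1.0) are illustrative
# assumptions, not values from the paper.

def pay_per_search_reward(is_correct: bool, num_searches: int,
                          search_cost: float = 0.2,
                          accuracy_reward: float = 1.0) -> float:
    """Reward answer accuracy while charging a fixed cost per search call.

    When search would not turn a wrong answer into a right one, the
    penalty pushes the policy to answer from parametric knowledge; a
    decision to search therefore acts as a proxy for "I cannot answer
    this on my own," i.e., an abstention signal.
    """
    reward = accuracy_reward if is_correct else 0.0
    return reward - search_cost * num_searches

# A correct answer with no search outscores a correct answer found via
# search, which in turn outscores an incorrect answer.
print(pay_per_search_reward(True, 0))
print(pay_per_search_reward(True, 2))
print(pay_per_search_reward(False, 0))
```

Under such a reward, the RL objective only pays for searches that actually flip answers from wrong to right, which is what aligns tool use with the model's knowledge boundary.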
Forward citations
Cited by 2 Pith papers
- Don't Start What You Can't Finish: A Counterfactual Audit of Support-State Triage in LLM Agents. LLM agents overcommit on non-complete tasks at 41.7% unless given explicit support-state categories, which raise typed deferral accuracy to 91.7%.
- KARL: Mitigating Hallucinations in LLMs via Knowledge-Boundary-Aware Reinforcement Learning. KARL uses a knowledge-boundary-aware reward from within-group response statistics and two-stage RL training to align LLM abstention with actual knowledge, yielding a better accuracy-hallucination trade-off on benchmarks.