Hybrid Deep Searcher: Scalable Parallel and Sequential Search Reasoning

Dahyun Lee; Dayoon Ko; Gunhee Kim; Haeju Park; Jihyuk Kim; Kyungjae Lee; Moontae Lee; Sohyeon Kim; Yongrae Jo

arxiv: 2508.19113 · v3 · pith:OYBNBUGQnew · submitted 2025-08-26 · 💻 cs.AI

Dayoon Ko, Jihyuk Kim, Haeju Park, Sohyeon Kim, Dahyun Lee, Yongrae Jo, Gunhee Kim, Moontae Lee, Kyungjae Lee

classification 💻 cs.AI

keywords: search, reasoning, parallel, sequential, aggregation, structured, approaches, deep

Large reasoning models (LRMs) combined with retrieval-augmented generation (RAG) have enabled deep research agents capable of multi-step reasoning with external knowledge retrieval. However, we find that existing approaches rarely demonstrate test-time search scaling. Methods that extend reasoning through single-query sequential search suffer from limited evidence coverage, while approaches that generate multiple independent queries per step often lack structured aggregation, hindering deeper sequential reasoning. We propose a hybrid search strategy to address these limitations. We introduce HybridDeepSearcher, a structured search agent that integrates parallel query expansion with explicit evidence aggregation before advancing to deeper sequential reasoning. To supervise this behavior, we introduce HDS-QA, a novel dataset that guides models to combine broad parallel search with structured aggregation through supervised reasoning-query-retrieval trajectories containing parallel sub-queries. Across five benchmarks, HybridDeepSearcher significantly outperforms the state-of-the-art, improving F1 scores by +15.9 on FanOutQA and +9.2 on a subset of BrowseComp. Further analysis shows its consistent test-time search scaling: performance improves as additional search turns or calls are allowed, while competing methods plateau.
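
The control flow the abstract describes (several parallel sub-queries per turn, an explicit aggregation step, then a deeper sequential turn) might be sketched roughly as below. This is an illustrative sketch only, not the authors' implementation: `hybrid_search`, `propose_queries`, `retrieve`, `aggregate`, and the toy corpus are all hypothetical names introduced here, and a real agent would back them with a language model and a search API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def hybrid_search(
    question: str,
    propose_queries: Callable[[str, List[str]], List[str]],
    retrieve: Callable[[str], str],
    aggregate: Callable[[List[Tuple[str, str]]], str],
    max_turns: int = 3,
) -> List[str]:
    """Run up to max_turns sequential turns; each turn fans out parallel queries.

    All three callables are hypothetical stand-ins, not an API from the paper.
    """
    notes: List[str] = []  # one aggregated summary per completed turn
    for _ in range(max_turns):
        # Sequential step: the next queries depend on evidence gathered so far.
        queries = propose_queries(question, notes)
        if not queries:  # the planner signals that no further search is needed
            break
        # Parallel step: independent sub-queries are retrieved concurrently.
        with ThreadPoolExecutor(max_workers=len(queries)) as pool:
            results = list(pool.map(retrieve, queries))  # order is preserved
        # Aggregation step: merge this turn's evidence before going deeper.
        notes.append(aggregate(list(zip(queries, results))))
    return notes


# Toy stand-ins (hypothetical) so the sketch runs without a model or search API.
CORPUS = {
    "members of band X": "Ann; Bob",
    "birthplace of Ann": "Oslo",
    "birthplace of Bob": "Lima",
}


def toy_propose(question: str, notes: List[str]) -> List[str]:
    if not notes:
        return ["members of band X"]
    if len(notes) == 1:
        # Fan out: one sub-query per entity found in the previous turn.
        return [f"birthplace of {name}" for name in notes[0].split("; ")]
    return []


def toy_retrieve(query: str) -> str:
    return CORPUS.get(query, "")


def toy_aggregate(pairs: List[Tuple[str, str]]) -> str:
    return "; ".join(doc for _, doc in pairs)


notes = hybrid_search(
    "Where were the members of band X born?",
    toy_propose,
    toy_retrieve,
    toy_aggregate,
)
print(notes)
```

In this toy run the first turn issues a single query, and the second turn fans out one query per band member found in the first turn's aggregated note. Raising `max_turns` or the number of queries per turn corresponds loosely to the "search turns or calls" budget the abstract refers to.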

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Erase to Improve: Erasable Reinforcement Learning for Search-Augmented LLMs
cs.CL · 2025-10 · unverdicted · novelty 5.0

ERL trains LLMs to erase faulty reasoning steps and regenerate them in place, yielding gains of up to 8.48% EM on multi-hop QA benchmarks like HotpotQA.