Debrup Das, Sam O’ Nuallain, and Razieh Rahimi

Orion Weller, Kathryn Ricci, Eugene Yang, Andrew Yates, Dawn Lawrie, Benjamin Van Durme · 2025 · arXiv 2502.18418

9 Pith papers cite this work. Polarity classification is still indexing.

9 Pith papers citing it

representative citing papers

FollowTable: A Benchmark for Instruction-Following Table Retrieval

cs.IR · 2026-05-01 · unverdicted · novelty 8.0

FollowTable is the first large-scale benchmark for instruction-following table retrieval, paired with an Instruction Responsiveness Score, showing that existing models fail to adapt to fine-grained constraints beyond topical similarity.

Rethinking Reasoning-Intensive Retrieval: Evaluating and Advancing Retrievers in Agentic Search Systems

cs.CL · 2026-05-05 · unverdicted · novelty 7.0

BRIGHT-Pro and RTriever-Synth advance reasoning-intensive retrieval by adding multi-aspect evidence evaluation and aspect-decomposed synthetic training, with the fine-tuned RTriever-4B showing gains over its base model.

MARVEL: Multimodal Adaptive Reasoning-intensiVe Expand-rerank and retrievaL

cs.IR · 2026-04-08 · unverdicted · novelty 7.0

MARVEL reaches 37.9 nDCG@10 on the MM-BRIGHT benchmark by combining LLM query expansion, a reasoning-enhanced dense retriever, and GPT-4o CoT reranking, beating prior multimodal encoders by 10.3 points.

Verbal-R3: Verbal Reranker as the Missing Bridge between Retrieval and Reasoning

cs.CL · 2026-05-02 · unverdicted · novelty 6.0

Verbal-R3 uses a verbal reranker to generate analytic narratives that guide retrieval and reasoning in LLMs, achieving SOTA results on complex QA benchmarks.

A Survey of Reasoning-Intensive Retrieval: Progress and Challenges

cs.IR · 2026-04-30 · unverdicted · novelty 6.0

A survey that categorizes RIR benchmarks by domain and modality, proposes a taxonomy for integrating reasoning into retrieval pipelines, and outlines key challenges.

HIVE: Query, Hypothesize, Verify An LLM Framework for Multimodal Reasoning-Intensive Retrieval

cs.IR · 2026-04-08 · unverdicted · novelty 6.0

HIVE raises multimodal retrieval nDCG@10 to 41.7 on the MM-BRIGHT benchmark by inserting LLM-driven hypothesis generation and verification between retrieval passes, delivering +9.5 over the best text-only baseline and +14.1 over the best multimodal baseline.

Test-time Scaling over Perception: Resolving the Grounding Paradox in Thinking with Images

cs.CV · 2026-04-13 · unverdicted · novelty 5.0

TTSP resolves the Grounding Paradox by treating perception as a scalable test-time process that generates, filters, and iteratively refines multiple visual exploration traces, outperforming baselines on high-resolution and multimodal reasoning tasks.

Reproducing Adaptive Reranking for Reasoning-Intensive IR

cs.IR · 2026-04-30 · unverdicted · novelty 2.0

Reproducing GAR on BRIGHT shows it boosts reasoning-intensive retrieval effectiveness with low overhead when the reranker's signal quality is strong.

LeanSearch v2: Global Premise Retrieval for Lean 4 Theorem Proving

cs.IR · 2026-05-13

citing papers explorer

Showing 9 of 9 citing papers.

FollowTable: A Benchmark for Instruction-Following Table Retrieval cs.IR · 2026-05-01 · unverdicted · none · ref 54
FollowTable is the first large-scale benchmark for instruction-following table retrieval, paired with an Instruction Responsiveness Score, showing that existing models fail to adapt to fine-grained constraints beyond topical similarity.
Rethinking Reasoning-Intensive Retrieval: Evaluating and Advancing Retrievers in Agentic Search Systems cs.CL · 2026-05-05 · unverdicted · none · ref 11
BRIGHT-Pro and RTriever-Synth advance reasoning-intensive retrieval by adding multi-aspect evidence evaluation and aspect-decomposed synthetic training, with the fine-tuned RTriever-4B showing gains over its base model.
MARVEL: Multimodal Adaptive Reasoning-intensiVe Expand-rerank and retrievaL cs.IR · 2026-04-08 · unverdicted · none · ref 34
MARVEL reaches 37.9 nDCG@10 on the MM-BRIGHT benchmark by combining LLM query expansion, a reasoning-enhanced dense retriever, and GPT-4o CoT reranking, beating prior multimodal encoders by 10.3 points.
Verbal-R3: Verbal Reranker as the Missing Bridge between Retrieval and Reasoning cs.CL · 2026-05-02 · unverdicted · none · ref 51
Verbal-R3 uses a verbal reranker to generate analytic narratives that guide retrieval and reasoning in LLMs, achieving SOTA results on complex QA benchmarks.
A Survey of Reasoning-Intensive Retrieval: Progress and Challenges cs.IR · 2026-04-30 · unverdicted · none · ref 76
A survey that categorizes RIR benchmarks by domain and modality, proposes a taxonomy for integrating reasoning into retrieval pipelines, and outlines key challenges.
HIVE: Query, Hypothesize, Verify An LLM Framework for Multimodal Reasoning-Intensive Retrieval cs.IR · 2026-04-08 · unverdicted · none · ref 37
HIVE raises multimodal retrieval nDCG@10 to 41.7 on the MM-BRIGHT benchmark by inserting LLM-driven hypothesis generation and verification between retrieval passes, delivering +9.5 over the best text-only baseline and +14.1 over the best multimodal baseline.
Test-time Scaling over Perception: Resolving the Grounding Paradox in Thinking with Images cs.CV · 2026-04-13 · unverdicted · none · ref 35
TTSP resolves the Grounding Paradox by treating perception as a scalable test-time process that generates, filters, and iteratively refines multiple visual exploration traces, outperforming baselines on high-resolution and multimodal reasoning tasks.
Reproducing Adaptive Reranking for Reasoning-Intensive IR cs.IR · 2026-04-30 · unverdicted · none · ref 47
Reproducing GAR on BRIGHT shows it boosts reasoning-intensive retrieval effectiveness with low overhead when the reranker's signal quality is strong.
LeanSearch v2: Global Premise Retrieval for Lean 4 Theorem Proving cs.IR · 2026-05-13 · unreviewed · ref 24

Debrup Das, Sam O’ Nuallain, and Razieh Rahimi

fields

years

verdicts

representative citing papers

citing papers explorer