Voorhees

Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, Ellen M. Voorhees · 2019 · arXiv 2003.07820

Baseline reference. 67% of citing Pith papers use this work as a benchmark or comparison.

19 Pith papers citing it

citation-role summary

dataset: 4 · background: 2

citation-polarity summary

use dataset: 4 · background: 2

representative citing papers

Layer-wise Token Compression for Efficient Document Reranking

cs.IR · 2026-05-20 · unverdicted · novelty 7.0 · 2 refs

Layer-wise Token Compression applies adaptive token pooling at middle transformer layers for cross-encoder rerankers, preserving MS MARCO ranking quality while raising QPS by up to 25% on passages and 116% on documents, with added gains on listwise LLM rerankers and a regularizer effect for long inputs.

Led to Mislead: Adversarial Content Injection for Attacks on Neural Ranking Models

cs.IR · 2026-05-02 · unverdicted · novelty 7.0

CRAFT is a supervised LLM framework using retrieval-augmented generation, self-refinement, fine-tuning, and preference optimization to create fluent adversarial content that boosts target ranks in neural ranking models, outperforming baselines on MS MARCO and TREC benchmarks with cross-architecture

ResRank: Unifying Retrieval and Listwise Reranking via End-to-End Joint Training with Residual Passage Compression

cs.IR · 2026-04-24 · conditional · novelty 7.0

ResRank unifies retrieval and listwise reranking by compressing passages to one token each, using residual connections and cosine-similarity scoring, achieving competitive effectiveness on TREC DL and BEIR benchmarks with zero generated tokens.

HeadRank: Decoding-Free Passage Reranking via Preference-Aligned Attention Heads

cs.IR · 2026-04-19 · unverdicted · novelty 7.0

HeadRank lifts preference optimization into attention space via entropy-regularized head selection and distribution regularizers to sharpen discriminability for efficient listwise reranking.

Scaling Laws for Cross-Encoder Reranking

cs.IR · 2026-03-05 · unverdicted · novelty 7.0

Cross-encoder reranker performance scales predictably via power laws with model size and training exposure, allowing accurate forecasts for 400M and 1B models and data-heavy compute allocation.

GAIA: a benchmark for General AI Assistants

cs.CL · 2023-11-21 · unverdicted · novelty 7.0

GAIA benchmark shows humans at 92% accuracy on simple real-world questions far outperform current AI systems at 15%, proposing this gap as a key milestone for general AI.

DiffRetriever: Parallel Representative Tokens for Retrieval with Diffusion Language Models

cs.IR · 2026-05-08 · unverdicted · novelty 6.0

DiffRetriever uses parallel masked tokens in diffusion LMs for retrieval representations, outperforming DiffEmbed and other baselines on aggregate effectiveness while supporting efficient multi-representation matching.

RAQG-QPP: Query Performance Prediction with Retrieved Query Variants and Retrieval Augmented Query Generation

cs.IR · 2026-04-29 · unverdicted · novelty 6.0

Retrieved query variants from logs combined with LLM-augmented generation improve unsupervised QPP accuracy by up to 30% for neural rankers on TREC DL'19 and DL'20.

Data, Not Model: Explaining Bias toward LLM Texts in Neural Retrievers

cs.IR · 2026-04-07 · unverdicted · novelty 6.0

Bias toward LLM texts in neural retrievers arises from artifact imbalances between positive and negative documents in training data that are absorbed during contrastive learning.

Formalized Information Needs Improve Large-Language-Model Relevance Judgments

cs.IR · 2026-04-05 · conditional · novelty 6.0

Synthetically formalizing information needs into topics with descriptions and narratives improves LLM relevance assessor agreement with humans and reduces over-labeling of relevant documents on TREC Deep Learning and Robust04.

Where Relevance Emerges: A Layer-Wise Study of Internal Attention for Zero-Shot Re-Ranking

cs.IR · 2026-02-26 · unverdicted · novelty 6.0

Internal attention in LLMs shows a bell-curve relevance distribution across layers, enabling Selective-ICR that cuts inference latency 30-50% and lets an 8B zero-shot model match 14B RL re-rankers on BRIGHT.

Access Paths for Efficient Ordering with Large Language Models

cs.DB · 2025-08-30 · unverdicted · novelty 6.0

Introduces the LLM ORDER BY semantic operator with algorithmic improvements, a semantic-aware external merge sort, and a budget-aware optimizer that selects near-optimal access paths for LLM-based ordering.

RankFlow: A Multi-Role Collaborative Reranking Workflow Utilizing Large Language Models

cs.IR · 2025-02-02 · unverdicted · novelty 6.0

RankFlow deploys four LLM roles in sequence to rewrite queries, generate pseudo-answers, summarize passages, and rerank candidates, outperforming prior methods on TREC-DL, BEIR, and NovelEval.

A Reproducibility Study of LLM-Based Query Reformulation

cs.IR · 2026-04-30 · unverdicted · novelty 5.0

A unified evaluation finds LLM query reformulation gains are strongly conditioned on retrieval paradigm, do not consistently transfer to neural retrievers, and are not uniformly improved by larger LLMs.

Efficient Listwise Reranking with Compressed Document Representations

cs.IR · 2026-04-29 · unverdicted · novelty 5.0

RRK compresses documents to multi-token embeddings for efficient listwise reranking, enabling an 8B model to achieve 3x-18x speedups over smaller models with comparable or better effectiveness.

Beyond Hard Negatives: The Importance of Score Distribution in Knowledge Distillation for Dense Retrieval

cs.IR · 2026-04-06 · unverdicted · novelty 5.0

Stratified sampling preserving teacher score distribution outperforms hard-negative mining as a robust baseline for knowledge distillation in dense retrieval.

Statistical Foundations of DIME: Risk Estimation for Practical Index Selection

cs.IR · 2026-01-09 · unverdicted · novelty 5.0

A statistical risk estimation method enables query-specific dimension selection in dense embeddings, achieving equivalent effectiveness with about 50% smaller embeddings at inference time.

Hypencoder Revisited: Reproducibility and Analysis of Non-Linear Scoring for First-Stage Retrieval

cs.IR · 2026-04-29 · conditional · novelty 3.0

Reproducibility study confirms Hypencoder's non-linear query-specific scoring improves retrieval over bi-encoders on standard benchmarks but standard methods remain faster and hard-task results are mixed due to implementation issues.

Dynamic Ranked List Truncation for Reranking Pipelines via LLM-generated Reference-Documents

cs.IR · 2026-04-10

citing papers explorer

Hypencoder Revisited: Reproducibility and Analysis of Non-Linear Scoring for First-Stage Retrieval

cs.IR · 2026-04-29 · conditional · none · ref 11

Dynamic Ranked List Truncation for Reranking Pipelines via LLM-generated Reference-Documents

cs.IR · 2026-04-10 · unreviewed · ref 4
