Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
Mixed citation behavior. Most common role is background (56%).
citing papers explorer
- Is Dimensionality a Barrier for Retrieval Models?
Dimension d = O(m^{-2} log n) nearly achieves the optimal margin m^rd(+∞, A) for retrieval embeddings, with matching lower bounds showing d = O(k log(n/k)) suffices and is necessary for m = Θ(k^{-1/2}) on k-sparse query matrices.
- Closing the Calibration Gap in Semantic Caching
Introduces P-CHR AUC and CRR metrics to demonstrate that semantic caching model selection is limited by calibration quality rather than ranking performance.
- Lost in a Single Vector: Improving Long-Document Retrieval with Chunk Evidence Aggregation
DICE aggregates independently encoded document chunks into a single vector to reduce evidence dilution in long-document dense retrieval, reporting gains on LongEmbed especially beyond 4k tokens.
- MulTaBench: Benchmarking Multimodal Tabular Learning with Text and Image
MulTaBench is a new collection of 40 image-tabular and text-tabular datasets designed to test target-aware representation tuning in multimodal tabular models.
- NumColBERT: Non-Intrusive Numeracy Injection for Late-Interaction Retrieval Models
NumColBERT improves ColBERT performance on numerical query conditions non-intrusively via gating and contrastive learning, outperforming fine-tuning while matching or exceeding separate text-number scoring methods.
- Code-Switching Information Retrieval: Benchmarks, Analysis, and the Limits of Current Retrievers
Code-switching creates a fundamental performance bottleneck for multilingual retrievers, causing drops of up to 27% on new benchmarks CSR-L and CS-MTEB, with embedding divergence as the key cause and vocabulary expansion insufficient to fix it.
- A Unified Model and Document Representation for On-Device Retrieval-Augmented Generation
A single model unifies retrieval and context compression for on-device RAG via shared representations, matching traditional RAG performance at 1/10 context size with no extra storage.
- A Grounded Evidence-Retrieval Benchmark and Hybrid RAG Framework for Silicon Pixel Detector R&D
Presents the first evidence-grounded retrieval benchmark and hybrid RAG framework for silicon pixel detector R&D, with evaluation showing hybrid sparse-dense retrieval most reliable for evidence recovery.
- LightSTAR: Efficient Visual Document Retrieval via Lightweight Selection with Vision-Adaptive Refinement
LightSTAR achieves state-of-the-art accuracy in visual document retrieval by decomposing the task into LLM-free high-recall candidate selection and vision-adaptive semantic refinement on candidates, cutting end-to-end latency several-fold.
- RSRank: Learning Relevance from Representational Shifts
RSRank learns calibrated relevance scores from alignment between representational shifts induced by candidate documents and those from oracle document sets, enabling zero-threshold filtering.
- Task-Adaptive Embedding Refinement via Test-time LLM Guidance
Test-time LLM feedback refines query embeddings to deliver up to 25% relative gains on zero-shot literature search, intent detection, and related benchmarks.
- Retrieval from Within: An Intrinsic Capability of Attention-Based Models
Attention-based models can retrieve evidence intrinsically by using decoder attention to score and reuse their own pre-encoded chunks, outperforming separate retrieval pipelines on QA benchmarks.
- A Replicability Study of XTR
XTR training does not improve retrieval effectiveness over ColBERT but enhances IVF engine efficiency by flattening token scores to produce more discriminative centroids.
- A Survey of Reasoning-Intensive Retrieval: Progress and Challenges
Surveys reasoning-intensive retrieval (RIR): categorizes RIR benchmarks by domain and modality, proposes a taxonomy for integrating reasoning into retrieval pipelines, and outlines key challenges.
- ClusterRAG: Cluster-Based Collaborative Filtering for Personalized Retrieval-Augmented Generation
ClusterRAG applies density-based clustering to user profiles for collaborative retrieval in personalized RAG and reports best performance on LaMP tasks by combining target and similar-user profiles.
- Entities as Retrieval Signals: A Systematic Study of Coverage, Supervision, and Evaluation in Entity-Oriented Ranking
Entity signals cover only 19.7% of relevant documents on Robust04 and no configuration among 443 systems improves MAP by more than 0.05 in open-world evaluation, despite gains when entities are pre-restricted.
- A Voronoi Cell Formulation for Principled Token Pruning in Late-Interaction Retrieval Models
A Voronoi cell estimation framework in embedding space enables principled token pruning for late-interaction models, reducing index size while retaining retrieval quality.
- Seeing Further and Wider: Joint Spatio-Temporal Enlargement for Micro-Video Popularity Prediction
Proposes a joint spatio-temporal enlargement model for micro-video popularity prediction that uses frame scoring for long sequences and a topology-aware memory bank for unbounded historical associations.
- Reproduction Beyond Benchmarks: ConstBERT and ColBERT-v2 Across Backends and Query Distributions
ConstBERT and ColBERT-v2 reproduce on MS-MARCO but drop 86-97% on long queries because MaxSim cannot filter filler noise, and extra fine-tuning or backend changes do not overcome the architectural constraint.
- Spike Hijacking in Late-Interaction Retrieval
Hard maximum similarity pooling in late-interaction models induces higher patch-level gradient concentration and greater length sensitivity than top-k or softmax alternatives.
- Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
ModernBERT is a new bidirectional encoder model achieving SOTA performance on diverse classification and retrieval benchmarks while offering superior speed and memory efficiency for long-context inference.
- Text Embeddings by Weakly-Supervised Contrastive Pre-training
E5 text embeddings trained with weakly-supervised contrastive pre-training on CCPairs outperform BM25 on BEIR zero-shot and achieve top results on MTEB, beating much larger models.
- Do LLM Embedding Spaces Recover Expert Structure?
Pretrained and fine-tuned Qwen3 embeddings exhibit measurable alignment with an expert symptom matrix via RSA on Reddit mental-health data, strengthened by fine-tuning at fine-grained levels and larger scale, with residual alignment after VAD/LIWC/topic controls.
- Mira-Embeddings-V1: Domain-Adapted Semantic Reranking for Recruitment via LLM-Synthesized Data
Mira-Embeddings-V1 adapts embeddings for recruitment reranking by synthesizing positive and hard-negative samples with LLMs, then applying JD-JD contrastive and JD-CV triplet training plus a BoundaryHead MLP. It lifts Recall@50 from 68.89% to 77.55% and Recall@200 from 59.69% to 70.47%.
- A Hybrid Retrieval and Reranking Framework for Evidence-Grounded Retrieval-Augmented Generation
A hybrid RAG system with retrieval, Cohere reranking, and claim-level LLM judgment achieves 100% grounding accuracy on 200 claims from 25 biomedical queries in a pilot study.
- Overview of the EReL@MIR 2025 Multimodal Document Retrieval Challenge (Track 1)
The EReL@MIR 2025 Track 1 challenge evaluates single systems on two multimodal retrieval tasks and finds that Qwen2-VL decoder-based embedders dominate, with a training-free entry within 0.1 points of the fine-tuned winner.
- Where Does Authorship Signal Emerge in Encoder-Based Language Models?
- Test-Time Compute for Frozen Embedding Models through Agentic Program Search
- Kernel Affine Hull Machines as Compute-Efficient Encoders for Frozen Semantic Spaces