Recognition: 2 theorem links · Lean Theorem
The Faiss library
Pith reviewed 2026-05-12 01:43 UTC · model grok-4.3
The pith
The Faiss library supplies indexing methods and primitives for vector similarity search in large embedding collections.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The Faiss library is dedicated to vector similarity search, a core functionality of vector databases. Faiss is a toolkit of indexing methods and related primitives used to search, cluster, compress and transform vectors.
What carries the argument
Indexing methods and related primitives that enable search, clustering, compression, and transformation of vectors.
If this is right
- Vector databases can scale to larger embedding collections by selecting from Faiss indexing options that balance speed, memory, and accuracy.
- Applications requiring vector clustering or compression can reuse the same primitives already tuned for search.
- Hardware-specific optimizations in Faiss allow performance gains on common CPU and GPU setups without custom code.
- New AI systems can integrate Faiss primitives directly for embedding management rather than building search layers from scratch.
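The primitive underlying all of these points is k-nearest-neighbor search over a matrix of embeddings. A minimal NumPy sketch of the exact (brute-force) variant — the operation that Faiss's flat indexes implement with hardware-optimized kernels — might look like this; it is an illustration of the contract, not Faiss's actual implementation:

```python
import numpy as np

def knn_search(xb, xq, k):
    """Exact k-NN under squared L2 distance.

    xb: (n, d) database embeddings, xq: (m, d) queries.
    Returns (distances, indices), each of shape (m, k), sorted by
    ascending distance -- the same contract as a flat index.
    """
    # ||q - b||^2 = ||q||^2 - 2 q.b + ||b||^2, computed as one matrix product
    d2 = (
        (xq ** 2).sum(axis=1, keepdims=True)
        - 2.0 * xq @ xb.T
        + (xb ** 2).sum(axis=1)
    )
    idx = np.argsort(d2, axis=1)[:, :k]
    return np.take_along_axis(d2, idx, axis=1), idx

rng = np.random.default_rng(0)
xb = rng.standard_normal((1000, 64)).astype(np.float32)
# queries are slightly perturbed copies of the first five database vectors
xq = xb[:5] + 0.001 * rng.standard_normal((5, 64)).astype(np.float32)
dist, idx = knn_search(xb, xq, k=3)
print(idx[:, 0])  # each query's nearest neighbor is its source vector
```

The trade-off space the paper describes starts from here: approximate indexes trade exactness of this result for speed and memory.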
Where Pith is reading between the lines
- Teams choosing a vector search backend may compare Faiss against alternatives by replicating the paper's benchmark setup on their own data.
- The library's modularity suggests it could serve as a base for domain-specific extensions such as time-series embeddings or multimodal vectors.
- As embedding dimensions and collection sizes continue to grow, the trade-off analyses provide a starting point for predicting when index rebuilding becomes necessary.
Load-bearing premise
The described trade-off space, design principles, benchmarks, and selected applications accurately represent the library's practical performance and broad applicability without significant unstated limitations in real deployments.
What would settle it
A deployment on real data where measured search latency, recall, or memory usage deviates substantially from the reported benchmarks for the corresponding index types and dataset sizes.
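The deviation test described above reduces to measuring recall@k of an approximate index against exact search on one's own data. A hedged sketch of that measurement, using pure NumPy with a toy IVF-style coarse partition standing in for a real Faiss index (the partitioning here is a simplified stand-in, not Faiss code):

```python
import numpy as np

def exact_knn(xb, xq, k):
    """Ground-truth neighbors by brute force."""
    d2 = ((xq[:, None, :] - xb[None, :, :]) ** 2).sum(-1)
    return np.argsort(d2, axis=1)[:, :k]

def ivf_knn(xb, xq, k, centroids, assign, nprobe):
    """Toy IVF-style search: visit only the `nprobe` closest partitions."""
    out = np.empty((len(xq), k), dtype=np.int64)
    for i, q in enumerate(xq):
        cd = ((centroids - q) ** 2).sum(-1)
        probe = np.argsort(cd)[:nprobe]
        cand = np.flatnonzero(np.isin(assign, probe))
        d2 = ((xb[cand] - q) ** 2).sum(-1)
        out[i] = cand[np.argsort(d2)[:k]]
    return out

rng = np.random.default_rng(0)
xb = rng.standard_normal((2000, 32)).astype(np.float32)
xq = rng.standard_normal((50, 32)).astype(np.float32)

# Coarse quantizer: a few Lloyd iterations of k-means.
ncell = 16
centroids = xb[rng.choice(len(xb), ncell, replace=False)].copy()
for _ in range(5):
    assign = ((xb[:, None, :] - centroids[None]) ** 2).sum(-1).argmin(1)
    for c in range(ncell):
        if (assign == c).any():
            centroids[c] = xb[assign == c].mean(0)

k = 10
gt = exact_knn(xb, xq, k)
for nprobe in (1, 4, 16):
    approx = ivf_knn(xb, xq, k, centroids, assign, nprobe)
    recall = np.mean([len(set(a) & set(g)) / k for a, g in zip(approx, gt)])
    print(f"nprobe={nprobe:2d}  recall@{k}={recall:.2f}")
```

Probing all `ncell` partitions reproduces exact search (recall 1.0); the interesting deployments are the lower `nprobe` settings, where measured recall can be compared directly against the paper's reported operating points.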
Original abstract
Vector databases typically manage large collections of embedding vectors. Currently, AI applications are growing rapidly, and so is the number of embeddings that need to be stored and indexed. The Faiss library is dedicated to vector similarity search, a core functionality of vector databases. Faiss is a toolkit of indexing methods and related primitives used to search, cluster, compress and transform vectors. This paper describes the trade-off space of vector search and the design principles of Faiss in terms of structure, approach to optimization and interfacing. We benchmark key features of the library and discuss a few selected applications to highlight its broad applicability.
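The "compress" role the abstract names is typified by product quantization, one of Faiss's core building blocks: split each vector into subvectors, quantize each against a small codebook, and store only the code indices. A rough NumPy sketch under simplifying assumptions — random codebooks stand in for the k-means-trained ones a real quantizer would use:

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, ksub = 32, 4, 256      # dim, subquantizers, centroids per subquantizer
dsub = d // m

x = rng.standard_normal((500, d)).astype(np.float32)
# In practice each codebook comes from k-means on training data;
# random codebooks keep this sketch short.
codebooks = rng.standard_normal((m, ksub, dsub)).astype(np.float32)

def pq_encode(x):
    """Compress (n, d) float vectors to (n, m) uint8 codes."""
    codes = np.empty((len(x), m), dtype=np.uint8)
    for j in range(m):
        sub = x[:, j * dsub:(j + 1) * dsub]
        d2 = ((sub[:, None, :] - codebooks[j][None]) ** 2).sum(-1)
        codes[:, j] = d2.argmin(1)  # nearest centroid per subvector
    return codes

def pq_decode(codes):
    """Reconstruct approximate vectors from their codes."""
    return np.concatenate(
        [codebooks[j][codes[:, j]] for j in range(m)], axis=1
    )

codes = pq_encode(x)
x_hat = pq_decode(codes)
print(codes.shape, codes.dtype)  # 4 bytes per vector instead of 128
```

The same encode/decode machinery is what lets one toolkit serve search, clustering, and compression at once: distances can be evaluated directly on the compact codes.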
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript describes the Faiss library as a toolkit for vector similarity search, clustering, compression, and transformation of embedding vectors in vector databases. It outlines the trade-off space of vector search, details design principles concerning library structure, optimization strategies, and user interfacing, presents benchmarks for key features, and illustrates selected applications to demonstrate broad applicability in AI systems.
Significance. If the descriptions and benchmarks hold, the paper provides a useful reference for the design and performance characteristics of a widely adopted open-source library central to modern embedding-based applications. It explicitly addresses practical trade-offs rather than advancing new algorithms, which is a strength for practitioners needing to navigate indexing choices at scale.
Minor comments (2)
- [Abstract] The claim of 'broad applicability' would be strengthened by a brief statement of the scale (vector dimensionality and dataset size) at which the selected applications were tested.
- The paper would benefit from an explicit statement of the Faiss version and commit hash corresponding to the reported benchmarks to improve reproducibility.
Simulated Author's Rebuttal
We thank the referee for their positive review and recommendation to accept the manuscript. We are pleased that the paper is viewed as a useful reference for the design and performance characteristics of Faiss in the context of vector databases and embedding-based AI applications.
Circularity Check
No significant circularity: descriptive library overview
full rationale
The paper is a factual description of the Faiss library's design, indexing methods, benchmarks, and applications. It contains no derivations, equations, predictions, or quantitative claims that could reduce to fitted inputs or self-referential definitions. The content is self-contained as an overview of existing software without load-bearing theoretical steps or self-citation chains that substitute for independent evidence.
Forward citations
Cited by 42 Pith papers
-
SalesSim: Benchmarking and Aligning Multimodal Language Models as Retail User Simulators
SalesSim benchmarks MLLMs as retail user simulators, finds gaps in persona adherence and over-persuasion, and introduces UserGRPO RL to raise decision alignment by 13.8%.
-
Text Corpora as Concept Fields: Black-Box Hallucination and Novelty Measurement
Concept Fields model text corpora as local Gaussian drift fields in embedding space to score sentence transitions for hallucination detection and novelty via standardized deviation.
-
Beyond Rules: LLM-Powered Linting for Quantum Programs
LLM-powered linters with CoT and RAG detect quantum programming problems more accurately than rule-based LintQ on Qiskit code, with higher precision, recall, and F1 scores.
-
Contrastive Privacy: A Semantic Approach to Measuring Privacy of AI-based Sanitization
Contrastive privacy is a new corpus-contrast test for semantic privacy in AI-sanitized media that uses latent concept measures and requires no manual labeling.
-
Why Mean Pooling Works: Quantifying Second-Order Collapse in Text Embeddings
Modern text encoders resist second-order collapse under mean pooling because token embeddings concentrate tightly within texts, and this resistance correlates with stronger downstream performance.
-
CORAL: Adaptive Retrieval Loop for Culturally-Aligned Multilingual RAG
CORAL uses an agentic loop to adaptively refine retrieval corpora and queries in multilingual RAG based on evidence critique, yielding up to 3.58 percentage point accuracy gains on low-resource language cultural QA be...
-
AsmRAG: LLM-Driven Malware Detection by Retrieving Functionally Similar Assembly Code
AsmRAG detects malware at 96% F1 and attributes families at 95% F1 by retrieving functionally similar assembly code via LLM embeddings and density-weighted anchor selection, remaining robust to metamorphic obfuscation.
-
MCI: A Maximal Clique Index for Efficient Arbitrary-Filtered Approximate Nearest Neighbor Search
MCI approximates dense nearest neighbor graphs via maximal clique covers and progressive local densification to support fast arbitrary-filtered approximate nearest neighbor search with reduced space.
-
Structure Guided Retrieval-Augmented Generation for Factual Queries
SG-RAG frames retrieval as subgraph matching to ensure LLMs meet every condition in factual queries and reports large gains over baselines on a new 120k-pair ERQA dataset.
-
DocQAC: Adaptive Trie-Guided Decoding for Effective In-Document Query Auto-Completion
Adaptive trie-guided decoding with document context and tunable penalties improves in-document query auto-completion, outperforming baselines and larger models like LLaMA-3 on seen queries.
-
HORIZON: A Benchmark for In-the-wild User Behaviour Modeling
HORIZON creates a cross-domain, long-horizon user modeling benchmark from Amazon Reviews that tests generalization across time, domains, and unseen users, exposing gaps in sequential and LLM-based recommendation models.
-
On the Robustness of LLM-Based Dense Retrievers: A Systematic Analysis of Generalizability and Stability
LLM-based dense retrievers generalize better when instruction-tuned but pay a specialization tax when optimized for reasoning; they resist typos and corpus poisoning better than encoder-only baselines yet remain vulne...
-
SGA-MCTS: Decoupling Planning from Execution via Training-Free Atomic Experience Retrieval
SGA-MCTS distills MCTS trajectories into de-lexicalized State-Goal-Action atoms for hybrid retrieval, enabling open-weight LLMs to match frontier model performance on complex planning without fine-tuning.
-
JZ-Tree: GPU friendly neighbour search and friends-of-friends with dual tree walks in JAX plus CUDA
JZ-Tree introduces a flattened Morton plane-based tree hierarchy enabling collaborative dual-tree walks that deliver more than 10x faster exact k-NN search and FoF clustering on GPUs for N greater than 10 million part...
-
Can You Trust the Vectors in Your Vector Database? Black-Hole Attack from Embedding Space Defects
Injecting a few malicious vectors near the centroid exploits centrality-driven hubness in high-dimensional embeddings, causing them to dominate top-k retrievals in up to 99.85% of cases.
-
Tencent Advertising Algorithm Challenge 2025: All-Modality Generative Recommendation
Releases TencentGR-1M and TencentGR-10M datasets with baselines for all-modality generative recommendation in advertising, including weighted evaluation for conversions.
-
Distance Comparison Operations Are Not Silver Bullets in Vector Similarity Search: A Benchmark Study on Their Merits and Limits
Benchmark study shows DCO methods for vector similarity search are not reliable silver bullets due to high sensitivity to data properties and hardware, making them unsuitable for production deployment.
-
HiDream-O1-Image: A Natively Unified Image Generative Foundation Model with Pixel-level Unified Transformer
A pixel-space Diffusion Transformer with Unified Transformer architecture unifies image generation, editing, and personalization in an end-to-end model that maps all inputs to a shared token space and scales from 8B t...
-
Similar Pattern Annotation via Retrieval Knowledge for LLM-Based Test Code Fault Localization
SPARK improves LLM-based test code fault localization by retrieving similar past faults and selectively annotating suspicious lines in new failing tests.
-
GASim: A Graph-Accelerated Hybrid Framework for Social Simulation
GASim accelerates hybrid LLM-ABM social simulations via graph-optimized memory, graph message passing, and entropy-driven agent grouping, delivering 9.94x speedup and under 20% token use while aligning with real-world trends.
-
Text Corpora as Concept Fields: Black-Box Hallucination and Novelty Measurement
Concept Fields model text corpora as local Gaussian drift fields in embedding space to score sentence transitions for groundedness and novelty without model internals.
-
Kernel Affine Hull Machines for Compute-Efficient Query-Side Semantic Encoding
Kernel Affine Hull Machines map lexical features to semantic embeddings via RKHS and least-mean-squares, outperforming adapters in reconstruction and retrieval metrics while reducing latency 8.5-fold on a legal benchmark.
-
Robust Multimodal Recommendation via Graph Retrieval-Enhanced Modality Completion
GRE-MC retrieves relevant subgraphs and uses a graph transformer plus sparse codebook to complete missing modalities, outperforming prior methods on recommendation benchmarks.
-
Efficient Multivector Retrieval with Token-Aware Clustering and Hierarchical Indexing
TACHIOM speeds up multivector retrieval by up to 247x in clustering and 9.8x in retrieval on MS-MARCOv1 and LoTTE benchmarks using token-distribution-aware centroid allocation and a graph-plus-PQ index, with comparabl...
-
TwinGate: Stateful Defense against Decompositional Jailbreaks in Untraceable Traffic via Asymmetric Contrastive Learning
TwinGate deploys a stateful dual-encoder system with asymmetric contrastive learning to detect decompositional jailbreaks in untraceable LLM traffic at high recall and low false-positive rate with negligible latency.
-
DenseStep2M: A Scalable, Training-Free Pipeline for Dense Instructional Video Annotation
A scalable training-free pipeline using video segmentation, filtering, and off-the-shelf multimodal models creates DenseStep2M, a dataset of 100K videos and 2M detailed instructional steps that improves dense captioni...
-
Onyx: Cost-Efficient Disk-Oblivious ANN Search
Onyx inverts ANN-ORAM optimization priorities with a compact pruning representation and locality-aware shallow tree to deliver 1.7-9.9x lower cost and 2.3-12.3x lower latency for disk-oblivious ANN search.
-
Back into Plato's Cave: Examining Cross-modal Representational Convergence at Scale
Evidence for cross-modal representational convergence weakens substantially at scale and in realistic many-to-many settings, indicating models learn rich but distinct representations.
-
Efficient Retrieval Scaling with Hierarchical Indexing for Large Scale Recommendation
A jointly learned hierarchical index with cross-attention and residual quantization scales exact retrieval in foundational recommendation models, deployed at Meta with additional performance from test-time training on...
-
TSUBASA: Improving Long-Horizon Personalization via Evolving Memory and Self-Learning with Context Distillation
TSUBASA improves long-horizon personalization in LLMs via dynamic memory evolution for writing and context-distillation self-learning for reading, outperforming Mem0 and Memory-R1 on Qwen-3 benchmarks while reducing t...
-
Towards Knowledgeable Deep Research: Framework and Benchmark
The paper introduces the KDR task, HKA multi-agent framework, and KDR-Bench to enable LLM agents to integrate structured knowledge into deep research reports, with experiments showing outperformance over prior agents.
-
Retrieve-then-Adapt: Retrieval-Augmented Test-Time Adaptation for Sequential Recommendation
ReAd retrieves collaboratively similar items, builds an augmentation embedding via a lightweight module, and fuses it to refine sequential recommendation predictions, outperforming baselines on five datasets.
-
Memory in the LLM Era: Modular Architectures and Strategies in a Unified Framework
A unified framework for LLM agent memory is benchmarked, with a new hybrid method outperforming state-of-the-art on standard tasks.
-
Reddening maps of the Magellanic Clouds using spectral energy distribution fitting of red giants
Reddening maps of the LMC and SMC are produced via SED fitting of RGB stars from SMASH and VMC photometry, yielding mean E(B-V) of 0.076 mag and 0.058 mag with relative spatial structure stable across three stellar at...
-
One Single Hub Text Breaks CLIP: Identifying Vulnerabilities in Cross-Modal Encoders via Hubness
A single hub text can unreasonably match many images in CLIP-based similarity, exposing vulnerabilities in cross-modal encoders for caption evaluation and retrieval.
-
Translating Under Pressure: Domain-Aware LLMs for Crisis Communication
Domain-adapted LLMs with preference optimization for CEFR A2 English improve readability in crisis translations while preserving adequacy.
-
Geometric Analysis of Self-Supervised Vision Representations for Semantic Image Retrieval
Anisotropic self-supervised vision representations degrade approximate nearest-neighbor retrieval performance while more isotropic ones with local purity improve it.
-
Taming GPU Underutilization via Static Partitioning and Fine-grained CPU Offloading
Static MIG partitioning cuts GPU underutilization in scientific workloads but leaves interference and coarse-grained mismatches; a Nvlink-C2C offloading scheme is introduced to bridge those gaps.
-
Do We Need Bigger Models for Science? Task-Aware Retrieval with Small Language Models
Task-aware retrieval with small models partially compensates for reduced scale in scholarly QA but model capacity remains important for complex reasoning.
-
ScaleGANN: Accelerate Large-Scale ANN Indexing by Cost-effective Cloud GPUs
ScaleGANN accelerates graph-based ANN index construction up to 9x faster and 6x cheaper than DiskANN by using divide-and-merge on distributed low-cost spot GPUs with optimized partitioning and a cost-aware scheduler.
-
Peerispect: Claim Verification in Scientific Peer Reviews
Peerispect extracts claims from peer reviews, retrieves evidence from the manuscript, and verifies them via NLI in a modular pipeline with a visual interface.
-
Hypencoder Revisited: Reproducibility and Analysis of Non-Linear Scoring for First-Stage Retrieval
Reproducibility study confirms Hypencoder's non-linear query-specific scoring improves retrieval over bi-encoders on standard benchmarks but standard methods remain faster and hard-task results are mixed due to implem...
Reference graph
Works this paper leans on
-
[1]
Nearest neighbor search with compact codes: A decoder perspective
Kenza Amara, Matthijs Douze, Alexandre Sablayrolles, and Hervé Jégou. Nearest neighbor search with compact codes: A decoder perspective. In ICMR, 2022.
-
[2]
Optimal data-dependent hashing for approximate near neighbors
Alexandr Andoni and Ilya Razenshteyn. Optimal data-dependent hashing for approximate near neighbors. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, pages 793–801, 2015.
-
[3]
Martin Aumüller, Erik Bernhardsson, and Alexander Faithfull. ANN-Benchmarks: A benchmarking tool for approximate nearest neighbor algorithms. Information Systems, 87:101374, 2020.
-
[4]
Additive quantization for extreme vector compression
Artem Babenko and Victor Lempitsky. Additive quantization for extreme vector compression. In Conference on Computer Vision and Pattern Recognition, 2014.
-
[5]
Tree quantization for large-scale similarity search and classification
Artem Babenko and Victor Lempitsky. Tree quantization for large-scale similarity search and classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4240–4248, 2015.
-
[6]
Efficient indexing of billion-scale datasets of deep descriptors
Artem Babenko and Victor Lempitsky. Efficient indexing of billion-scale datasets of deep descriptors. In Conference on Computer Vision and Pattern Recognition, 2016.
-
[7]
Speeding up the Xbox recommender system using a Euclidean transformation for inner-product spaces
Yoram Bachrach, Yehuda Finkelstein, Ran Gilad-Bachrach, Liran Katzir, Noam Koenigstein, Nir Nice, and Ulrich Paquet. Speeding up the Xbox recommender system using a Euclidean transformation for inner-product spaces. In Proceedings of the 8th ACM Conference on Recommender Systems, pages 257–264, 2014.
-
[8]
Dedrift: Robust similarity search under content drift
Dmitry Baranchuk, Matthijs Douze, Yash Upadhyay, and I Zeki Yalniz. Dedrift: Robust similarity search under content drift. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 11026–11035, 2023.
-
[9]
SeamlessM4T: Massively multilingual & multimodal machine translation
Loïc Barrault, Yu-An Chung, Mariano Cora Meglioli, David Dale, Ning Dong, Paul-Ambroise Duquenne, Hady Elsahar, Hongyu Gong, Kevin Heffernan, John Hoffman, et al. SeamlessM4T: Massively multilingual & multimodal machine translation. arXiv preprint arXiv:2308.11596, 2023.
-
[10]
Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 5:135–146, 2017.
-
[11]
Off the beaten path: Let's replace term-based retrieval with k-NN search
Leonid Boytsov, David Novak, Yury Malkov, and Eric Nyberg. Off the beaten path: Let's replace term-based retrieval with k-NN search. In Proceedings of the 25th ACM International Conference on Information and Knowledge Management, pages 1099–1108, 2016.
-
[12]
Sebastian Bruch, Franco Maria Nardini, Amir Ingber, and Edo Liberty. An approximate algorithm for maximum inner product search over streaming sparse vectors. arXiv preprint arXiv:2301.10622, 2023.
-
[13]
Zhangjie Cao, Mingsheng Long, Jianmin Wang, and Philip S. Yu. HashNet: Deep learning to hash by continuation. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), October 2017.
-
[14]
Deep clustering for unsupervised learning of visual features
Mathilde Caron, Piotr Bojanowski, Armand Joulin, and Matthijs Douze. Deep clustering for unsupervised learning of visual features. In Proceedings of the European Conference on Computer Vision (ECCV), pages 132–149, 2018.
-
[15]
Emerging properties in self-supervised vision transformers
Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, and Armand Joulin. Emerging properties in self-supervised vision transformers. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 9650–9660, 2021.
-
[16]
Similarity estimation techniques from rounding algorithms
Moses Charikar. Similarity estimation techniques from rounding algorithms. In Proc. ACM Symposium on Theory of Computing, 2002.
-
[17]
Meng Chen, Kai Zhang, Zhenying He, Yinan Jing, and X Sean Wang. RoarGraph: A projected bipartite graph for efficient cross-modal approximate nearest neighbor search. arXiv preprint arXiv:2408.08933, 2024.
-
[18]
A simple framework for contrastive learning of visual representations
Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A simple framework for contrastive learning of visual representations. In Proceedings of the 37th International Conference on Machine Learning, Proceedings of Machine Learning Research, pages 1597–1607. PMLR, 2020.
-
[19]
Yongjian Chen, Tao Guan, and Cheng Wang. Approximate nearest neighbor search by residual vector quantization. Sensors, 10(12):11259–11273, 2010.
-
[20]
MagicPIG: LSH sampling for efficient LLM generation
Zhuoming Chen, Ranajoy Sadhukhan, Zihao Ye, Yang Zhou, Jianyu Zhang, Niklas Nolte, Yuandong Tian, Matthijs Douze, Leon Bottou, Zhihao Jia, et al. MagicPIG: LSH sampling for efficient LLM generation. arXiv preprint arXiv:2410.16179, 2024.
-
[21]
TPU-KNN: K nearest neighbor search at peak FLOP/s
Felix Chern, Blake Hechtman, Andy Davis, Ruiqi Guo, David Majnemer, and Sanjiv Kumar. TPU-KNN: K nearest neighbor search at peak FLOP/s. Advances in Neural Information Processing Systems, 35:15489–15501, 2022.
-
[22]
Locality-sensitive hashing scheme based on p-stable distributions
Mayur Datar, Nicole Immorlica, Piotr Indyk, and Vahab S Mirrokni. Locality-sensitive hashing scheme based on p-stable distributions. In Proceedings of the Twentieth Annual Symposium on Computational Geometry, pages 253–262, 2004.
-
[23]
ArcFace: Additive angular margin loss for deep face recognition
Jiankang Deng, Jia Guo, Niannan Xue, and Stefanos Zafeiriou. ArcFace: Additive angular margin loss for deep face recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
-
[24]
BERT: Pre-training of deep bidirectional transformers for language understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
-
[25]
Inderjit S Dhillon and Dharmendra S Modha. Concept decompositions for large sparse text data using clustering. Machine Learning, 42:143–175, 2001.
-
[26]
Efficient k-nearest neighbor graph construction for generic similarity measures
Wei Dong, Charikar Moses, and Kai Li. Efficient k-nearest neighbor graph construction for generic similarity measures. In Proceedings of the 20th International Conference on World Wide Web, pages 577–586, 2011.
-
[27]
Matthijs Douze and Hervé Jégou. The Yael library. In Proceedings of the 22nd ACM International Conference on Multimedia, pages 687–690, 2014.
-
[28]
Matthijs Douze, Hervé Jégou, and Florent Perronnin. Polysemous codes. In European Conference on Computer Vision, 2016.
-
[29]
Link and code: Fast indexing with graphs and compact regression codes
Matthijs Douze, Alexandre Sablayrolles, and Hervé Jégou. Link and code: Fast indexing with graphs and compact regression codes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3646–3654, 2018.
-
[30]
The 2021 image similarity dataset and challenge
Matthijs Douze, Giorgos Tolias, Ed Pizzi, Zoë Papakipos, Lowik Chanussot, Filip Radenovic, Tomas Jenicek, Maxim Maximov, Laura Leal-Taixé, Ismail Elezi, et al. The 2021 image similarity dataset and challenge. arXiv preprint arXiv:2106.09672, 2021.
-
[31]
Sentence-level multimodal and language-agnostic representations
Paul-Ambroise Duquenne, Holger Schwenk, and Benoît Sagot. Sentence-level multimodal and language-agnostic representations. arXiv preprint arXiv:2308.11466, 2023.
-
[32]
Fast approximate nearest neighbor search with the navigating spreading-out graph
Cong Fu, Chao Xiang, Changxu Wang, and Deng Cai. Fast approximate nearest neighbor search with the navigating spreading-out graph. arXiv preprint arXiv:1707.00143, 2017.
-
[33]
Optimized product quantization for approximate nearest neighbor search
Tiezheng Ge, Kaiming He, Qifa Ke, and Jian Sun. Optimized product quantization for approximate nearest neighbor search. In Conference on Computer Vision and Pattern Recognition, 2013.
-
[34]
Filtered-DiskANN: Graph algorithms for approximate nearest neighbor search with filters
Siddharth Gollapudi, Neel Karia, Varun Sivashankar, Ravishankar Krishnaswamy, Nikit Begwani, Swapnil Raz, Yiyong Lin, Yin Zhang, Neelam Mahapatro, Premkumar Srinivasan, Amit Singh, and Harsha Vardhan Simhadri. Filtered-DiskANN: Graph algorithms for approximate nearest neighbor search with filters. In Proceedings of the ACM Web Conference 2023, 2023.
-
[35]
Yunchao Gong, Svetlana Lazebnik, Albert Gordo, and Florent Perronnin. Iterative quantization: A Procrustean approach to learning binary codes for large-scale image retrieval. IEEE Trans. Pattern Analysis and Machine Intelligence, 2012.
-
[36]
Accelerating large-scale inference with anisotropic vector quantization
Ruiqi Guo, Philip Sun, Erik Lindgren, Quan Geng, David Simcha, Felix Chern, and Sanjiv Kumar. Accelerating large-scale inference with anisotropic vector quantization. In International Conference on Machine Learning. PMLR, 2020.
-
[37]
Weixiang Hong, Xueyan Tang, Jingjing Meng, and Junsong Yuan. Asymmetric mapping quantization for nearest neighbor search. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(7):1783–1790, 2019.
-
[38]
Embedding-based retrieval in Facebook search
Jui-Ting Huang, Ashish Sharma, Shuying Sun, Li Xia, David Zhang, Philip Pronin, Janani Padmanabhan, Giuseppe Ottaviano, and Linjun Yang. Embedding-based retrieval in Facebook search. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 2553–2561, 2020.
-
[39]
Residual quantization with implicit neural codebooks
Iris A.M. Huijben, Matthijs Douze, Matthew J. Muckley, Ruud J.G. van Sloun, and Jakob Verbeek. Residual quantization with implicit neural codebooks. In International Conference on Machine Learning (ICML), 2024.
-
[40]
Unsupervised dense information retrieval with contrastive learning
Gautier Izacard, Mathilde Caron, Lucas Hosseini, Sebastian Riedel, Piotr Bojanowski, Armand Joulin, and Edouard Grave. Unsupervised dense information retrieval with contrastive learning, 2021.
-
[41]
Atlas: Few-shot learning with retrieval augmented language models
Gautier Izacard, Patrick Lewis, Maria Lomeli, Lucas Hosseini, Fabio Petroni, Timo Schick, Jane Dwivedi-Yu, Armand Joulin, Sebastian Riedel, and Edouard Grave. Atlas: Few-shot learning with retrieval augmented language models. Journal of Machine Learning Research, 24(251):1–43, 2023.
-
[42]
OOD-DiskANN: Efficient and scalable graph ANNS for out-of-distribution queries
Shikhar Jaiswal, Ravishankar Krishnaswamy, Ankit Garg, Harsha Vardhan Simhadri, and Sheshansh Agrawal. OOD-DiskANN: Efficient and scalable graph ANNS for out-of-distribution queries, 2022.
-
[43]
Hamming embedding and weak geometric consistency for large scale image search
Hervé Jégou, Matthijs Douze, and Cordelia Schmid. Hamming embedding and weak geometric consistency for large scale image search. In Computer Vision – ECCV 2008: 10th European Conference on Computer Vision, Marseille, France, October 12–18, 2008, Proceedings, Part I, pages 304–317. Springer, 2008.
-
[44]
Product quantization for nearest neighbor search
Hervé Jégou, Matthijs Douze, and Cordelia Schmid. Product quantization for nearest neighbor search. IEEE Trans. Pattern Analysis and Machine Intelligence, 2010.
-
[45]
Hervé Jégou, Florent Perronnin, Matthijs Douze, Jorge Sánchez, Patrick Pérez, and Cordelia Schmid. Aggregating local image descriptors into compact codes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(9):1704–1716, 2011.
-
[46]
Searching in one billion vectors: re-rank with source coding
Hervé Jégou, Romain Tavenard, Matthijs Douze, and Laurent Amsaleg. Searching in one billion vectors: re-rank with source coding. In International Conference on Acoustics, Speech, and Signal Processing, 2011.
-
[47]
Billion-scale similarity search with GPUs
Jeff Johnson, Matthijs Douze, and Hervé Jégou. Billion-scale similarity search with GPUs. IEEE Trans. on Big Data, 2019.
-
[48]
Generalization through memorization: Nearest neighbor language models
Urvashi Khandelwal, Omer Levy, Dan Jurafsky, Luke Zettlemoyer, and Mike Lewis. Generalization through memorization: Nearest neighbor language models, 2020.
-
[49]
How to train your DRAGON: Diverse augmentation towards generalizable dense retrieval
Sheng-Chieh Lin, Akari Asai, Minghan Li, Barlas Oguz, Jimmy Lin, Yashar Mehdad, Wen-tau Yih, and Xilun Chen. How to train your DRAGON: Diverse augmentation towards generalizable dense retrieval, 2023.
-
[50]
RA-DIT: Retrieval-augmented dual instruction tuning
Xi Victoria Lin, Xilun Chen, Mingda Chen, Weijia Shi, Maria Lomeli, Rich James, Pedro Rodriguez, Jacob Kahn, Gergely Szilvasy, Mike Lewis, Luke Zettlemoyer, and Scott Yih. RA-DIT: Retrieval-augmented dual instruction tuning. ArXiv, abs/2310.01352, 2023.
-
[51]
Shicong Liu, Hongtao Lu, and Junru Shao. Improved residual vector quantization for high-dimensional approximate nearest neighbor search.arXiv preprint arXiv:1509.05195, 2015
-
[52]
Least squares quantization in PCM
Stuart Lloyd. Least squares quantization in PCM. IEEETransactions on Information Theory, 1982
work page 1982
-
[53]
David G Lowe. Distinctive image features from scale-invariant keypoints.International Journal of Computer Vision, 60(2), 2004
work page 2004
-
[54] Qin Lv, Moses Charikar, and Kai Li. Image similarity search with compact data structures. In Proceedings of the Thirteenth ACM International Conference on Information and Knowledge Management, pages 208–217, 2004.
[55] Yu A. Malkov and Dmitry A. Yashunin. Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(4):824–836, 2018.
[56] Julieta Martinez, Joris Clement, Holger H. Hoos, and James J. Little. Revisiting additive quantization. In European Conference on Computer Vision, 2016.
[57] Julieta Martinez, Shobhit Zakhmi, Holger H. Hoos, and James J. Little. LSQ++: lower running time and higher recall in multi-codebook quantization. In European Conference on Computer Vision, 2018.
[58] Yusuke Matsui, Yusuke Uchida, Hervé Jégou, and Shin'ichi Satoh. A survey of product quantization. ITE Transactions on Media Technology and Applications, 2018.
[59] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781, 2013.
[60] Stanislav Morozov and Artem Babenko. Non-metric similarity graphs for maximum inner product search. Advances in Neural Information Processing Systems, 31, 2018.
[61] Stanislav Morozov and Artem Babenko. Unsupervised neural quantization for compressed-domain similarity search. In International Conference on Computer Vision, 2019.
[62] Marius Muja and David G. Lowe. Scalable nearest neighbor algorithms for high dimensional data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(11):2227–2240, 2014.
[63] Hiroyuki Ootomo, Akira Naruse, Corey Nolet, Ray Wang, Tamas Feher, and Yong Wang. CAGRA: Highly parallel graph construction and approximate nearest neighbor search for GPUs, 2023.
[64] Maxime Oquab, Timothée Darcet, Theo Moutakanni, Huy V. Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, Russell Howes, Po-Yao Huang, Hu Xu, Vasu Sharma, Shang-Wen Li, Wojciech Galuba, Mike Rabbat, Mido Assran, Nicolas Ballas, Gabriel Synnaeve, Ishan Misra, Hervé Jégou, Julien Mairal, Patrick Lab..., 2023.
[65] Arkadiusz Paterek. Improving regularized singular value decomposition for collaborative filtering. In Proceedings of KDD Cup and Workshop, 2007.
[66] Loïc Paulevé, Hervé Jégou, and Laurent Amsaleg. Locality sensitive hashing: A comparison of hash function types and querying mechanisms. Pattern Recognition Letters, 31(11):1348–1358, 2010.
[67] Fabio Petroni, Aleksandra Piktus, Angela Fan, Patrick Lewis, Majid Yazdani, Nicola De Cao, James Thorne, Yacine Jernite, Vladimir Karpukhin, Jean Maillard, Vassilis Plachouras, Tim Rocktäschel, and Sebastian Riedel. KILT: a benchmark for knowledge intensive language tasks. In Proceedings of the 2021 Conference of the North American Chapter of the Associat..., 2021.
[68] Ed Pizzi, Giorgos Kordopatis-Zilos, Hiral Patel, Gheorghe Postelnicu, Sugosh Nagavara Ravindra, Akshay Gupta, Symeon Papadopoulos, Giorgos Tolias, and Matthijs Douze. The 2023 video similarity dataset and challenge. arXiv preprint arXiv:2306.09489, 2023.
[69] Ed Pizzi, Sreya Dutta Roy, Sugosh Nagavara Ravindra, Priya Goyal, and Matthijs Douze. A self-supervised descriptor for image copy detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 14532–14542, 2022.
[70] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning, pages 8748–8763. PMLR, 2021.
[71] Harsimrat Sandhawalia and Hervé Jégou. Searching with expectations. In International Conference on Acoustics, Speech, and Signal Processing, 2010.
[72] Holger Schwenk and Matthijs Douze. Learning joint multilingual sentence representations with neural machine translation. In Proceedings of the 2nd Workshop on Representation Learning for NLP, pages 157–167, Vancouver, Canada, August 2017. Association for Computational Linguistics.
[73] Weijia Shi, Sewon Min, Maria Lomeli, Chunting Zhou, Margaret Li, Victoria Lin, Noah A. Smith, Luke Zettlemoyer, Scott Yih, and Mike Lewis. In-context pretraining: Language modeling beyond document boundaries. ArXiv, abs/2310.10638, 2023.
[74] Weijia Shi, Sewon Min, Michihiro Yasunaga, Minjoon Seo, Rich James, Mike Lewis, Luke Zettlemoyer, and Wen-tau Yih. REPLUG: Retrieval-augmented black-box language models, 2023.
[75] Harsha Vardhan Simhadri, Martin Aumüller, Amir Ingber, Matthijs Douze, George Williams, Magdalen Dobson Manohar, Dmitry Baranchuk, Edo Liberty, Frank Liu, Ben Landrum, et al. Results of the big ANN: NeurIPS'23 competition. arXiv preprint arXiv:2409.17424, 2024.
[76] Harsha Vardhan Simhadri, George Williams, Martin Aumüller, Matthijs Douze, Artem Babenko, Dmitry Baranchuk, Qi Chen, Lucas Hosseini, Ravishankar Krishnaswamny, Gopal Srinivasa, et al. Results of the NeurIPS'21 challenge on billion-scale approximate nearest neighbor search. In NeurIPS 2021 Competitions and Demonstrations Track, pages 177–189. PMLR, 2022.
[77] Harsha Vardhan Simhadri, George Williams, Martin Aumüller, Matthijs Douze, Artem Babenko, Dmitry Baranchuk, Qi Chen, Lucas Hosseini, Ravishankar Krishnaswamny, Gopal Srinivasa, Suhas Jayaram Subramanya, and Jingdong Wang. Results of the NeurIPS'21 challenge on billion-scale approximate nearest neighbor search. In Proceedings of the NeurIPS 2021 Compe..., 2021.
[78] Aditi Singh, Suhas Jayaram Subramanya, Ravishankar Krishnaswamy, and Harsha Vardhan Simhadri. FreshDiskANN: A fast and accurate graph-based ANN index for streaming similarity search, 2021.
[79] Suhas Jayaram Subramanya, Rohan Kadekodi, Ravishankar Krishnaswamy, and Harsha Vardhan Simhadri. DiskANN: Fast accurate billion-point nearest neighbor search on a single node. In NeurIPS, 2019.
[80] Philip Sun, Ruiqi Guo, and Sanjiv Kumar. Automating nearest neighbor search configuration with constrained optimization. arXiv preprint arXiv:2301.01702, 2023.