pith. machine review for the scientific record.

arxiv: 2604.01960 · v2 · submitted 2026-04-02 · 💻 cs.DB · cs.DS

Recognition: 2 theorem links


BBC: Improving Large-k Approximate Nearest Neighbor Search with a Bucket-based Result Collector

Ziqi Yin, Gao Cong, Kai Zeng, Jinwei Zhu, Bin Cui

Authors on Pith · no claims yet

Pith reviewed 2026-05-13 20:43 UTC · model grok-4.3

classification 💻 cs.DB · cs.DS
keywords approximate nearest neighbor · large-k queries · quantization-based indexes · bucket-based collector · re-ranking algorithms · ANN search · database indexing · performance acceleration

The pith

A bucket-based collector organizes ANN candidates by distance to speed up large-k queries in quantization indexes by up to 3.8 times.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Quantization-based approximate nearest neighbor methods slow down for large-k queries because standard top-k collectors incur high maintenance costs and pruning becomes less effective, forcing expensive re-ranking. The paper introduces BBC with a bucket-based result buffer that groups candidates by distance to the query, allowing faster superset maintenance and a lightweight final top-k selection. It adds two re-ranking algorithms tuned to different quantization types to shrink either the number of objects checked or the number of cache misses. Experiments on real datasets show speedups of up to 3.8× at recall@k = 0.95 without accuracy loss. Readers care because many applications, from recommendation to search, routinely need dozens or hundreds of neighbors at interactive speeds.
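
One of the two levers, cutting the number of objects that reach exact re-ranking, can be sketched as follows. This is an editorial illustration, not the paper's algorithm: the helper name, the threshold rule, and the slack parameter `eps` are all assumptions.

```python
def rerank_by_threshold(candidates, approx_dists, exact_dist, k, eps=0.05):
    """Volume-reducing re-ranking sketch (hypothetical helper): compute
    exact distances only for candidates whose approximate distance falls
    within eps of the k-th best approximation."""
    # Bound derived from the k-th smallest approximate distance, padded
    # by an assumed quantization-error slack eps.
    bound = sorted(approx_dists)[k - 1] + eps
    survivors = [c for c, d in zip(candidates, approx_dists) if d <= bound]
    # Exact (full-precision) distance is computed only for survivors,
    # shrinking the re-ranking workload when the approximation is tight.
    return sorted(survivors, key=exact_dist)[:k]
```

The looser `eps` is set, the more survivors pass through; a tighter bound trades recall for fewer exact distance computations.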

Core claim

BBC improves quantization-based ANN indexes for large-k queries by replacing conventional top-k collectors with a bucket-based result buffer that organizes candidates into distance buckets, reducing ranking costs and cache misses while enabling lightweight final selection, together with two re-ranking algorithms that cut either candidate volume or cache misses depending on the underlying quantizer.

What carries the argument

Bucket-based result buffer that organizes candidates into buckets by distance to the query for efficient superset maintenance and top-k selection.
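
As a concrete reading of that mechanism, here is a minimal distance-bucketed collector. It is an editorial sketch under assumed equal-width buckets and a known distance range; the paper's buffer layout, bucket calibration, and any SIMD details are not reproduced.

```python
from bisect import bisect_left
from itertools import accumulate

class BucketCollector:
    """Sketch of a bucket-based result buffer: equal-width distance
    buckets replace a heap for maintaining a candidate superset."""

    def __init__(self, k, d_min, d_max, n_buckets=64):
        self.k = k
        self.d_min = d_min
        self.width = (d_max - d_min) / n_buckets
        self.buckets = [[] for _ in range(n_buckets)]
        self.counts = [0] * n_buckets

    def insert(self, obj_id, dist):
        # O(1) amortized: index arithmetic plus an append, no heap sift.
        b = min(int((dist - self.d_min) / self.width), len(self.buckets) - 1)
        self.buckets[b].append((dist, obj_id))
        self.counts[b] += 1

    def threshold(self):
        # Pruning bound: upper edge of the first bucket prefix holding at
        # least k candidates; anything farther cannot enter the top-k.
        cum = list(accumulate(self.counts))
        b = bisect_left(cum, self.k)
        return self.d_min + (b + 1) * self.width

    def topk(self):
        # Lightweight final selection: sort only the prefix of buckets
        # that can contribute, never the whole candidate superset.
        out = []
        for bucket in self.buckets:
            out.extend(bucket)
            if len(out) >= self.k:
                break
        return sorted(out)[: self.k]
```

Note the trade this makes explicit: the pruning threshold is a bucket edge, slightly looser than a heap's exact k-th distance, which is the comparison the paper's Figure 7 examines.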

If this is right

  • Existing quantization-based ANN indexes gain substantial speed for large-k queries without altering their core structure.
  • Re-ranking overhead drops by limiting candidates examined or by improving memory access patterns.
  • Cache efficiency rises during candidate ranking and selection phases.
  • The same collector works across multiple quantization methods while preserving recall.
  • Final top-k extraction becomes a lightweight step after bucket organization.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same bucket organization could be tested on non-quantization ANN indexes that also produce large candidate sets.
  • Applications that vary k at runtime might benefit from adaptive bucket sizing tuned to expected result cardinality.
  • Index designers facing high-k workloads may need to treat result collection as a first-class optimization target rather than an afterthought.
  • Direct measurement of cache miss rates before and after BBC on new hardware would confirm the memory-efficiency mechanism.

Load-bearing premise

Grouping candidates by distance reduces ranking costs and cache misses enough to offset the overhead of maintaining the buckets.
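
A rough way to probe this premise with the standard library alone: compare a bounded max-heap collector against a plain bucket append on the same synthetic distances. The bucket count, data distribution, and sizes here are arbitrary choices, and real collectors operate on cache- and SIMD-friendly layouts, so this is only a directional check.

```python
import heapq
import random
import timeit

def heap_maintain(dists, k):
    # Conventional collector: bounded max-heap of the k best candidates;
    # every competitive insertion costs O(log k). Distances are negated
    # so heapq's min-heap behaves as a max-heap.
    heap = []
    for d in dists:
        if len(heap) < k:
            heapq.heappush(heap, -d)
        elif -heap[0] > d:
            heapq.heapreplace(heap, -d)
    return sorted(-d for d in heap)

def bucket_maintain(dists, k, d_max=1.0, n_buckets=256):
    # Bucketed collector: O(1) append per candidate, then one cheap pass
    # over the bucket prefix that covers the top-k.
    buckets = [[] for _ in range(n_buckets)]
    for d in dists:
        buckets[min(int(d / d_max * n_buckets), n_buckets - 1)].append(d)
    out = []
    for b in buckets:
        out.extend(b)
        if len(out) >= k:
            break
    return sorted(out)[:k]

random.seed(0)
dists = [random.random() for _ in range(200_000)]
assert heap_maintain(dists, 1000) == bucket_maintain(dists, 1000)
t_heap = timeit.timeit(lambda: heap_maintain(dists, 1000), number=3)
t_bucket = timeit.timeit(lambda: bucket_maintain(dists, 1000), number=3)
print(f"heap: {t_heap:.3f}s  bucket: {t_bucket:.3f}s")
```

Both paths return the identical top-k; the question the premise turns on is whether the append-then-scan path stays cheaper as k and the candidate superset grow.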

What would settle it

Run BBC-augmented indexes against standard large-k ANN benchmarks: if query time and cache-miss counts show no reduction over the unmodified collectors, the acceleration claim is disproved.

Figures

Figures reproduced from arXiv: 2604.01960 by Bin Cui, Gao Cong, Jinwei Zhu, Kai Zeng, Ziqi Yin.

Figure 2: Time Overhead Breakdown of four methods at dif…
Figure 3: Illustration of the Proposed Bucket-based Result Collector.
Figure 4: The probability density function (PDF) of distances between the query and data vectors on the C4 dataset.
Figure 5: The accuracy-efficiency trade-off results under different 𝑘 (upper and right is better).
Figure 6: Relative error of large-𝑘 ANN queries on the C4 dataset (lower is better).
Figure 7: Comparison of threshold values generated by the result buffer and binary heap on the C4 dataset.
Figure 8: Overhead of top-𝑘 collectors under varying numbers of evaluated objects on the Deep100M dataset.
Figure 9: Re-ranking time and number of re-ranked objects on the C4 dataset.
Figure 10: Top-𝑘 collection time under different numbers of evaluated objects on the Deep100M dataset.
read the original abstract

Although Approximate Nearest Neighbor (ANN) search has been extensively studied, large-k ANN queries that aim to retrieve a large number of nearest neighbors remain underexplored, despite their numerous real-world applications. Existing ANN methods face significant performance degradation for such queries. In this work, we first investigate the reasons for the performance degradation of quantization-based ANN indexes: (1) the inefficiency of existing top-k collectors, which incurs significant overhead in candidate maintenance, and (2) the reduced pruning effectiveness of quantization methods, which leads to a costly re-ranking process. To address this, we propose a novel bucket-based result collector (BBC) to enhance the efficiency of existing quantization-based ANN indexes for large-k ANN queries. BBC introduces two key components: (1) a bucket-based result buffer that organizes candidates into buckets by their distances to the query. This design reduces ranking costs and improves cache efficiency, enabling high performance maintenance of a candidate superset and a lightweight final selection of top-k results. (2) two re-ranking algorithms tailored for different types of quantization methods, which accelerate their re-ranking process by reducing either the number of candidate objects to be re-ranked or cache misses. Extensive experiments on real-world datasets demonstrate that BBC accelerates existing quantization-based ANN methods by up to 3.8x at recall@k = 0.95 for large-k ANN queries.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance; this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes BBC, a bucket-based result collector, to address performance degradation in quantization-based ANN indexes for large-k queries. It identifies two issues—inefficient top-k candidate maintenance and reduced pruning effectiveness leading to costly re-ranking—and introduces a distance-bucketed buffer for candidate organization plus two tailored re-ranking algorithms. Experiments on real-world datasets are reported to yield speedups of up to 3.8× at recall@k = 0.95.

Significance. If the claimed speedups are robustly verified with full experimental controls, the work would offer a practical, low-overhead enhancement to existing quantization indexes without altering their core structure. This could benefit database applications involving large-k retrieval (e.g., recommendation or analytics workloads) by improving cache behavior and reducing ranking costs in the high-candidate regime.

major comments (2)
  1. [Abstract] The central performance claim of up to 3.8× speedup at recall@k = 0.95 is presented without any description of datasets, quantization methods, baselines, number of trials, or variance; this leaves the offset between bucket-maintenance overhead and saved ranking/cache work unverifiable, especially for the large-superset regime the method targets.
  2. [Abstract] The design rationale (bucket insertion and re-ranking) assumes that organizing candidates by distance reduces ranking costs enough to offset linear bucket-maintenance overhead; no analytic bound or worst-case analysis is supplied for the case when quantization pruning returns a very large candidate superset, which is precisely the operating point emphasized in the paper.
minor comments (1)
  1. [Abstract] The abstract would benefit from a brief statement of the range of k values tested and the specific recall@k target used for the 3.8× figure.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on the abstract and the request for stronger analytical support. We address each major comment below and will revise the manuscript to improve clarity and verifiability of the claims.

read point-by-point responses
  1. Referee: [Abstract] The central performance claim of up to 3.8× speedup at recall@k = 0.95 is presented without any description of datasets, quantization methods, baselines, number of trials, or variance; this leaves the offset between bucket-maintenance overhead and saved ranking/cache work unverifiable, especially for the large-superset regime the method targets.

    Authors: We agree that the abstract would benefit from additional context to make the speedup claim more self-contained. In the revision we will expand the abstract with a brief clause noting that the results are obtained on standard real-world datasets using common quantization-based indexes (with full details on baselines, number of trials, and variance reported in the experimental section). Due to strict length limits we cannot embed every experimental parameter in the abstract itself, but the added sentence will allow readers to immediately locate the supporting evidence and verify the net benefit of bucket maintenance versus ranking savings. revision: partial

  2. Referee: [Abstract] The design rationale (bucket insertion and re-ranking) assumes that organizing candidates by distance reduces ranking costs enough to offset linear bucket-maintenance overhead; no analytic bound or worst-case analysis is supplied for the case when quantization pruning returns a very large candidate superset, which is precisely the operating point emphasized in the paper.

    Authors: We acknowledge that a formal analytic bound would strengthen the design rationale. The current manuscript supplies extensive empirical measurements in the experimental section, across increasing candidate-superset sizes, that demonstrate the bucket-maintenance cost is more than offset by reduced ranking and cache-miss overhead. In the revision we will add a short cost-model paragraph in the design section that contrasts the amortized O(1) bucket insertion with the O(n log n) cost of full sorting or the full re-ranking scan, and we will explicitly discuss the regime where the superset size greatly exceeds k. While a complete worst-case proof is not provided, the added model together with the existing empirical data will address the concern for the targeted operating point. revision: yes
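
The cost contrast the rebuttal appeals to can be written out as a rough editorial model (not taken from the paper): with n evaluated candidates, k requested neighbors, and m the number of candidates landing in the bucket prefix that covers the top k,

```latex
C_{\mathrm{heap}}(n,k) = O(n \log k), \qquad
C_{\mathrm{sort}}(n) = O(n \log n), \qquad
C_{\mathrm{bucket}}(n,k) = O(n) + O(m \log m)
```

Here m ≥ k, with m ≈ k when bucket widths track the distance distribution, so the bucketed collector wins whenever m ≪ n; a skewed distance distribution that crowds candidates into one bucket pushes m toward n and erases the advantage, which is the worst case the referee asks the authors to bound.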

Circularity Check

0 steps flagged

No circularity: structural proposal with empirical validation

full rationale

The paper identifies performance bottlenecks in quantization-based ANN for large-k queries via investigation, then introduces BBC as a bucket-organized buffer plus two re-ranking algorithms. No equations, parameter fits, or derivations are present that reduce to inputs by construction. No self-citations serve as load-bearing uniqueness theorems or ansatzes. The speedup claims rest on experimental measurements rather than self-referential logic, satisfying the self-contained criterion.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are stated in the abstract; the approach builds on standard quantization-based ANN indexes and common cache/ranking assumptions.

pith-pipeline@v0.9.0 · 5550 in / 986 out tokens · 51648 ms · 2026-05-13T20:43:39.806837+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
