ASH: Asymmetric Scalar Hashing With Learned Dimensionality Reduction for High-Fidelity Vector Quantization

Mariano Tepper; Theodore Willke

arxiv: 2606.07870 · v1 · pith:7EKQMWACnew · submitted 2026-06-05 · 💻 cs.IR

ASH: Asymmetric Scalar Hashing With Learned Dimensionality Reduction for High-Fidelity Vector Quantization

Mariano Tepper , Theodore Willke This is my paper

Pith reviewed 2026-06-27 20:21 UTC · model grok-4.3

classification 💻 cs.IR

keywords vector quantizationapproximate nearest neighborscalar quantizationdimensionality reductionasymmetric hashingann searchlearned projection

0 comments

The pith

A learned orthonormal projection reduces database vector dimensions before scalar quantization while leaving queries unchanged, yielding higher recall than additive or data-agnostic scalar quantizers at equal compression.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces ASH as a data-driven method that first learns an orthonormal projection to shrink the dimensionality of stored vectors, then applies scalar quantization at a higher per-dimension bitrate. Queries remain in their original high-dimensional form, creating an asymmetric encoder-decoder pair. This design is shown to deliver better accuracy and faster similarity search than both product quantization and recent scalar techniques across multiple compression levels. The work matters for anyone building large-scale nearest-neighbor systems because it promises improved fidelity without the complexity of additive quantizers and with short training times.

Core claim

ASH is an asymmetric scalar hashing framework in which database vectors are projected onto a learned orthonormal basis that reduces dimension count, after which each coordinate is scalar-quantized; queries stay unprojected, allowing the similarity computation to remain exact in the original space while the stored representation uses fewer bits overall.

What carries the argument

The learned orthonormal projection that performs dimensionality reduction on database vectors before scalar quantization, combined with the asymmetric treatment of queries.

If this is right

Higher ANN recall is obtained at every tested compression regime compared with prior additive and scalar methods.
Similarity computations run efficiently via SIMD because the asymmetry keeps the query in its native form.
Learning and encoding steps remain short enough for practical deployment on new collections.
The same compression-accuracy trade-off holds across multiple standard benchmark datasets.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach may generalize to other quantization families if the projection step is inserted before their encoding.
Real-time search workloads could benefit from the reduced storage and faster lookups once the projection matrix is fixed.
If the projection matrix can be updated incrementally, the method might support streaming database updates without full retraining.

Load-bearing premise

That the learned orthonormal projection will improve reconstruction fidelity over fixed dimensionality reduction on new queries without overfitting to the training set.

What would settle it

On a held-out dataset or query distribution, ASH recall at a fixed compression ratio falls below the recall of the strongest previous additive or scalar quantizer.

Figures

Figures reproduced from arXiv: 2606.07870 by Mariano Tepper, Theodore Willke.

**Figure 1.** Figure 1: Learning the projection matrix W leads to significant improvements in search accuracy (10- recall@R) for B = D (top two rows) and B = D/2 (bottom two rows). In ASH, increasing the bitrate b with B fixed means decreasing the target dimensionality d. When D > d, the advantage of the learned parameters becomes wider. Notably, ASH with b = 2 consistently beats b = 1, meaning that reducing the dimensionality wh… view at source ↗

**Figure 2.** Figure 2: The algorithm presented in Section 3 converges as its iterations progress. Tracking the loss in Equation (24), we observe that after 20-30 iterations, we start getting diminishing returns. For B = D and b = 1, we can compare the obtained results with the expected loss in Equation (33) [23], observing clear improvements. 10 15 20 25 30 35 40 45 0.8 0.85 0.9 0.95 1 clusters 1 16 32 64 128 256 ada002-100k R 1… view at source ↗

**Figure 3.** Figure 3: The search accuracy (10-recall@R) increases with the number of ASH landmarks, defined in [PITH_FULL_IMAGE:figures/full_fig_p013_3.png] view at source ↗

**Figure 4.** Figure 4: The ASH estimator has a slight bias, see Equation [PITH_FULL_IMAGE:figures/full_fig_p014_4.png] view at source ↗

**Figure 5.** Figure 5: ASH outperforms PQ in search accuracy (10-recall@R). ASH is even competitive at several com [PITH_FULL_IMAGE:figures/full_fig_p014_5.png] view at source ↗

**Figure 6.** Figure 6: ASH outperforms LOPQ [44] in search accuracy (10-recall@R). Additionally, ASH is computationally more efficient at learning the quantizer and computing similarities. 10 15 20 25 30 35 40 45 0.7 0.75 0.8 0.85 0.9 0.95 1 Compression 32x (B=1536) 16x (B=3072) 8x (B=6144) ada002-100k R 10-recall@R 10 15 20 25 30 35 40 45 0.65 0.7 0.75 0.8 0.85 0.9 0.95 1 Compression 32x (B=768) 16x (B=1536) 8x (B=3072) gecko-… view at source ↗

**Figure 7.** Figure 7: ASH outperforms EDEN [63] and TurboQuant [68] in search accuracy (10-recall@R). Often, ASH is competitive at several compression levels with EDEN and TurboQuant configurations that use twice the space. 15 [PITH_FULL_IMAGE:figures/full_fig_p015_7.png] view at source ↗

**Figure 8.** Figure 8: ASH outperforms LeanVec [59] in search accuracy (10-recall@R). ASH with b = 1 is competitive with LeanVec with b = 4, which uses four times more space (additional configurations in Figure D.8 of the appendix) [PITH_FULL_IMAGE:figures/full_fig_p017_8.png] view at source ↗

**Figure 9.** Figure 9: ASH outperforms RaBitQ [23, 24] and PQ [43] in search accuracy (10-recall@R) and throughput (queries per second, QPS), clearly improving the Pareto frontier. 17 [PITH_FULL_IMAGE:figures/full_fig_p017_9.png] view at source ↗

read the original abstract

For a long time, additive quantizers, such as product quantization, have been considered the gold standard in terms of accuracy and efficiency. Recently, scalar quantization has re-emerged from the depths of history with a new wave of data-agnostic techniques. Inscribed in this general framework, we turn our attention to data-driven methods, showing that new highs in recall and speed can be achieved by reducing the number of dimensions while increasing the bitrate per dimension. Critically, this dimensionality reduction needs to be learned from data to be successful. We present ASH (Asymmetric Scalar Hashing), a data-driven encoder-decoder framework that applies dimensionality reduction to database vectors via a learned orthonormal projection, followed by scalar quantization, while keeping queries in their original form. This asymmetric design enables higher accuracy than the best additive and scalar quantizers at iso-compression, while admitting highly efficient similarity computations via SIMD operations. ASH has short learning and encoding times, making it attractive for real-world deployment. Extensive experiments on a variety of datasets demonstrate that ASH achieves state-of-the-art ANN recall and speeds across all compression regimes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

ASH adds a learned orthonormal projection before asymmetric scalar quantization and claims better recall than PQ or data-agnostic scalars, but the projection learning details matter for whether the gains hold up.

read the letter

The main point is that ASH learns an orthonormal projection to drop dimensions on the database vectors, then scalar-quantizes them at higher bits per dimension, while leaving queries untouched. This asymmetric setup is the core new piece, and the paper says it beats both additive quantizers and recent scalar methods on recall and speed at the same compression.

The practical side is handled reasonably. Short training and encoding times are highlighted, and the SIMD-friendly distance computation is a clear win for deployment. Running experiments across multiple datasets and compression regimes is the right way to support the SOTA claim.

The soft spot is exactly the one in the stress-test note. The abstract states the projection must be learned from data, yet gives no information on the objective, whether a validation split was used, or any regularization. If the projection is fit directly to the full database without safeguards, the reported gains could be dataset-specific and fail to transfer to new queries. The full paper needs to show that this does not happen.

The rest of the method looks standard: orthonormal projection plus scalar quantization. No obvious circularity or invented entities.

This is for engineers and researchers who build or tune vector search systems and want a drop-in quantizer with a better accuracy-efficiency curve. A reader who cares about implementation details and cross-dataset behavior will get the most out of it.

It deserves peer review. The claims are concrete enough to test, and the method is simple to reproduce once the projection learning is spelled out.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces ASH, a data-driven asymmetric scalar hashing framework for vector quantization in ANN search. Database vectors undergo learned orthonormal projection for dimensionality reduction followed by scalar quantization, while queries remain in original form; the asymmetric design enables efficient SIMD similarity search. The central claim is that this yields state-of-the-art recall and speed across compression regimes, outperforming additive and scalar quantizers, with short learning/encoding times, supported by experiments on multiple datasets.

Significance. If the empirical results hold and generalize, ASH could meaningfully advance scalar quantization approaches by showing that learned dimensionality reduction can outperform data-agnostic methods at iso-compression. The asymmetric formulation and emphasis on practical SIMD efficiency and short training times are concrete strengths that address deployment constraints. The work supplies a falsifiable prediction (superior recall/speed on standard ANN benchmarks) that can be directly tested.

major comments (2)

[§3] §3 (Method): The description of the learned orthonormal projection does not specify the optimization objective, loss function, or regularization used to fit the projection matrix. This is load-bearing for the central claim that 'this dimensionality reduction needs to be learned from data to be successful,' because without the objective it is impossible to assess whether the reported gains arise from genuine signal or from fitting to the database distribution.
[§4] §4 (Experiments): No information is provided on whether the projection was learned using a held-out validation set, cross-validation, or any safeguard against overfitting to the specific database vectors. The SOTA recall claims rest on the assumption that the learned projection improves fidelity on unseen queries; absent this detail the generalization argument cannot be evaluated.

minor comments (1)

Figure captions and axis labels in the experimental plots should explicitly state the compression ratio and dataset for each curve to improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. The two major comments identify important gaps in methodological and experimental detail that we will address through revisions to improve clarity and reproducibility.

read point-by-point responses

Referee: [§3] §3 (Method): The description of the learned orthonormal projection does not specify the optimization objective, loss function, or regularization used to fit the projection matrix. This is load-bearing for the central claim that 'this dimensionality reduction needs to be learned from data to be successful,' because without the objective it is impossible to assess whether the reported gains arise from genuine signal or from fitting to the database distribution.

Authors: We agree that the optimization objective, loss function, and regularization for the projection matrix are not explicitly detailed in §3. This is a valid observation. The projection is learned by minimizing reconstruction error after scalar quantization under an orthonormality constraint, but the manuscript does not state the precise formulation. We will revise §3 to specify the loss (MSE between original and reconstructed vectors), the optimization method, and how orthonormality is enforced. This will directly support the claim that data-driven reduction is necessary by making the objective transparent. revision: yes
Referee: [§4] §4 (Experiments): No information is provided on whether the projection was learned using a held-out validation set, cross-validation, or any safeguard against overfitting to the specific database vectors. The SOTA recall claims rest on the assumption that the learned projection improves fidelity on unseen queries; absent this detail the generalization argument cannot be evaluated.

Authors: The referee correctly notes the absence of details on validation or overfitting safeguards in §4. The projection is learned on the database vectors to adapt to their distribution, with evaluation on separate query sets, but no mention is made of held-out data or cross-validation during learning. We will revise the experimental section to describe the exact training procedure for the projection, including any use of validation splits or other safeguards, and discuss implications for generalization. If no such procedures were applied, we will state this explicitly. revision: yes

Circularity Check

0 steps flagged

No circularity detected in derivation chain

full rationale

The provided abstract and description contain no equations, fitting procedures, or self-citations that reduce any claimed prediction or result to its inputs by construction. The method is described at a high level as using a learned orthonormal projection for dimensionality reduction before scalar quantization, with success attributed to data-driven learning and empirical results on multiple datasets. No load-bearing steps match the enumerated circularity patterns, and the central claims remain independent of any self-referential reductions.

Axiom & Free-Parameter Ledger

1 free parameters · 0 axioms · 0 invented entities

The central method depends on a learned projection matrix fitted to each dataset; no other free parameters, axioms, or invented entities are identifiable from the abstract.

free parameters (1)

orthonormal projection matrix
Learned from data to perform dimensionality reduction before scalar quantization.

pith-pipeline@v0.9.1-grok · 5723 in / 967 out tokens · 19541 ms · 2026-06-27T20:21:49.194304+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

70 extracted references · 4 canonical work pages · 1 internal anchor

[1]

The fastlanes compression layout: Decoding>100 billion integers per second with scalar code.Proc

Azim Afroozeh and Peter Boncz. The fastlanes compression layout: Decoding>100 billion integers per second with scalar code.Proc. VLDB Endow., 16(9):2132–2144, May 2023

2023
[2]

Similarity search in the blink of an eye with compressed indices.Proc

Cecilia Aguerrebere, Ishwar Singh Bhati, Mark Hildebrand, Mariano Tepper, and Theodore Willke. Similarity search in the blink of an eye with compressed indices.Proc. VLDB Endow., 16(11):3433– 3446, July 2023

2023
[3]

Locally-adaptive quantization for streaming vector search, February 2024

Cecilia Aguerrebere, Mark Hildebrand, Ishwar Singh Bhati, Theodore Willke, and Mariano Tepper. Locally-adaptive quantization for streaming vector search, February 2024. arXiv:2402.02044 [cs]

work page arXiv 2024
[4]

Nearest neighbor search with compact codes: A decoder perspective

Kenza Amara, Matthijs Douze, Alexandre Sablayrolles, and Herv´ e J´ egou. Nearest neighbor search with compact codes: A decoder perspective. InProceedings of the 2022 International Conference on Multimedia Retrieval, pages 167–175, New York, NY, USA, June 2022. Association for Computing Machinery

2022
[5]

Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions.Communications of the ACM, 51(1):117–122, January 2008

Alexandr Andoni and Piotr Indyk. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions.Communications of the ACM, 51(1):117–122, January 2008

2008
[6]

Quicker adc : Unlocking the hidden potential of product quantization with simd.IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(5):1666–1677, May 2021

Fabien Andre, Anne-Marie Kermarrec, and Nicolas Le Scouarnec. Quicker adc : Unlocking the hidden potential of product quantization with simd.IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(5):1666–1677, May 2021

2021
[7]

Cache locality is not enough: high- performance nearest neighbor search with product quantization fast scan.Proc

Fabien Andr´ e, Anne-Marie Kermarrec, and Nicolas Le Scouarnec. Cache locality is not enough: high- performance nearest neighbor search with product quantization fast scan.Proc. VLDB Endow., 9(4): 288–299, December 2015

2015
[8]

Accelerated nearest neighbor search with quick adc

Fabien Andr´ e, Anne-Marie Kermarrec, and Nicolas Le Scouarnec. Accelerated nearest neighbor search with quick adc. InProceedings of the 2017 ACM on International Conference on Multimedia Retrieval, pages 159–166, New York, NY, USA, June 2017. Association for Computing Machinery

2017
[9]

Additive quantization for extreme vector compression

Artem Babenko and Victor Lempitsky. Additive quantization for extreme vector compression. pages 931–938, 2014

2014
[10]

Tree quantization for large-scale similarity search and classifi- cation

Artem Babenko and Victor Lempitsky. Tree quantization for large-scale similarity search and classifi- cation. pages 4240–4248, 2015

2015
[11]

Multidimensional binary search trees used for associative searching.Commun

Jon Louis Bentley. Multidimensional binary search trees used for associative searching.Commun. ACM, 18(9):509–517, September 1975

1975
[12]

Cover trees for nearest neighbor

Alina Beygelzimer, Sham Kakade, and John Langford. Cover trees for nearest neighbor. InProceedings of the 23rd international conference on Machine learning, ICML ’06, pages 97–104, New York, NY, USA, June 2006. Association for Computing Machinery

2006
[13]

Carreira-Perpinan and Ramin Raziperchikolaei

Miguel A. Carreira-Perpinan and Ramin Raziperchikolaei. Hashing with binary autoencoders. pages 557–566, 2015. 18

2015
[14]

Charikar

Moses S. Charikar. Similarity estimation techniques from rounding algorithms. InProceedings of the thiry-fourth annual ACM symposium on Theory of computing, STOC ’02, pages 380–388, New York, NY, USA, May 2002. Association for Computing Machinery

2002
[15]

Spann: highly-efficient billion-scale approximate nearest neighbor search

Qi Chen, Bing Zhao, Haidong Wang, Mingqin Li, Chuanjie Liu, Zengzhong Li, Mao Yang, and Jingdong Wang. Spann: highly-efficient billion-scale approximate nearest neighbor search. InProceedings of the 35th International Conference on Neural Information Processing Systems, pages 5199–5212, Red Hook, NY, USA, December 2021. Curran Associates Inc

2021
[16]

Approximate nearest neighbor search by residual vector quantization.Sensors, 10(12):11259–11273, December 2010

Yongjian Chen, Tao Guan, and Cheng Wang. Approximate nearest neighbor search by residual vector quantization.Sensors, 10(12):11259–11273, December 2010

2010
[17]

msmarco-v2-embed-english-v3

CohereLabs. msmarco-v2-embed-english-v3. URLhttps://huggingface.co/datasets/CohereLabs/ msmarco-v2-embed-english-v3
[18]

Stochastic generative hashing

Bo Dai, Ruiqi Guo, Sanjiv Kumar, Niao He, and Le Song. Stochastic generative hashing. InProceedings of the 34th International Conference on Machine Learning, pages 913–922. PMLR, July 2017

2017
[19]

Stevens, and Saket Navlakha

Sanjoy Dasgupta, Charles F. Stevens, and Saket Navlakha. A neural algorithm for a fundamental computing problem.Science, 358(6364):793–796, November 2017

2017
[20]

Mirrokni

Mayur Datar, Nicole Immorlica, Piotr Indyk, and Vahab S. Mirrokni. Locality-sensitive hashing scheme based on p-stable distributions. InProceedings of the twentieth annual symposium on Computational geometry, SCG ’04, pages 253–262, New York, NY, USA, June 2004. Association for Computing Ma- chinery

2004
[21]

Jvector, May 2026

DataStax. Jvector, May 2026. URLhttps://github.com/datastax/jvector. original-date: 2023-08- 25T01:45:20Z

2026
[22]

The faiss library.IEEE Transactions on Big Data, 12(2):346–361, April 2026

Matthijs Douze, Alexandr Guzhva, Chengqi Deng, Jeff Johnson, Gergely Szilvasy, Pierre-Emmanuel Mazar´ e, Maria Lomeli, Lucas Hosseini, and Herv´ e J´ egou. The faiss library.IEEE Transactions on Big Data, 12(2):346–361, April 2026

2026
[23]

Rabitq: Quantizing high-dimensional vectors with a theoretical error bound for approximate nearest neighbor search.Proc

Jianyang Gao and Cheng Long. Rabitq: Quantizing high-dimensional vectors with a theoretical error bound for approximate nearest neighbor search.Proc. ACM Manag. Data, 2(3):167:1–167:27, May 2024

2024
[24]

Practical and asymptotically optimal quantization of high-dimensional vectors in euclidean space for approximate nearest neighbor search.Proc

Jianyang Gao, Yutong Gou, Yuexuan Xu, Yongyi Yang, Cheng Long, and Raymond Chi-Wing Wong. Practical and asymptotically optimal quantization of high-dimensional vectors in euclidean space for approximate nearest neighbor search.Proc. ACM Manag. Data, 3(3):202:1–202:26, June 2025

2025
[25]

Optimized product quantization.IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(04):744–755, April 2014

Tiezheng Ge, Kaiming He, Qifa Ke, and Jian Sun. Optimized product quantization.IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(04):744–755, April 2014

2014
[26]

Iterative quantization: A procrustean approach to learning binary codes

Yunchao Gong and Svetlana Lazebnik. Iterative quantization: A procrustean approach to learning binary codes. InCVPR 2011, pages 817–824, June 2011

2011
[27]

Yunchao Gong, Svetlana Lazebnik, Albert Gordo, and Florent Perronnin. Iterative quantization: A procrustean approach to learning binary codes for large-scale image retrieval.IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(12):2916–2929, December 2013

2013
[28]

Asymmetric distances for binary embeddings

Albert Gordo and Florent Perronnin. Asymmetric distances for binary embeddings. InCVPR 2011, pages 729–736, June 2011

2011
[29]

load mask16, July 2025

Intel Intrinsics Guide. load mask16, July 2025. URLhttps://www.intel.com/content/www/us/en/ docs/intrinsics-guide/index.html#text=_load_mask16&ig_expand=3996

2025
[30]

mm512 add ps, July 2025

Intel Intrinsics Guide. mm512 add ps, July 2025. URLhttps://www.intel.com/content/www/us/ en/docs/intrinsics-guide/index.html#text=_mm512_add_ps&techs=AVX_512&ig_expand=143. 19

2025
[31]

mm512 fmadd ps, July 2025

Intel Intrinsics Guide. mm512 fmadd ps, July 2025. URLhttps://www.intel.com/content/www/us/ en/docs/intrinsics-guide/index.html#text=_mm512_loadu_ps&ig_expand=4103

2025
[32]

mm512 i32gather ps, July 2025

Intel Intrinsics Guide. mm512 i32gather ps, July 2025. URLhttps://www.intel.com/content/ www/us/en/docs/intrinsics-guide/index.html#techs=AVX_512&text=_mm512_i32gather_ps&ig_ expand=3734

2025
[33]

mm512 loadu ps, July 2025

Intel Intrinsics Guide. mm512 loadu ps, July 2025. URLhttps://www.intel.com/content/www/ us/en/docs/intrinsics-guide/index.html#text=_mm512_fmadd_ps&avx512techs=AVX512F&ig_ expand=3111

2025
[34]

mm512 maskz loadu ps, July 2025

Intel Intrinsics Guide. mm512 maskz loadu ps, July 2025. URLhttps://www.intel.com/content/ www/us/en/docs/intrinsics-guide/index.html#text=_mm512_maskz_loadu_ps&ig_expand=4105

2025
[35]

mm512 reduce add ps, July 2025

Intel Intrinsics Guide. mm512 reduce add ps, July 2025. URLhttps://www.intel.com/content/ www/us/en/docs/intrinsics-guide/index.html#text=_mm512_reduce_add_ps&avx512techs= AVX512F&ig_expand=5303

2025
[36]

Accelerating large-scale inference with anisotropic vector quantization

Ruiqi Guo, Philip Sun, Erik Lindgren, Quan Geng, David Simcha, Felix Chern, and Sanjiv Kumar. Accelerating large-scale inference with anisotropic vector quantization. InProceedings of the 37th In- ternational Conference on Machine Learning, pages 3887–3896. PMLR, November 2020

2020
[37]

Approximate nearest neighbors: towards removing the curse of dimensionality

Piotr Indyk and Rajeev Motwani. Approximate nearest neighbors: towards removing the curse of dimensionality. InProceedings of the thirtieth annual ACM symposium on Theory of computing, STOC ’98, pages 604–613, New York, NY, USA, May 1998. Association for Computing Machinery

1998
[38]

A survey on locality sensitive hashing algorithms and their applications, February 2021

Omid Jafari, Preeti Maurya, Parth Nagarkar, Khandker Mushfiqul Islam, and Chidambaram Crushev. A survey on locality sensitive hashing algorithms and their applications, February 2021. arXiv:2102.08942 [cs]

work page arXiv 2021
[39]

Billion-scale similarity search with gpus.IEEE Trans- actions on Big Data, 7(3):535–547, July 2021

Jeff Johnson, Matthijs Douze, and Herv´ e J´ egou. Billion-scale similarity search with gpus.IEEE Trans- actions on Big Data, 7(3):535–547, July 2021

2021
[40]

Muon: An optimizer for hidden layers in neural networks, 2024

Keller Jordan, Yuchen Jin, Vlado Boza, Jiacheng You, Franz Cesista, Laker Newhouse, and Jeremy Bern- stein. Muon: An optimizer for hidden layers in neural networks, 2024. URLhttps://kellerjordan. github.io/posts/muon/

2024
[41]

Biing-Hwang Juang and A. Gray. Multiple stage vector quantization for speech coding. InICASSP ’82. IEEE International Conference on Acoustics, Speech, and Signal Processing, volume 7, pages 597–600, May 1982

1982
[42]

Caselaw access project embeddings

justicedao. Caselaw access project embeddings. URLhttps://huggingface.co/datasets/ justicedao/Caselaw_Access_Project_embeddings
[43]

Product quantization for nearest neighbor search

Herve J´ egou, Matthijs Douze, and Cordelia Schmid. Product quantization for nearest neighbor search. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(1):117–128, January 2011

2011
[44]

Locally optimized product quantization for approximate nearest neighbor search

Yannis Kalantidis and Yannis Avrithis. Locally optimized product quantization for approximate nearest neighbor search. In2014 IEEE Conference on Computer Vision and Pattern Recognition, pages 2329– 2336, June 2014

2014
[45]

Retrieval-augmented generation for knowledge-intensive nlp tasks

Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich K¨ uttler, Mike Lewis, Wen-tau Yih, Tim Rockt¨ aschel, Sebastian Riedel, and Douwe Kiela. Retrieval-augmented generation for knowledge-intensive nlp tasks. InAdvances in Neural Information Processing Systems, pages 9459–9474. Curran Associates Inc., Dec...

2020
[46]

Improved Residual Vector Quantization for High-dimensional Approximate Nearest Neighbor Search

Shicong Liu, Hongtao Lu, and Junru Shao. Improved residual vector quantization for high-dimensional approximate nearest neighbor search, September 2015. arXiv:1509.05195 [cs]. 20

work page internal anchor Pith review Pith/arXiv arXiv 2015
[47]

S. Lloyd. Least squares quantization in pcm.IEEE Transactions on Information Theory, 28(2):129–137, March 1982

1982
[48]

A survey on deep hashing methods.ACM Trans

Xiao Luo, Haixin Wang, Daqing Wu, Chong Chen, Minghua Deng, Jianqiang Huang, and Xian-Sheng Hua. A survey on deep hashing methods.ACM Trans. Knowl. Discov. Data, 17(1):15:1–15:50, February 2023

2023
[49]

Malkov and D

Yu A. Malkov and D. A. Yashunin. Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs.IEEE Transactions on Pattern Analysis and Machine Intel- ligence, 42(4):824–836, April 2020

2020
[50]

Hoos, and James J

Julieta Martinez, Joris Clement, Holger H. Hoos, and James J. Little. Revisiting additive quantization. In Bastian Leibe, Jiri Matas, Nicu Sebe, and Max Welling, editors,Computer Vision – ECCV 2016, pages 137–153, Cham, 2016. Springer International Publishing

2016
[51]

J. Max. Quantizing for minimum distortion.IRE Transactions on Information Theory, 6(1):7–12, March 1960

1960
[52]

Marius Muja and David G. Lowe. Scalable nearest neighbor algorithms for high dimensional data.IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(11):2227–2240, November 2014

2014
[53]

wiki mpnet embeddings

olmer. wiki mpnet embeddings. URLhttps://huggingface.co/datasets/olmer/wiki_mpnet_ embeddings
[54]

Training deep learning models with norm-constrained lmos

Thomas Pethick, Wanyun Xie, Kimon Antonakopoulos, Zhenyu Zhu, Antonio Silveti-Falls, and Volkan Cevher. Training deep learning models with norm-constrained lmos. InProceedings of the 42nd In- ternational Conference on Machine Learning, volume 267 ofICML’25, pages 49069–49104, Vancouver, Canada, July 2025. JMLR.org

2025
[55]

dbpedia-entities-openai3-text-embedding-3-large-1536-1m, 2024

Qdrant. dbpedia-entities-openai3-text-embedding-3-large-1536-1m, 2024. URLhttps://huggingface. co/datasets/Qdrant/dbpedia-entities-openai3-text-embedding-3-large-1536-1M

2024
[56]

dbpedia-entities-openai3-text-embedding-3-large-3072-1m, 2024

Qdrant. dbpedia-entities-openai3-text-embedding-3-large-3072-1m, 2024. URLhttps://huggingface. co/datasets/Qdrant/dbpedia-entities-openai3-text-embedding-3-large-3072-1M

2024
[57]

Faiss, May 2026

Meta Research. Faiss, May 2026. URLhttps://github.com/facebookresearch/faiss. Accessed: 2026-05-07

2026
[58]

Diskann: fast accurate billion-point nearest neighbor search on a single node

Suhas Jayaram Subramanya, Devvrit, Rohan Kadekodi, Ravishankar Krishaswamy, and Harsha Vard- han Simhadri. Diskann: fast accurate billion-point nearest neighbor search on a single node. InAdvances on Neural Information Processing Systems, pages 13766–13776, Red Hook, NY, USA, December 2019. Curran Associates Inc

2019
[59]

Mariano Tepper, Ishwar Singh Bhati, Cecilia Aguerrebere, Mark Hildebrand, and Theodore L. Willke. Leanvec: Searching vectors faster by making them fit.Transactions on Machine Learning Research, January 2024

2024
[60]

Gleanvec: Accelerating vector search with minimalist nonlinear dimensionality reduction, October 2024

Mariano Tepper, Ishwar Singh Bhati, Cecilia Aguerrebere, and Ted Willke. Gleanvec: Accelerating vector search with minimalist nonlinear dimensionality reduction, October 2024. arXiv:2410.22347 [cs]

work page arXiv 2024
[61]

Neural discrete representation learning

Aaron van den Oord, Oriol Vinyals, and Koray Kavukcuoglu. Neural discrete representation learning. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS’17, pages 6309–6318, Red Hook, NY, USA, December 2017. Curran Associates Inc

2017
[62]

Drive: One-bit distributed mean estimation

Shay Vargaftik, Ran Ben-Basat, Amit Portnoy, Gal Mendelson, Yaniv Ben-Itzhak, and Michael Mitzen- macher. Drive: One-bit distributed mean estimation. InAdvances in Neural Information Processing Systems, volume 34, pages 362–377. Curran Associates, Inc., 2021. 21

2021
[63]

Eden: Communication-efficient and robust distributed mean estimation for federated learning

Shay Vargaftik, Ran Ben Basat, Amit Portnoy, Gal Mendelson, Yaniv Ben Itzhak, and Michael Mitzen- macher. Eden: Communication-efficient and robust distributed mean estimation for federated learning. InProceedings of the 39th International Conference on Machine Learning, pages 21984–22014. PMLR, June 2022

2022
[64]

Rabitq-library, 2025

VectorDB-NTU. Rabitq-library, 2025. URLhttps://github.com/VectorDB-NTU/RaBitQ-Library. Accessed: 2026-05-15

2025
[65]

A survey on learning to hash.IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4):769–790, April 2018

Jingdong Wang, Ting Zhang, jingkuan song, Nicu Sebe, and Heng Tao Shen. A survey on learning to hash.IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4):769–790, April 2018

2018
[66]

Spectral hashing

Yair Weiss, Antonio Torralba, and Rob Fergus. Spectral hashing. InProceedings of the 22nd Inter- national Conference on Neural Information Processing Systems, NIPS’08, pages 1753–1760, Red Hook, NY, USA, December 2008. Curran Associates Inc

2008
[67]

Circulant binary embedding

Felix Yu, Sanjiv Kumar, Yunchao Gong, and Shih-Fu Chang. Circulant binary embedding. InProceed- ings of the 31st International Conference on Machine Learning, pages 946–954. PMLR, June 2014

2014
[68]

Turboquant: Online vector quanti- zation with near-optimal distortion rate

Amir Zandieh, Majid Daliri, Majid Hadian, and Vahab Mirrokni. Turboquant: Online vector quanti- zation with near-optimal distortion rate. October 2025

2025
[69]

Composite quantization for approximate nearest neighbor search

Ting Zhang, Chao Du, and Jingdong Wang. Composite quantization for approximate nearest neighbor search. InProceedings of the 31st International Conference on International Conference on Machine Learning - Volume 32, ICML’14, pages II–846, Beijing, China, June 2014. JMLR.org

2014
[70]

Yu, Ruiqi Guo, Sanjiv Kumar, Shengjin Wang, and Shi-Fu Chang

Xu Zhang, Felix X. Yu, Ruiqi Guo, Sanjiv Kumar, Shengjin Wang, and Shi-Fu Chang. Fast orthogonal projection based on kronecker product. pages 2929–2937, 2015. A Additional similarity functions The Euclidean distance is straightforward to incorporate in the current framework. We start by decomposing the distance∥q−x i∥2 2 as follows ∥q−x i∥2 2 =∥q−µ ∗ i +µ...

2015

[1] [1]

The fastlanes compression layout: Decoding>100 billion integers per second with scalar code.Proc

Azim Afroozeh and Peter Boncz. The fastlanes compression layout: Decoding>100 billion integers per second with scalar code.Proc. VLDB Endow., 16(9):2132–2144, May 2023

2023

[2] [2]

Similarity search in the blink of an eye with compressed indices.Proc

Cecilia Aguerrebere, Ishwar Singh Bhati, Mark Hildebrand, Mariano Tepper, and Theodore Willke. Similarity search in the blink of an eye with compressed indices.Proc. VLDB Endow., 16(11):3433– 3446, July 2023

2023

[3] [3]

Locally-adaptive quantization for streaming vector search, February 2024

Cecilia Aguerrebere, Mark Hildebrand, Ishwar Singh Bhati, Theodore Willke, and Mariano Tepper. Locally-adaptive quantization for streaming vector search, February 2024. arXiv:2402.02044 [cs]

work page arXiv 2024

[4] [4]

Nearest neighbor search with compact codes: A decoder perspective

Kenza Amara, Matthijs Douze, Alexandre Sablayrolles, and Herv´ e J´ egou. Nearest neighbor search with compact codes: A decoder perspective. InProceedings of the 2022 International Conference on Multimedia Retrieval, pages 167–175, New York, NY, USA, June 2022. Association for Computing Machinery

2022

[5] [5]

Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions.Communications of the ACM, 51(1):117–122, January 2008

Alexandr Andoni and Piotr Indyk. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions.Communications of the ACM, 51(1):117–122, January 2008

2008

[6] [6]

Quicker adc : Unlocking the hidden potential of product quantization with simd.IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(5):1666–1677, May 2021

Fabien Andre, Anne-Marie Kermarrec, and Nicolas Le Scouarnec. Quicker adc : Unlocking the hidden potential of product quantization with simd.IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(5):1666–1677, May 2021

2021

[7] [7]

Cache locality is not enough: high- performance nearest neighbor search with product quantization fast scan.Proc

Fabien Andr´ e, Anne-Marie Kermarrec, and Nicolas Le Scouarnec. Cache locality is not enough: high- performance nearest neighbor search with product quantization fast scan.Proc. VLDB Endow., 9(4): 288–299, December 2015

2015

[8] [8]

Accelerated nearest neighbor search with quick adc

Fabien Andr´ e, Anne-Marie Kermarrec, and Nicolas Le Scouarnec. Accelerated nearest neighbor search with quick adc. InProceedings of the 2017 ACM on International Conference on Multimedia Retrieval, pages 159–166, New York, NY, USA, June 2017. Association for Computing Machinery

2017

[9] [9]

Additive quantization for extreme vector compression

Artem Babenko and Victor Lempitsky. Additive quantization for extreme vector compression. pages 931–938, 2014

2014

[10] [10]

Tree quantization for large-scale similarity search and classifi- cation

Artem Babenko and Victor Lempitsky. Tree quantization for large-scale similarity search and classifi- cation. pages 4240–4248, 2015

2015

[11] [11]

Multidimensional binary search trees used for associative searching.Commun

Jon Louis Bentley. Multidimensional binary search trees used for associative searching.Commun. ACM, 18(9):509–517, September 1975

1975

[12] [12]

Cover trees for nearest neighbor

Alina Beygelzimer, Sham Kakade, and John Langford. Cover trees for nearest neighbor. InProceedings of the 23rd international conference on Machine learning, ICML ’06, pages 97–104, New York, NY, USA, June 2006. Association for Computing Machinery

2006

[13] [13]

Carreira-Perpinan and Ramin Raziperchikolaei

Miguel A. Carreira-Perpinan and Ramin Raziperchikolaei. Hashing with binary autoencoders. pages 557–566, 2015. 18

2015

[14] [14]

Charikar

Moses S. Charikar. Similarity estimation techniques from rounding algorithms. InProceedings of the thiry-fourth annual ACM symposium on Theory of computing, STOC ’02, pages 380–388, New York, NY, USA, May 2002. Association for Computing Machinery

2002

[15] [15]

Spann: highly-efficient billion-scale approximate nearest neighbor search

Qi Chen, Bing Zhao, Haidong Wang, Mingqin Li, Chuanjie Liu, Zengzhong Li, Mao Yang, and Jingdong Wang. Spann: highly-efficient billion-scale approximate nearest neighbor search. InProceedings of the 35th International Conference on Neural Information Processing Systems, pages 5199–5212, Red Hook, NY, USA, December 2021. Curran Associates Inc

2021

[16] [16]

Approximate nearest neighbor search by residual vector quantization.Sensors, 10(12):11259–11273, December 2010

Yongjian Chen, Tao Guan, and Cheng Wang. Approximate nearest neighbor search by residual vector quantization.Sensors, 10(12):11259–11273, December 2010

2010

[17] [17]

msmarco-v2-embed-english-v3

CohereLabs. msmarco-v2-embed-english-v3. URLhttps://huggingface.co/datasets/CohereLabs/ msmarco-v2-embed-english-v3

[18] [18]

Stochastic generative hashing

Bo Dai, Ruiqi Guo, Sanjiv Kumar, Niao He, and Le Song. Stochastic generative hashing. InProceedings of the 34th International Conference on Machine Learning, pages 913–922. PMLR, July 2017

2017

[19] [19]

Stevens, and Saket Navlakha

Sanjoy Dasgupta, Charles F. Stevens, and Saket Navlakha. A neural algorithm for a fundamental computing problem.Science, 358(6364):793–796, November 2017

2017

[20] [20]

Mirrokni

Mayur Datar, Nicole Immorlica, Piotr Indyk, and Vahab S. Mirrokni. Locality-sensitive hashing scheme based on p-stable distributions. InProceedings of the twentieth annual symposium on Computational geometry, SCG ’04, pages 253–262, New York, NY, USA, June 2004. Association for Computing Ma- chinery

2004

[21] [21]

Jvector, May 2026

DataStax. Jvector, May 2026. URLhttps://github.com/datastax/jvector. original-date: 2023-08- 25T01:45:20Z

2026

[22] [22]

The faiss library.IEEE Transactions on Big Data, 12(2):346–361, April 2026

Matthijs Douze, Alexandr Guzhva, Chengqi Deng, Jeff Johnson, Gergely Szilvasy, Pierre-Emmanuel Mazar´ e, Maria Lomeli, Lucas Hosseini, and Herv´ e J´ egou. The faiss library.IEEE Transactions on Big Data, 12(2):346–361, April 2026

2026

[23] [23]

Rabitq: Quantizing high-dimensional vectors with a theoretical error bound for approximate nearest neighbor search.Proc

Jianyang Gao and Cheng Long. Rabitq: Quantizing high-dimensional vectors with a theoretical error bound for approximate nearest neighbor search.Proc. ACM Manag. Data, 2(3):167:1–167:27, May 2024

2024

[24] [24]

Practical and asymptotically optimal quantization of high-dimensional vectors in euclidean space for approximate nearest neighbor search.Proc

Jianyang Gao, Yutong Gou, Yuexuan Xu, Yongyi Yang, Cheng Long, and Raymond Chi-Wing Wong. Practical and asymptotically optimal quantization of high-dimensional vectors in euclidean space for approximate nearest neighbor search.Proc. ACM Manag. Data, 3(3):202:1–202:26, June 2025

2025

[25] [25]

Optimized product quantization.IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(04):744–755, April 2014

Tiezheng Ge, Kaiming He, Qifa Ke, and Jian Sun. Optimized product quantization.IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(04):744–755, April 2014

2014

[26] [26]

Iterative quantization: A procrustean approach to learning binary codes

Yunchao Gong and Svetlana Lazebnik. Iterative quantization: A procrustean approach to learning binary codes. InCVPR 2011, pages 817–824, June 2011

2011

[27] [27]

Yunchao Gong, Svetlana Lazebnik, Albert Gordo, and Florent Perronnin. Iterative quantization: A procrustean approach to learning binary codes for large-scale image retrieval.IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(12):2916–2929, December 2013

2013

[28] [28]

Asymmetric distances for binary embeddings

Albert Gordo and Florent Perronnin. Asymmetric distances for binary embeddings. InCVPR 2011, pages 729–736, June 2011

2011

[29] [29]

load mask16, July 2025

Intel Intrinsics Guide. load mask16, July 2025. URLhttps://www.intel.com/content/www/us/en/ docs/intrinsics-guide/index.html#text=_load_mask16&ig_expand=3996

2025

[30] [30]

mm512 add ps, July 2025

Intel Intrinsics Guide. mm512 add ps, July 2025. URLhttps://www.intel.com/content/www/us/ en/docs/intrinsics-guide/index.html#text=_mm512_add_ps&techs=AVX_512&ig_expand=143. 19

2025

[31] [31]

mm512 fmadd ps, July 2025

Intel Intrinsics Guide. mm512 fmadd ps, July 2025. URLhttps://www.intel.com/content/www/us/ en/docs/intrinsics-guide/index.html#text=_mm512_loadu_ps&ig_expand=4103

2025

[32] [32]

mm512 i32gather ps, July 2025

Intel Intrinsics Guide. mm512 i32gather ps, July 2025. URLhttps://www.intel.com/content/ www/us/en/docs/intrinsics-guide/index.html#techs=AVX_512&text=_mm512_i32gather_ps&ig_ expand=3734

2025

[33] [33]

mm512 loadu ps, July 2025

Intel Intrinsics Guide. mm512 loadu ps, July 2025. URLhttps://www.intel.com/content/www/ us/en/docs/intrinsics-guide/index.html#text=_mm512_fmadd_ps&avx512techs=AVX512F&ig_ expand=3111

2025

[34] [34]

mm512 maskz loadu ps, July 2025

Intel Intrinsics Guide. mm512 maskz loadu ps, July 2025. URLhttps://www.intel.com/content/ www/us/en/docs/intrinsics-guide/index.html#text=_mm512_maskz_loadu_ps&ig_expand=4105

2025

[35] [35]

mm512 reduce add ps, July 2025

Intel Intrinsics Guide. mm512 reduce add ps, July 2025. URLhttps://www.intel.com/content/ www/us/en/docs/intrinsics-guide/index.html#text=_mm512_reduce_add_ps&avx512techs= AVX512F&ig_expand=5303

2025

[36] [36]

Accelerating large-scale inference with anisotropic vector quantization

Ruiqi Guo, Philip Sun, Erik Lindgren, Quan Geng, David Simcha, Felix Chern, and Sanjiv Kumar. Accelerating large-scale inference with anisotropic vector quantization. InProceedings of the 37th In- ternational Conference on Machine Learning, pages 3887–3896. PMLR, November 2020

2020

[37] [37]

Approximate nearest neighbors: towards removing the curse of dimensionality

Piotr Indyk and Rajeev Motwani. Approximate nearest neighbors: towards removing the curse of dimensionality. InProceedings of the thirtieth annual ACM symposium on Theory of computing, STOC ’98, pages 604–613, New York, NY, USA, May 1998. Association for Computing Machinery

1998

[38] [38]

A survey on locality sensitive hashing algorithms and their applications, February 2021

Omid Jafari, Preeti Maurya, Parth Nagarkar, Khandker Mushfiqul Islam, and Chidambaram Crushev. A survey on locality sensitive hashing algorithms and their applications, February 2021. arXiv:2102.08942 [cs]

work page arXiv 2021

[39] [39]

Billion-scale similarity search with gpus.IEEE Trans- actions on Big Data, 7(3):535–547, July 2021

Jeff Johnson, Matthijs Douze, and Herv´ e J´ egou. Billion-scale similarity search with gpus.IEEE Trans- actions on Big Data, 7(3):535–547, July 2021

2021

[40] [40]

Muon: An optimizer for hidden layers in neural networks, 2024

Keller Jordan, Yuchen Jin, Vlado Boza, Jiacheng You, Franz Cesista, Laker Newhouse, and Jeremy Bern- stein. Muon: An optimizer for hidden layers in neural networks, 2024. URLhttps://kellerjordan. github.io/posts/muon/

2024

[41] [41]

Biing-Hwang Juang and A. Gray. Multiple stage vector quantization for speech coding. InICASSP ’82. IEEE International Conference on Acoustics, Speech, and Signal Processing, volume 7, pages 597–600, May 1982

1982

[42] [42]

Caselaw access project embeddings

justicedao. Caselaw access project embeddings. URLhttps://huggingface.co/datasets/ justicedao/Caselaw_Access_Project_embeddings

[43] [43]

Product quantization for nearest neighbor search

Herve J´ egou, Matthijs Douze, and Cordelia Schmid. Product quantization for nearest neighbor search. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(1):117–128, January 2011

2011

[44] [44]

Locally optimized product quantization for approximate nearest neighbor search

Yannis Kalantidis and Yannis Avrithis. Locally optimized product quantization for approximate nearest neighbor search. In2014 IEEE Conference on Computer Vision and Pattern Recognition, pages 2329– 2336, June 2014

2014

[45] [45]

Retrieval-augmented generation for knowledge-intensive nlp tasks

Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich K¨ uttler, Mike Lewis, Wen-tau Yih, Tim Rockt¨ aschel, Sebastian Riedel, and Douwe Kiela. Retrieval-augmented generation for knowledge-intensive nlp tasks. InAdvances in Neural Information Processing Systems, pages 9459–9474. Curran Associates Inc., Dec...

2020

[46] [46]

Improved Residual Vector Quantization for High-dimensional Approximate Nearest Neighbor Search

Shicong Liu, Hongtao Lu, and Junru Shao. Improved residual vector quantization for high-dimensional approximate nearest neighbor search, September 2015. arXiv:1509.05195 [cs]. 20

work page internal anchor Pith review Pith/arXiv arXiv 2015

[47] [47]

S. Lloyd. Least squares quantization in pcm.IEEE Transactions on Information Theory, 28(2):129–137, March 1982

1982

[48] [48]

A survey on deep hashing methods.ACM Trans

Xiao Luo, Haixin Wang, Daqing Wu, Chong Chen, Minghua Deng, Jianqiang Huang, and Xian-Sheng Hua. A survey on deep hashing methods.ACM Trans. Knowl. Discov. Data, 17(1):15:1–15:50, February 2023

2023

[49] [49]

Malkov and D

Yu A. Malkov and D. A. Yashunin. Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs.IEEE Transactions on Pattern Analysis and Machine Intel- ligence, 42(4):824–836, April 2020

2020

[50] [50]

Hoos, and James J

Julieta Martinez, Joris Clement, Holger H. Hoos, and James J. Little. Revisiting additive quantization. In Bastian Leibe, Jiri Matas, Nicu Sebe, and Max Welling, editors,Computer Vision – ECCV 2016, pages 137–153, Cham, 2016. Springer International Publishing

2016

[51] [51]

J. Max. Quantizing for minimum distortion.IRE Transactions on Information Theory, 6(1):7–12, March 1960

1960

[52] [52]

Marius Muja and David G. Lowe. Scalable nearest neighbor algorithms for high dimensional data.IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(11):2227–2240, November 2014

2014

[53] [53]

wiki mpnet embeddings

olmer. wiki mpnet embeddings. URLhttps://huggingface.co/datasets/olmer/wiki_mpnet_ embeddings

[54] [54]

Training deep learning models with norm-constrained lmos

Thomas Pethick, Wanyun Xie, Kimon Antonakopoulos, Zhenyu Zhu, Antonio Silveti-Falls, and Volkan Cevher. Training deep learning models with norm-constrained lmos. InProceedings of the 42nd In- ternational Conference on Machine Learning, volume 267 ofICML’25, pages 49069–49104, Vancouver, Canada, July 2025. JMLR.org

2025

[55] [55]

dbpedia-entities-openai3-text-embedding-3-large-1536-1m, 2024

Qdrant. dbpedia-entities-openai3-text-embedding-3-large-1536-1m, 2024. URLhttps://huggingface. co/datasets/Qdrant/dbpedia-entities-openai3-text-embedding-3-large-1536-1M

2024

[56] [56]

dbpedia-entities-openai3-text-embedding-3-large-3072-1m, 2024

Qdrant. dbpedia-entities-openai3-text-embedding-3-large-3072-1m, 2024. URLhttps://huggingface. co/datasets/Qdrant/dbpedia-entities-openai3-text-embedding-3-large-3072-1M

2024

[57] [57]

Faiss, May 2026

Meta Research. Faiss, May 2026. URLhttps://github.com/facebookresearch/faiss. Accessed: 2026-05-07

2026

[58] [58]

Diskann: fast accurate billion-point nearest neighbor search on a single node

Suhas Jayaram Subramanya, Devvrit, Rohan Kadekodi, Ravishankar Krishaswamy, and Harsha Vard- han Simhadri. Diskann: fast accurate billion-point nearest neighbor search on a single node. InAdvances on Neural Information Processing Systems, pages 13766–13776, Red Hook, NY, USA, December 2019. Curran Associates Inc

2019

[59] [59]

Mariano Tepper, Ishwar Singh Bhati, Cecilia Aguerrebere, Mark Hildebrand, and Theodore L. Willke. Leanvec: Searching vectors faster by making them fit.Transactions on Machine Learning Research, January 2024

2024

[60] [60]

Gleanvec: Accelerating vector search with minimalist nonlinear dimensionality reduction, October 2024

Mariano Tepper, Ishwar Singh Bhati, Cecilia Aguerrebere, and Ted Willke. Gleanvec: Accelerating vector search with minimalist nonlinear dimensionality reduction, October 2024. arXiv:2410.22347 [cs]

work page arXiv 2024

[61] [61]

Neural discrete representation learning

Aaron van den Oord, Oriol Vinyals, and Koray Kavukcuoglu. Neural discrete representation learning. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS’17, pages 6309–6318, Red Hook, NY, USA, December 2017. Curran Associates Inc

2017

[62] [62]

Drive: One-bit distributed mean estimation

Shay Vargaftik, Ran Ben-Basat, Amit Portnoy, Gal Mendelson, Yaniv Ben-Itzhak, and Michael Mitzen- macher. Drive: One-bit distributed mean estimation. InAdvances in Neural Information Processing Systems, volume 34, pages 362–377. Curran Associates, Inc., 2021. 21

2021

[63] [63]

Eden: Communication-efficient and robust distributed mean estimation for federated learning

Shay Vargaftik, Ran Ben Basat, Amit Portnoy, Gal Mendelson, Yaniv Ben Itzhak, and Michael Mitzen- macher. Eden: Communication-efficient and robust distributed mean estimation for federated learning. InProceedings of the 39th International Conference on Machine Learning, pages 21984–22014. PMLR, June 2022

2022

[64] [64]

Rabitq-library, 2025

VectorDB-NTU. Rabitq-library, 2025. URLhttps://github.com/VectorDB-NTU/RaBitQ-Library. Accessed: 2026-05-15

2025

[65] [65]

A survey on learning to hash.IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4):769–790, April 2018

Jingdong Wang, Ting Zhang, jingkuan song, Nicu Sebe, and Heng Tao Shen. A survey on learning to hash.IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4):769–790, April 2018

2018

[66] [66]

Spectral hashing

Yair Weiss, Antonio Torralba, and Rob Fergus. Spectral hashing. InProceedings of the 22nd Inter- national Conference on Neural Information Processing Systems, NIPS’08, pages 1753–1760, Red Hook, NY, USA, December 2008. Curran Associates Inc

2008

[67] [67]

Circulant binary embedding

Felix Yu, Sanjiv Kumar, Yunchao Gong, and Shih-Fu Chang. Circulant binary embedding. InProceed- ings of the 31st International Conference on Machine Learning, pages 946–954. PMLR, June 2014

2014

[68] [68]

Turboquant: Online vector quanti- zation with near-optimal distortion rate

Amir Zandieh, Majid Daliri, Majid Hadian, and Vahab Mirrokni. Turboquant: Online vector quanti- zation with near-optimal distortion rate. October 2025

2025

[69] [69]

Composite quantization for approximate nearest neighbor search

Ting Zhang, Chao Du, and Jingdong Wang. Composite quantization for approximate nearest neighbor search. InProceedings of the 31st International Conference on International Conference on Machine Learning - Volume 32, ICML’14, pages II–846, Beijing, China, June 2014. JMLR.org

2014

[70] [70]

Yu, Ruiqi Guo, Sanjiv Kumar, Shengjin Wang, and Shi-Fu Chang

Xu Zhang, Felix X. Yu, Ruiqi Guo, Sanjiv Kumar, Shengjin Wang, and Shi-Fu Chang. Fast orthogonal projection based on kronecker product. pages 2929–2937, 2015. A Additional similarity functions The Euclidean distance is straightforward to incorporate in the current framework. We start by decomposing the distance∥q−x i∥2 2 as follows ∥q−x i∥2 2 =∥q−µ ∗ i +µ...

2015