pith. machine review for the scientific record.

arxiv: 2604.20121 · v2 · submitted 2026-04-22 · 💻 cs.DB

Recognition: unknown

A GPU-Accelerated Framework for Multi-Attribute Range Filtered Approximate Nearest Neighbor Search

Haoran Yu, Yifan Zhu, Yunjun Gao, Zhonggen Li, Zixuan Xu

Pith reviewed 2026-05-09 23:30 UTC · model grok-4.3

classification 💻 cs.DB
keywords range filtered approximate nearest neighbor search · GPU acceleration · vector database · graph index · cell partitioning · out-of-core processing · multi-attribute filtering

The pith

Garfield delivers a GPU-accelerated index for range-filtered nearest neighbor search that is 4.4 times smaller and 119.8 times faster than prior methods.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Garfield to address two problems in range-filtered approximate nearest neighbor search (RFANNS): oversized indexes and limited CPU throughput. It builds a cell-based graph index that keeps storage and build costs linear by adding only a fixed number of links between cells. Queries run efficiently on the GPU through a reordering strategy that reuses good starting points across cells, and a streaming pipeline lets datasets larger than GPU memory be handled without full loading. Together these choices let the system process filtered vector queries at much higher rates than existing CPU-only approaches while using less memory.

Core claim

Garfield partitions the dataset into cells and constructs a local graph index inside each cell, then adds only a constant number of cross-cell edges to keep total storage and construction time linear in the data size. During queries it applies cluster-guided ordering to the relevant cells so the GPU can traverse them sequentially while passing strong candidate points from one cell to the next as entry points. For data that exceeds GPU memory the framework uses a cell-oriented out-of-core pipeline that schedules cells to reduce active queries per batch and overlaps CPU-to-GPU index transfers with ongoing computation.
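The traversal just described can be sketched in a few lines of Python. This is illustrative only: the per-cell graph expansion and the GPU kernels are stood in for by a plain scan, and all names here are ours, not the paper's.

```python
import heapq

def query_gmg(cells, q, dist, k):
    """Illustrative sketch of Garfield-style cell-by-cell search.

    `cells` is assumed to be the query-relevant cells, already reordered
    by the cluster-guided strategy. A heap of the current best k results
    is carried from cell to cell, so candidates found early seed the
    search in later cells.
    """
    best = []      # heap of (-distance, point); worst result sits on top
    entries = []   # entry points reused across cells
    for cell in cells:
        # stand-in for the per-cell graph search: a real index would
        # expand the local graph starting from `entries` rather than
        # scanning every point in the cell
        for p in cell:
            d = dist(q, p)
            if len(best) < k:
                heapq.heappush(best, (-d, p))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, p))
        entries = [p for _, p in best]   # carry candidates forward
    return sorted((-nd, p) for nd, p in best)
```

With one-dimensional points and absolute distance, `query_gmg([[1.0, 5.0], [2.0, 9.0]], 0.0, lambda q, p: abs(q - p), 2)` returns the two closest points across both cells.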

What carries the argument

The GMG index, a cell-partitioned graph structure that adds a constant number of cross-cell edges and supports cluster-guided ordering for GPU traversal.

If this is right

  • Index size and build time scale linearly with data volume instead of growing super-linearly.
  • GPU traversal reuses candidates across cells, turning limited memory bandwidth into sustained high throughput.
  • The out-of-core scheduler allows queries on datasets larger than GPU memory by streaming only the needed cells.
  • Overall query rates exceed those of CPU-based RFANNS systems by more than two orders of magnitude.
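The third point, overlapping index transfers with computation, is classic double buffering. A minimal sketch under assumed helper names (`load_cell`, `search_cell` are hypothetical; Garfield itself overlaps CUDA transfers over PCIe, not Python threads):

```python
from concurrent.futures import ThreadPoolExecutor

def stream_cells(cell_ids, load_cell, search_cell):
    """Double-buffered cell streaming: while the current cell is being
    searched, the next cell's index is already being copied over."""
    if not cell_ids:
        return []
    results = []
    with ThreadPoolExecutor(max_workers=1) as io:
        pending = io.submit(load_cell, cell_ids[0])   # prefetch first cell
        for i, cid in enumerate(cell_ids):
            cell = pending.result()                   # wait for its copy
            if i + 1 < len(cell_ids):
                # start the next copy before computing on this cell
                pending = io.submit(load_cell, cell_ids[i + 1])
            results.append(search_cell(cid, cell))    # compute overlaps the copy
    return results
```

If the per-cell search takes about as long as the per-cell transfer, this hides nearly all of the I/O latency behind compute.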

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same cell-partitioning idea could be tested on update-heavy workloads to see whether local graph maintenance stays efficient.
  • If the ordering strategy generalizes, similar reuse techniques might improve other GPU graph searches that involve attribute predicates.
  • Hardware with larger GPU memory pools would reduce the frequency of out-of-core transfers and potentially raise the observed speedups further.

Load-bearing premise

That the constant cross-cell edges and cluster-guided ordering will keep delivering linear storage and high candidate reuse no matter how the data points and filter attributes are distributed in practice.
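The linear-storage half of this premise is simple edge counting, which can be made concrete. The symbol names below are ours, not the paper's:

```python
def gmg_edge_count(n, cell_size, degree, cross_edges_per_cell):
    """Total edges in a GMG-style index under the stated design:
    local graph edges plus a constant number of cross-cell links per
    cell. (Parameter names are illustrative.)"""
    num_cells = -(-n // cell_size)              # ceil division
    local = n * degree                          # intra-cell graph edges
    cross = cross_edges_per_cell * num_cells    # constant per cell
    return local + cross

# doubling n doubles the edge count: linear scaling in data volume
e1 = gmg_edge_count(1_000_000, 1000, 32, 64)
e2 = gmg_edge_count(2_000_000, 1000, 32, 64)
```

What the arithmetic cannot settle is the other half of the premise: whether a constant number of cross-cell links preserves recall when points and filter attributes are adversarially distributed.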

What would settle it

Measure index size growth and query throughput on a dataset whose points form many small irregular clusters and whose filters select only a tiny fraction of points per query; if index size grows super-linearly or the speedup drops below 10x relative to CPU baselines, the central claims do not hold.
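A workload of that shape is easy to generate. The sketch below builds many small Gaussian clusters and a scalar attribute with a highly selective range predicate; every parameter choice here is ours, not the paper's:

```python
import random

def stress_workload(n_points, n_clusters, selectivity, dim=4, seed=0):
    """Falsification workload: many small irregular clusters plus a
    range filter keeping only ~`selectivity` of the points."""
    rng = random.Random(seed)
    centers = [[rng.uniform(-1, 1) for _ in range(dim)]
               for _ in range(n_clusters)]
    points = []
    for _ in range(n_points):
        c = rng.choice(centers)
        vec = [x + rng.gauss(0, 0.01) for x in c]   # tight cluster noise
        attr = rng.random()                          # uniform scalar attribute
        points.append((vec, attr))
    # a range predicate on the attribute keeping ~selectivity of points
    lo = rng.random() * (1 - selectivity)
    return points, (lo, lo + selectivity)

pts, (lo, hi) = stress_workload(10_000, 200, 0.001)
kept = sum(lo <= a <= hi for _, a in pts)   # ~10 of 10,000 points pass
```

Feeding such a workload to Garfield and the CPU baselines, then comparing index size and throughput, is exactly the experiment the claim above calls for.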

Figures

Figures reproduced from arXiv: 2604.20121 by Haoran Yu, Yifan Zhu, Yunjun Gao, Zhonggen Li, Zixuan Xu.

Figure 1
Figure 1. Vanilla ANNS index (HNSW [33]) vs. RFANN indexes (iRangeGraph [52] and UNIFY [29]) on SIFT1M. view at source ↗
Figure 3
Figure 3. Overview of the query processing in Garfield. view at source ↗
Figure 4
Figure 4. Pipeline of cluster-based cell ordering. view at source ↗
Figure 5
Figure 5. Pipeline of out-of-core RFANNS in Garfield. view at source ↗
Figure 7
Figure 7. Comparison of all methods on 6 real-world datasets. view at source ↗
Figure 9
Figure 9. Impact of various numbers of filtered attributes (l = 1, 2, 4, 8). view at source ↗
Figure 13
Figure 13. Ablation of inter-cell edges and cell ordering. view at source ↗
read the original abstract

Range-filtered approximate nearest neighbor search (RFANNS) is increasingly critical for modern vector databases. However, existing solutions suffer from severe index inflation and construction overhead. Furthermore, they rely exclusively on CPUs for the heavy indexing and query processing, significantly restricting the throughput due to the limited memory bandwidth and parallelism. In this paper, we present Garfield, a GPU-accelerated framework for multi-attribute range filtered ANNS that overcomes these bottlenecks through designing a lightweight index structure and hardware-aware execution pipeline. Garfield introduces the GMG index, which partitions data into cells and builds local graph indexes. It guarantees linear storage and indexing overhead by adding a constant number of cross-cell edges. For queries, Garfield utilizes a cluster-guided ordering strategy that reorders query-relevant cells, enabling a highly efficient cell-by-cell traversal on the GPU that aggressively reuses candidates as entry points across cells. To handle datasets exceeding GPU memory, Garfield features a cell-oriented out-of-core pipeline. It dynamically schedules cells to minimize the number of active queries per batch and overlaps GPU computation with CPU-to-GPU index streaming. Extensive evaluations demonstrate that Garfield reduces index size by 4.4x, while delivering 119.8x higher throughput than state-of-the-art RFANNS methods.

Editorial analysis

A structured set of objections, weighed in public.

Referee report, simulated authors' rebuttal, circularity audit, and axiom ledger. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 0 minor

Summary. The manuscript presents Garfield, a GPU-accelerated framework for multi-attribute range filtered approximate nearest neighbor search (RFANNS). It introduces the GMG index, which partitions the dataset into cells and builds local graph indexes within cells, adding only a constant number of cross-cell edges to achieve linear storage and indexing overhead. For query processing, Garfield uses a cluster-guided ordering strategy to reorder relevant cells, enabling efficient cell-by-cell traversal on the GPU that reuses candidates across cells. An out-of-core pipeline is provided for datasets exceeding GPU memory, scheduling cells to overlap computation and data transfer. The paper reports that Garfield achieves a 4.4× reduction in index size and 119.8× higher throughput compared to state-of-the-art RFANNS methods.

Significance. If the empirical results are reproducible and generalize, this work would offer a significant advancement in handling range-filtered ANNS on GPUs, mitigating index inflation and CPU limitations common in vector databases. The design of a lightweight index with constant cross-cell edges and the hardware-aware query pipeline with candidate reuse are practical contributions that could influence future GPU-accelerated search systems. The out-of-core support further extends applicability to large-scale datasets.

major comments (3)
  1. The performance claims in the abstract (4.4x index size reduction and 119.8x throughput) are not supported by sufficient details in the experimental evaluation regarding baseline implementations, dataset characteristics, hardware configuration, or statistical significance, preventing independent verification of the central claims.
  2. The assertion that adding a constant number of cross-cell edges guarantees linear storage lacks a formal proof or bound on the number of edges under varying cell sizes or filter selectivities, which is load-bearing for the index size claim.
  3. The cluster-guided cell ordering strategy is claimed to enable high reuse, but no analysis or additional experiments are provided for cases where range filters result in low cell overlap or non-uniform distributions, potentially affecting the throughput gains.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their thorough review and valuable feedback. We have revised the manuscript to address all major comments, providing additional details, analysis, and experiments as detailed in the point-by-point responses below.

read point-by-point responses
  1. Referee: The performance claims in the abstract (4.4x index size reduction and 119.8x throughput) are not supported by sufficient details in the experimental evaluation regarding baseline implementations, dataset characteristics, hardware configuration, or statistical significance, preventing independent verification of the central claims.

    Authors: We agree that the experimental section lacked sufficient details for full reproducibility. In the revised version, we have added comprehensive information on baseline implementations (including specific libraries, versions, and tuned parameters), dataset characteristics (sizes, dimensions, attribute distributions, and how they were generated), hardware configuration (exact GPU model, CPU specs, memory sizes, and software environment), and statistical significance (results averaged over 10 independent runs with standard deviations reported). Furthermore, we have released the full source code and evaluation scripts to allow independent verification of the reported speedups and index sizes. revision: yes

  2. Referee: The assertion that adding a constant number of cross-cell edges guarantees linear storage lacks a formal proof or bound on the number of edges under varying cell sizes or filter selectivities, which is load-bearing for the index size claim.

    Authors: We acknowledge the need for a more rigorous treatment. The original manuscript argued that since a constant number of cross-cell edges are added per cell, and the number of cells is proportional to the dataset size (for fixed cell size), the total storage remains linear. However, to strengthen this, we have included a formal bound in the revised Section 3: the number of cross-cell edges is at most C * num_cells, where C is a small constant (e.g., 2 * degree of the graph), independent of cell size. Note that filter selectivities do not impact index construction or size, as the index is built once without reference to queries. We have clarified this distinction and provided the proof sketch. revision: yes

  3. Referee: The cluster-guided cell ordering strategy is claimed to enable high reuse, but no analysis or additional experiments are provided for cases where range filters result in low cell overlap or non-uniform distributions, potentially affecting the throughput gains.

    Authors: We appreciate the referee highlighting this potential limitation. While the core design of the cluster-guided ordering aims to maximize reuse by traversing cells in a locality-preserving order, we recognize that additional validation is beneficial. In the revised manuscript, we have added a new subsection with experiments on low-overlap scenarios (using highly selective range filters that touch few cells) and non-uniform distributions (e.g., Zipfian attribute distributions). These experiments show that the throughput gains are largely preserved even in these challenging cases, thanks to the adaptive scheduling in the out-of-core pipeline. We also provide a brief analytical discussion on expected reuse based on cell overlap. revision: yes
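The Zipfian attribute distribution mentioned in the response can be generated in a few lines. This is a sketch only; the skew exponent and value range below are our choices, not the paper's:

```python
import random
from collections import Counter

def zipf_attribute(n, n_values, s=1.2, seed=0):
    """Draw n attribute values from {0, ..., n_values-1} with Zipfian
    weights 1/rank**s, so a few values dominate (skewed filters)."""
    rng = random.Random(seed)
    weights = [1.0 / (rank ** s) for rank in range(1, n_values + 1)]
    return rng.choices(range(n_values), weights=weights, k=n)

attrs = zipf_attribute(10_000, 100)
counts = Counter(attrs)   # head values are drawn far more often
```

Range filters over such an attribute touch very unequal cell populations, which is precisely the low-overlap regime the referee asked about.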

Circularity Check

0 steps flagged

No circularity: empirical results from system design and evaluation

full rationale

The paper describes a GPU-accelerated RFANNS framework (Garfield) built around the GMG index (cell partitioning + local graphs + constant cross-cell edges for linear storage) and a cluster-guided cell ordering query pipeline with out-of-core scheduling. The headline claims (4.4x index reduction, 119.8x throughput) are direct empirical measurements on reported datasets and filter workloads, not quantities obtained by fitting parameters inside the paper's own equations and then re-deriving them as predictions. No self-definitional loops, fitted-input-as-prediction steps, or load-bearing self-citations appear in the abstract or construction description. The linear-storage guarantee is an explicit algorithmic property (constant edges per cell), not a circular reduction. The work is self-contained against external benchmarks via standard experimental comparison.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The central claim rests on the design assumption that data admits a cell partitioning whose local graphs plus a constant number of cross-cell edges suffice for correct approximate search, and that GPU memory bandwidth and parallelism can be exploited via the described ordering and streaming without hidden bottlenecks.

free parameters (1)
  • number of cross-cell edges
    Chosen as a small constant to guarantee linear storage overhead; exact value not specified in abstract.
axioms (1)
  • domain assumption: The input dataset can be partitioned into cells such that local graph indexes plus limited cross-cell links preserve approximate nearest-neighbor quality under range filters.
    Invoked by the GMG index construction described in the abstract.
invented entities (1)
  • GMG index · no independent evidence
    purpose: Lightweight cell-partitioned graph index with constant cross-cell edges for GPU-efficient RFANNS.
    New structure introduced by the paper; no independent evidence outside the claimed experiments.

pith-pipeline@v0.9.0 · 5531 in / 1411 out tokens · 53713 ms · 2026-05-09T23:30:27.788696+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

73 extracted references

  1. [1]

    2010. SIFT. http://corpus-texmex.irisa.fr

  2. [2]

    2019. YouTube. https://research.google.com/youtube8m/download.html

  3. [3]

    2025. DBLP. https://open.aminer.cn/open/article?id=655db2202ab17a072284bc0c

  4. [4]

    Anas Ait Aomar, Karima Echihabi, Marco Arnaboldi, Ioannis Alagiannis, Damien Hilloulin, and Manal Cherkaoui. 2025. RWalks: Random walks as attribute diffusers for filtered vector search. In SIGMOD. 212:1–212:26

  5. [5]

    Ilias Azizi, Karima Echihabi, and Themis Palpanas. 2025. Graph-based vector search: An experimental evaluation of the state-of-the-art. In SIGMOD. 43:1–43:31

  6. [6]

    Artem Babenko and Victor Lempitsky. 2016. Efficient indexing of billion-scale datasets of deep descriptors. In CVPR. 2055–2063

  7. [7]

    Yuzheng Cai, Jiayang Shi, Yizhuo Chen, and Weiguo Zheng. 2024. Navigating labels and vectors: A unified approach to filtered approximate nearest neighbor search. In SIGMOD. 246:1–246:27

  8. [8]

    Cheng Chen, Chenzhe Jin, Yunan Zhang, Sasha Podolsky, Chun Wu, Szu-Po Wang, Eric Hanson, Zhou Sun, Robert Walzer, and Jianguo Wang. 2024. Singlestore-v: An integrated vector database system in singlestore. PVLDB 17, 12 (2024), 3772–3785

  9. [9]

    Jatin Chhugani, Anthony D Nguyen, Victor W Lee, William Macy, Mostafa Hagog, Yen-Kuang Chen, Akram Baransi, Sanjeev Kumar, and Pradeep Dubey

  10. [10]

    Efficient implementation of sorting on multi-core SIMD CPU architecture. PVLDB 1, 2 (2008), 1313–1324

  11. [11]

    Cong Fu, Chao Xiang, Changxu Wang, and Deng Cai. 2019. Fast approximate nearest neighbor search with the navigating spreading-out graph. PVLDB 12, 5 (2019), 461–474

  12. [12]

    Siddharth Gollapudi, Neel Karia, Varun Sivashankar, Ravishankar Krishnaswamy, Nikit Begwani, Swapnil Raz, Yiyong Lin, Yin Zhang, Neelam Mahapatro, Premkumar Srinivasan, et al. 2023. Filtered-diskann: Graph algorithms for approximate nearest neighbor search with filters. In WWW. 3406–3416

  13. [13]

    Zengyang Gong, Yuxiang Zeng, and Lei Chen. 2025. Accelerating approximate nearest neighbor search in hierarchical graphs: Efficient level navigation with shortcuts. PVLDB 18, 10 (2025), 3518–3530

  14. [14]

    Yutong Gou, Jianyang Gao, Yuexuan Xu, and Cheng Long. 2025. SymphonyQG: Towards symphonious integration of quantization and graph for approximate nearest neighbor search. In SIGMOD. 80:1–80:26

  15. [15]

    Yuxing Han, Ziniu Wu, Peizhi Wu, Rong Zhu, Jingyi Yang, Liang Wei Tan, Kai Zeng, Gao Cong, Yanzhao Qin, Andreas Pfadler, et al. 2022. Cardinality estimation in DBMS: A comprehensive benchmark evaluation. PVLDB 15, 4 (2022), 752–765

  16. [16]

    Guoyu Hu, Shaofeng Cai, Tien Tuan Anh Dinh, Zhongle Xie, Cong Yue, Gang Chen, and Beng Chin Ooi. 2025. HAKES: Scalable vector database for embedding search service. PVLDB 18, 9 (2025), 3049–3062

  17. [17]

    Haodi Jiang, Hao Guo, Minhui Xie, Jiwu Shu, and Youyou Lu. 2025. High-throughput, cost-effective billion-scale vector search with a single GPU. In SIGMOD. 334:1–334:27

  18. [18]

    Mengxu Jiang, Zhi Yang, Fangyuan Zhang, Guanhao Hou, Jieming Shi, Wenchao Zhou, Feifei Li, and Sibo Wang. 2025. DIGRA: A dynamic graph indexing for approximate nearest neighbor search with range filter. In SIGMOD. 148:1–148:26

  19. [19]

    Wenqi Jiang, Zhenhao He, Shuai Zhang, Kai Zeng, Liang Feng, Jiansong Zhang, Tongxuan Liu, Yong Li, Jingren Zhou, Ce Zhang, et al. 2021. Fleetrec: Large-scale recommendation inference on hybrid gpu-fpga clusters. In SIGKDD. 3097–3105

  20. [20]

    Wenqi Jiang, Marco Zeller, Roger Waleffe, Torsten Hoefler, and Gustavo Alonso

  21. [21]

    Chameleon: A heterogeneous and disaggregated accelerator system for retrieval-augmented language models. PVLDB 18, 1 (2024), 42–52

  22. [22]

    Yicheng Jin, Yongji Wu, Wenjun Hu, Bruce Maggs, Jun Yang, Xiao Zhang, and Danyang Zhuo. 2026. Curator: Efficient vector search with low-selectivity filters. In SIGMOD. 21:1–21:27

  23. [23]

    V Karthik, Saim Khan, Somesh Singh, Harsha Vardhan Simhadri, and Jyothi Vedurada. 2025. BANG: Billion-scale approximate nearest neighbour search using a single GPU. IEEE Transactions on Big Data 11, 6 (2025), 3142–3157

  24. [24]

    Kyoungmin Kim, Sangoh Lee, Injung Kim, and Wook-Shin Han. 2024. Asm: Harmonizing autoregressive model, sampling, and multi-dimensional statistics merging for cardinality estimation. In SIGMOD. 45:1–45:27

  25. [25]

    Sukjin Kim, Seongyeon Park, Si Ung Noh, Junguk Hong, Taehee Kwon, Hunseong Lim, and Jinho Lee. 2025. PathWeaver: A high-throughput multi-GPU system for graph-based approximate nearest neighbor search. In USENIX ATC. 1501–1517

  26. [26]

    Hai Lan, Shixun Huang, Zhifeng Bao, and Renata Borovica-Gajic. 2024. Cardinality estimation for similarity search on high-dimensional data objects: The impact of reference objects. PVLDB 18, 3 (2024), 544–556

  27. [27]

    Mocheng Li, Xiao Yan, Baotong Lu, Yue Zhang, James Cheng, and Chenhao Ma

  28. [28]

    Attribute filtering in approximate nearest neighbor search: An in-depth experimental study. In SIGMOD. 298:1–298:26

  29. [29]

    Peizheng Li, Chaoyi Chen, Hao Yuan, Zhenbo Fu, Hang Shen, Xinbo Yang, Qiange Wang, Xin Ai, Yanfeng Zhang, Yingyou Wen, et al. 2025. Neutronrag: Towards understanding the effectiveness of RAG from a data retrieval perspective. In SIGMOD Companion. 163–166

  30. [30]

    Xiaoxi Li, Jiajie Jin, Yujia Zhou, Yuyao Zhang, Peitian Zhang, Yutao Zhu, and Zhicheng Dou. 2025. From matching to generation: A survey on generative information retrieval. TOIS 43, 3 (2025), 1–62

  31. [31]

    Zhonggen Li, Xiangyu Ke, Yifan Zhu, Bocheng Yu, Baihua Zheng, and Yunjun Gao. 2025. Scalable graph indexing using GPUs for approximate nearest neighbor search. In SIGMOD. 360:1–360:27

  32. [32]

    Anqi Liang, Pengcheng Zhang, Bin Yao, Zhongpu Chen, Yitong Song, and Guangxu Cheng. 2024. Unify: Unified index for range filtered approximate nearest neighbors search. PVLDB 18, 4 (2024), 1118–1130

  33. [33]

    Qiyu Liu, Yanlin Qi, Siyuan Han, Jingshu Peng, Jin Li, and Lei Chen. 2025. Not small enough? SegPQ: A learned approach to compress product quantization codebooks. PVLDB 18, 11 (2025), 3730–3743

  34. [34]

    Duo Lu, Helena Caminal, Manos Chatzakis, Yannis Papakonstantinou, Yannis Chronis, Vaibhav Jain, and Fatma Özcan. 2026. An in-depth study of filter-agnostic vector search on a postgresql database system. arXiv (2026)

  35. [35]

    Ruiyao Ma, Yifan Zhu, Baihua Zheng, Lu Chen, Congcong Ge, and Yunjun Gao

  36. [36]

    GTI: Graph-based tree index with logarithm updates for nearest neighbor search in high-dimensional spaces. PVLDB 18, 4 (2024), 986–999

  37. [37]

    Yu A Malkov and Dmitry A Yashunin. 2018. Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. TPAMI 42, 4 (2018), 824–836

  38. [38]

    Magdalen Dobson Manohar, Zheqi Shen, Guy Blelloch, Laxman Dhulipala, Yan Gu, Harsha Vardhan Simhadri, and Yihan Sun. 2024. Parlayann: Scalable and deterministic parallel graph-based approximate nearest neighbor search algorithms. In PPoPP. 270–285

  39. [39]

    Xupeng Miao, Yining Shi, Hailin Zhang, Xin Zhang, Xiaonan Nie, Zhi Yang, and Bin Cui. 2022. HET-GMP: A graph-based system approach to scaling large embedding model training. In SIGMOD. 470–480

  40. [40]

    Hiroyuki Ootomo, Akira Naruse, Corey Nolet, Ray Wang, Tamas Feher, and Yong Wang. 2024. Cagra: Highly parallel graph construction and approximate nearest neighbor search for gpus. In ICDE. 4236–4247

  41. [41]

    James Jie Pan, Jianguo Wang, and Guoliang Li. 2024. Survey of vector database management systems. VLDBJ 33, 5 (2024), 1591–1615

  42. [42]

    Liana Patel, Peter Kraft, Carlos Guestrin, and Matei Zaharia. 2024. Acorn: Performant and predicate-agnostic search over vector embeddings and structured data. In SIGMOD. 120:1–120:27

  43. [43]

    Yun Peng, Byron Choi, Tsz Nam Chan, Jianye Yang, and Jianliang Xu. 2023. Efficient approximate nearest neighbor search in multi-dimensional databases. In SIGMOD. 54:1–54:27

  44. [44]

    Zhencan Peng, Miao Qiao, Wenchao Zhou, Feifei Li, and Dong Deng. 2025. Dynamic range-filtering approximate nearest neighbor search. PVLDB 18, 10 (2025), 3256–3268

  45. [45]

    Runwen Qiu and Jing Tang. 2025. Efficient approximate nearest neighbor search via hemi-sphere centroids graph. In SIGMOD. 321:1–321:26

  46. [46]

    Gaurav Sehgal and Semih Salihoglu. 2025. NaviX: A native vector index design for graph dbmss with robust predicate-agnostic search performance. PVLDB 18, 11 (2025), 4438–4450

  47. [47]

    Bing Tian, Haikun Liu, Yuhang Tang, Shihai Xiao, Zhuohui Duan, Xiaofei Liao, Hai Jin, Xuecang Zhang, Junhua Zhu, and Yu Zhang. 2025. Towards high-throughput and low-latency billion-scale vector search via CPU/GPU collaborative filtering and re-ranking. In FAST. 171–185

  48. [48]

    Jianguo Wang, Xiaomeng Yi, Rentong Guo, Hai Jin, Peng Xu, Shengjun Li, Xiangyu Wang, Xiangzhou Guo, Chengming Li, Xiaohai Xu, et al. 2021. Milvus: A purpose-built vector data management system. In SIGMOD. 2614–2627

  49. [49]

    Mengzhao Wang, Lingwei Lv, Xiaoliang Xu, Yuxiang Wang, Qiang Yue, and Jiongkang Ni. 2023. An efficient and robust framework for approximate nearest neighbor search with attribute constraint. NeurIPS 36 (2023), 15738–15751

  50. [50]

    Mengzhao Wang, Xiaoliang Xu, Qiang Yue, and Yuxiang Wang. 2021. A comprehensive survey and experimental comparison of graph-based approximate nearest neighbor search. PVLDB 14, 11 (2021), 1964–1978

  51. [51]

    Ziqi Wang, Jingzhe Zhang, and Wei Hu. 2025. WoW: A window-to-window incremental index for range-filtering approximate nearest neighbor search. In SIGMOD. 378:1–378:27

  52. [52]

    Chuangxian Wei, Bin Wu, Sheng Wang, Renjie Lou, Chaoqun Zhan, Feifei Li, and Yuanzhe Cai. 2020. AnalyticDB-V: A hybrid analytical engine towards query fusion for structured and unstructured data. PVLDB 13, 12 (2020), 3152–3165

  53. [53]

    Jiuqi Wei, Xiaodong Lee, Zhenyu Liao, Themis Palpanas, and Botao Peng. 2025. Subspace collision: An efficient and accurate framework for high-dimensional approximate nearest neighbor search. In SIGMOD. 79:1–79:29

  54. [54]

    Wei Wu, Junlin He, Yu Qiao, Guoheng Fu, Li Liu, and Jin Yu. 2022. HQANN: Efficient and robust similarity search for hybrid queries with structured and unstructured constraints. In CIKM. 4580–4584

  55. [55]

    Jingyi Xi, Chenghao Mo, Ben Karsin, Artem Chirkin, Mingqin Li, and Minjia Zhang. 2025. VecFlow: A high-performance vector data management system for filtered-search on GPUs. In SIGMOD. 271:1–271:27

  56. [56]

    Yuexuan Xu, Jianyang Gao, Yutong Gou, Cheng Long, and Christian S Jensen

  57. [57]

    iRangeGraph: Improvising range-dedicated graphs for range-filtering nearest neighbor search. In SIGMOD. 239:1–239:26

  58. [58]

    Chen Yang, Sunhao Dai, Yupeng Hou, Wayne Xin Zhao, Jun Xu, Yang Song, and Hengshu Zhu. 2024. Revisiting reciprocal recommender systems: Metrics, formulation, and method. In SIGKDD. 3714–3723

  59. [59]

    Wen Yang, Tao Li, Gai Fang, and Hong Wei. 2020. Pase: Postgresql ultra-high-dimensional approximate nearest neighbor search extension. In SIGMOD. 2241–2253

  60. [60]

    Chunxiao Ye, Xiao Yan, and Eric Lo. 2025. Compass: General filtered search across vector and structured data. arXiv (2025)

  61. [61]

    Yuanhang Yu, Dawei Cheng, Ying Zhang, Lu Qin, Wenjie Zhang, and Xuemin Lin. 2026. Efficient approximate nearest neighbor search under multi-attribute range filter. arXiv (2026)

  62. [62]

    Yuanhang Yu, Dong Wen, Ying Zhang, Lu Qin, Wenjie Zhang, and Xuemin Lin

  63. [63]

    GPU-accelerated proximity graph approximate nearest neighbor search and construction. In ICDE. 552–564

  64. [64]

    Yuxiang Zeng, Yongxin Tong, and Lei Chen. 2023. Litehst: A tree embedding based method for similarity search. In SIGMOD. 35:1–35:26

  65. [65]

    Fangyuan Zhang, Mengxu Jiang, Guanhao Hou, Jieming Shi, Hua Fan, Wenchao Zhou, Feifei Li, and Sibo Wang. 2025. Efficient dynamic indexing for range filtered approximate nearest neighbor search. In SIGMOD. 152:1–152:26

  66. [66]

    Huayi Zhang, Lei Cao, Yizhou Yan, Samuel Madden, and Elke A Rundensteiner

  67. [67]

    Continuously adaptive similarity search. In SIGMOD. 2601–2616

  68. [68]

    Qianxi Zhang, Shuotao Xu, Qi Chen, Guoxin Sui, Jiadong Xie, Zhizhen Cai, Yaoqi Chen, Yinxuan He, Yuqing Yang, Fan Yang, et al. 2023. Vbase: Unifying online vector similarity search and relational queries via relaxed monotonicity. In OSDI. 377–395

  69. [69]

    Weijie Zhao, Shulong Tan, and Ping Li. 2020. Song: Approximate nearest neighbor search on gpu. In ICDE. 1033–1044

  70. [70]

    Xi Zhao, Yao Tian, Kai Huang, Bolong Zheng, and Xiaofang Zhou. 2023. Towards efficient index construction and approximate nearest neighbor search in high-dimensional spaces. PVLDB 16, 8 (2023), 1979–1991

  71. [71]

    Yingli Zhou, Yaodong Su, Youran Sun, Shu Wang, Taotao Wang, Runyuan He, Yongwei Zhang, Sicong Liang, Xilin Liu, Yuchi Ma, et al. 2025. In-depth analysis of graph-based RAG in a unified framework. PVLDB 18, 13 (2025), 5623–5637

  72. [72]

    Jiaxu Zhu, Jiayu Yuan, Kaiwen Yang, Xiaobao Chen, Shihuan Yu, Hongchang Lv, Yan Li, and Bolong Zheng. 2025. An experimental evaluation of hybrid querying on vectors. PVLDB 19, 2 (2025), 183–195

  73. [73]

    Chaoji Zuo, Miao Qiao, Wenchao Zhou, Feifei Li, and Dong Deng. 2024. Serf: Segment graph for range-filtering approximate nearest neighbor search. In SIGMOD. 69:1–69:26