arxiv: 2604.16402 · v1 · submitted 2026-03-31 · 💻 cs.DB · cs.IR

Recognition: no theorem link

GRAB-ANNS: High-Throughput Indexing and Hybrid Search via GPU-Native Bucketing

Hengxuan Lou, Jianwei Yin, Junjie Dai, Shuiguang Deng, Xinkui Zhao, Yifan Zhang

Authors on Pith no claims yet

Pith reviewed 2026-05-13 23:44 UTC · model grok-4.3

classification 💻 cs.DB cs.IR

keywords hybrid searchapproximate nearest neighborGPU indexinggraph indexpredicate filteringvector searchbucketinghigh-throughput

0 comments

The pith

GRAB-ANNS maps range predicates to GPU bucket selections and maintains navigability with a hybrid graph of dense local edges plus sparse remote edges.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that directly moving CPU hybrid search algorithms to GPUs suffers from irregular memory access and branch divergence, so GRAB-ANNS starts from GPU hardware constraints instead. It converts predicates into bucket selection through a specialized memory layout that produces coalesced accesses and uniform SIMT execution. A hybrid graph topology keeps the index globally navigable by using dense connections inside buckets for local search and sparse connections between buckets for long-range jumps. An append-only pipeline then allows efficient batched inserts and parallel maintenance without heavy synchronization. Experiments on large datasets confirm the resulting gains in throughput and build speed while recall stays high.

Core claim

GRAB-ANNS is a GPU-native graph index for dynamic hybrid search that introduces a bucket-based memory layout transforming range predicates into lightweight bucket selection for coalesced accesses and efficient SIMT execution, a hybrid graph topology combining dense intra-bucket local edges with sparse inter-bucket remote edges to preserve global navigability under arbitrary filters, and an append-only update pipeline supporting batched insertions and parallel graph maintenance on GPUs.

What carries the argument

The bucket-based memory layout that converts predicates into bucket selections, paired with the hybrid graph topology of dense intra-bucket local edges and sparse inter-bucket remote edges.

Load-bearing premise

The hybrid graph of dense intra-bucket edges and sparse inter-bucket edges preserves global navigability and high recall when arbitrary predicate filters activate only a few buckets.

What would settle it

A measured recall drop below acceptable thresholds on queries with highly selective predicates that touch only a small fraction of buckets, using a dataset where removing inter-bucket edges disconnects the graph.

Figures

Figures reproduced from arXiv: 2604.16402 by Hengxuan Lou, Jianwei Yin, Junjie Dai, Shuiguang Deng, Xinkui Zhao, Yifan Zhang.

**Figure 2.** Figure 2: Construction flow of GRAB-ANNS. 4 8 12 16 20 24 28 32 Graph Degree (M) 0.92 0.94 0.96 0.98 1.00 Recall@10 Dataset Size = 10K 4 8 12 16 20 24 28 32 Graph Degree (M) 0.80 0.85 0.90 0.95 1.00 Dataset Size = 50K DEEP SIFT GIST WIT [PITH_FULL_IMAGE:figures/full_fig_p005_2.png] view at source ↗

**Figure 3.** Figure 3: Impact of Graph Degree on Recall@10 for 10K and [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗

**Figure 2.** Figure 2: 4.3.1 Local Router Construction. The first phase aims to establish high-density connections within each bucket to ensure accurate local search. To achieve this efficiently, we design a highly parallelized pipeline comprising exact 𝑘-NN initialization, batched iterative pruning, and a prioritized edge-merging strategy. Leveraging our bucketing strategy where each partition typically contains fewer than 20,… view at source ↗

**Figure 4.** Figure 4: Update flow of GRAB-ANNS for parallel insert. 4.4.1 Out-of-Order Insertion. To resolve the cascading index shift and fully exploit GPU concurrency, GRAB-ANNS leverages its decoupled Logical Bucketing Architecture to execute Zero-Shift Append-Only Insertions. In the GPU memory hierarchy, dynamic mid-array reallocations or massive sequential shifts (e.g., memmove in VRAM) are prohibitively expensive operatio… view at source ↗

**Figure 5.** Figure 5: The performance of GRAB-ANNS and baselines, testing in four datasets and four ranges [PITH_FULL_IMAGE:figures/full_fig_p009_5.png] view at source ↗

**Figure 6.** Figure 6: Index Construction Comparison. Total time to in [PITH_FULL_IMAGE:figures/full_fig_p010_6.png] view at source ↗

**Figure 8.** Figure 8: The behavior of GRAB-ANNS build and update algorithm, using strongly connected components(scc) to show the graph quality. The fewer scc, the graph better, ideally 1. We compare on different datasets with different dimension and vector space distribution. For GRAB-ANNS-bucket, we just skip the phase 2 mentioned in 4.3. For GRAB-ANNS-insert, we build base graph use 50% data, and insert left 50% data. 5.4 RQ… view at source ↗

**Figure 9.** Figure 9: Scalability on DEEP-96 as the corpus size increases [PITH_FULL_IMAGE:figures/full_fig_p012_9.png] view at source ↗

**Figure 10.** Figure 10: Test search performance on SIFT-128 with different [PITH_FULL_IMAGE:figures/full_fig_p013_10.png] view at source ↗

**Figure 11.** Figure 11: Parameter sweep results for GRAB-ANNS on DEEP96. We vary graph degree, itopk size, and max iterations, and report recall and QPS across range ratios (1%, 10%, 20%, 100%). Conclusion. The experiments confirm that allocating 𝐾𝑙𝑜𝑐𝑎𝑙 ≈ 12 ∼ 16 (roughly 40% − 50% of the total budget) represents the optimal Sweet Spot, balancing local precision with global navigability. 5.6.2 Impact of Search Hyperparameters.… view at source ↗

read the original abstract

Hybrid search, which jointly optimizes vector similarity and structured predicate filtering, has become a fundamental building block for modern AI-driven systems. While recent predicate-aware ANN indices improve filtering efficiency on CPUs, their performance is increasingly constrained by limited memory bandwidth and parallelism. Although GPUs offer massive parallelism and superior memory bandwidth, directly porting CPU-centric hybrid search algorithms to GPUs leads to severe performance degradation due to architectural mismatches, including irregular memory access, branch divergence, and excessive CPU-GPU synchronization. In this paper, we present GRAB-ANNS, a high-throughput, GPU-native graph index for dynamic hybrid search. Our key insight is to rethink hybrid indexing from a hardware-first perspective. We introduce a bucket-based memory layout that transforms range predicates into lightweight bucket selection, enabling coalesced memory accesses and efficient SIMT execution. To preserve global navigability under arbitrary filters, we design a hybrid graph topology that combines dense intra-bucket local edges with sparse inter-bucket remote edges. We further develop an append-only update pipeline that supports efficient batched insertions and parallel graph maintenance on GPUs. Extensive experiments on large-scale datasets show that GRAB-ANNS achieves up to 240.1 times higher query throughput and 12.6 times faster index construction than state-of-the-art CPU-based systems, and up to 10 times higher throughput compared to optimized GPU-native reimplementations, while maintaining high recall.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

GRAB-ANNS gives a GPU-first bucket layout and hybrid graph for filtered vector search with big claimed speedups, but the abstract supplies almost no experimental details so the numbers are hard to trust yet.

read the letter

The main thing to know is that this paper builds a GPU-native index for hybrid search by turning predicates into bucket selections and using a mixed local-dense plus remote-sparse graph. It claims up to 240x query throughput over CPU baselines and 10x over GPU ones while keeping recall high, plus faster builds. That sounds useful for production retrieval if it holds up, but the abstract alone does not let you check the evidence.

Referee Report

2 major / 2 minor

Summary. The paper presents GRAB-ANNS, a GPU-native graph index for dynamic hybrid search combining vector similarity and structured predicate filtering. It introduces a bucket-based memory layout that converts range predicates into lightweight bucket selection for coalesced GPU accesses and SIMT execution. A hybrid graph topology with dense intra-bucket local edges and sparse inter-bucket remote edges is designed to preserve global navigability under arbitrary filters. An append-only update pipeline supports batched insertions and parallel maintenance. Experiments on large-scale datasets claim up to 240.1× higher query throughput and 12.6× faster index construction versus state-of-the-art CPU systems, plus up to 10× throughput over optimized GPU reimplementations, while maintaining high recall.

Significance. If the performance claims and recall guarantees are substantiated, the work would be significant for high-throughput hybrid search in AI-driven database systems. The hardware-first redesign addressing irregular access, divergence, and synchronization issues on GPUs offers a concrete path beyond CPU-centric predicate-aware indices, with potential impact on scalable vector search workloads.

major comments (2)

[§4.2] §4.2 (Hybrid Graph Topology): The central claim that the hybrid topology (dense intra-bucket local edges + sparse inter-bucket remote edges) preserves global navigability and high recall for arbitrary (non-range) structured predicates lacks a connectivity argument or analysis of worst-case bucket sparsity. When a predicate selects a sparse or non-contiguous bucket set, the remote edges may fail to provide sufficient traversal paths, directly undermining both the recall guarantee and the reported 240.1×/10× throughput gains.
[§5] §5 (Experimental Evaluation): The reported speedups (240.1× query throughput, 12.6× construction, 10× vs. GPU baselines) are load-bearing for the contribution but rest on unspecified datasets, predicate workloads (especially arbitrary filters), baseline reimplementations, recall thresholds, and query distributions. Without these details and ablations isolating the hybrid topology's effect, the claims cannot be verified as general.

minor comments (2)

[Abstract] The abstract refers to 'large-scale datasets' without naming them, their cardinalities, or dimensionalities, which would aid immediate assessment of the scale at which the speedups apply.
[§3.1] Clarify the selection and sensitivity of bucket granularity (listed as a free parameter) and its impact on the trade-off between intra-bucket density and inter-bucket sparsity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and the recommendation for major revision. The comments highlight important areas for strengthening the presentation of the hybrid topology and experimental details. We address each point below and will revise the manuscript accordingly.

read point-by-point responses

Referee: [§4.2] §4.2 (Hybrid Graph Topology): The central claim that the hybrid topology (dense intra-bucket local edges + sparse inter-bucket remote edges) preserves global navigability and high recall for arbitrary (non-range) structured predicates lacks a connectivity argument or analysis of worst-case bucket sparsity. When a predicate selects a sparse or non-contiguous bucket set, the remote edges may fail to provide sufficient traversal paths, directly undermining both the recall guarantee and the reported 240.1×/10× throughput gains.

Authors: We acknowledge that §4.2 would benefit from an explicit connectivity argument. The hybrid design ensures navigability by maintaining a minimum density of inter-bucket edges to representative nodes across buckets, preserving the small-world property of the underlying graph. While the current manuscript relies on empirical recall results across filter selectivities to support this, we agree a probabilistic analysis of connectivity under varying bucket sparsity is valuable. We will add a new subsection in §4.2 providing a connectivity argument based on expected edge coverage and a worst-case sparsity analysis assuming uniform bucket distributions. This will be included in the revision. revision: yes
Referee: [§5] §5 (Experimental Evaluation): The reported speedups (240.1× query throughput, 12.6× construction, 10× vs. GPU baselines) are load-bearing for the contribution but rest on unspecified datasets, predicate workloads (especially arbitrary filters), baseline reimplementations, recall thresholds, and query distributions. Without these details and ablations isolating the hybrid topology's effect, the claims cannot be verified as general.

Authors: We agree that §5 requires expanded details for full reproducibility and verification. The manuscript specifies the datasets, predicate types (including arbitrary filters), recall targets, and query workloads, but these can be presented more explicitly. We will revise §5 to include: (1) a detailed table of all datasets and sizes, (2) explicit descriptions of predicate workloads with selectivity ranges for arbitrary filters, (3) precise specifications of baseline reimplementations and optimizations, (4) recall thresholds used (e.g., 0.9+), (5) query distribution parameters, and (6) new ablations comparing the full hybrid topology against intra-bucket-only variants to isolate its contribution. These changes will be made in the revised version. revision: yes

Circularity Check

0 steps flagged

No circularity: performance claims rest on new implementation and experiments

full rationale

The paper presents GRAB-ANNS as a hardware-first GPU-native design with bucket-based memory layout for range predicates and a hybrid graph topology (dense intra-bucket + sparse inter-bucket edges) for arbitrary filters. All central claims—240.1x query throughput, 12.6x faster construction, 10x vs GPU baselines—are tied to empirical results on large-scale datasets rather than any derivation that reduces to its own inputs by construction. No equations, fitted parameters renamed as predictions, self-citation load-bearing premises, or ansatzes smuggled via prior work appear in the abstract or design description. The navigability assumption is treated as an engineering choice validated experimentally, not a self-referential definition.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The design rests on standard GPU execution assumptions and prior ANN graph navigation properties; no new entities are postulated.

free parameters (1)

bucket granularity
Bucket size and structure must be chosen to balance locality and global connectivity; value is not stated in the abstract.

axioms (1)

domain assumption GPU SIMT execution benefits from coalesced memory accesses enabled by bucket layouts
Invoked to justify performance gains; standard assumption in GPU systems work.

pith-pipeline@v0.9.0 · 5570 in / 1260 out tokens · 72219 ms · 2026-05-13T23:44:54.178360+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

49 extracted references · 49 canonical work pages · 2 internal anchors

[1]

Alexandr Andoni, Piotr Indyk, Thijs Laarhoven, Ilya Razenshteyn, and Ludwig Schmidt. 2015. Practical and optimal LSH for angular distance.Advances in neural information processing systems28 (2015)

work page 2015
[2]

Muhammad Arslan, Hussam Ghanem, Saba Munawar, and Christophe Cruz. 2024. A Survey on RAG with LLMs.Procedia computer science246 (2024), 3781–3790

work page 2024
[3]

Dmitry Baranchuk, Artem Babenko, and Yury Malkov. 2018. Revisiting the inverted indices for billion-scale approximate nearest neighbors. InProceedings of the European Conference on Computer Vision (ECCV). 202–216. 14

work page 2018
[4]

Prateek Chhikara, Dev Khant, Saket Aryan, Taranjeet Singh, and Deshraj Ya- dav. 2025. Mem0: Building production-ready ai agents with scalable long-term memory.arXiv preprint arXiv:2504.19413(2025)

work page internal anchor Pith review Pith/arXiv arXiv 2025
[5]

Wenqi Fan, Yujuan Ding, Liangbo Ning, Shijie Wang, Hengyun Li, Dawei Yin, Tat-Seng Chua, and Qing Li. 2024. A survey on rag meeting llms: Towards retrieval-augmented large language models. InProceedings of the 30th ACM SIGKDD conference on knowledge discovery and data mining. 6491–6501

work page 2024
[6]

Siddharth Gollapudi, Neel Karia, Varun Sivashankar, Ravishankar Krishnaswamy, Nikit Begwani, Swapnil Raz, Yiyong Lin, Yin Zhang, Neelam Mahapatro, Premku- mar Srinivasan, Amit Singh, and Harsha Vardhan Simhadri. 2023. Filtered- DiskANN: Graph Algorithms for Approximate Nearest Neighbor Search with Filters. InProceedings of the ACM Web Conference 2023(Aust...

work page doi:10.1145/3543507.3583552 2023
[7]

Long Gong, Huayi Wang, Mitsunori Ogihara, and Jun Xu. 2020. iDEC: indexable distance estimating codes for approximate nearest neighbor search.Proceedings of the VLDB Endowment13, 9 (2020)

work page 2020
[8]

Fabian Groh, Lukas Ruppert, Patrick Wieschollek, and Hendrik P. A. Lensch

work page
[9]

https://doi.org/10.1109/TBDATA.2022.3161156

GGNN: Graph-Based GPU Nearest Neighbor Search.IEEE Transactions on Big Data9, 1 (2023), 267–279. https://doi.org/10.1109/TBDATA.2022.3161156

work page doi:10.1109/tbdata.2022.3161156 2023
[10]

Ruiqi Guo, Philip Sun, Erik Lindgren, Quan Geng, David Simcha, Felix Chern, and Sanjiv Kumar. 2020. Accelerating large-scale inference with anisotropic vector quantization. InInternational Conference on Machine Learning. PMLR, 3887–3896

work page 2020
[11]

Michael E Houle and Michael Nett. 2014. Rank-based similarity search: Reducing the dimensional dependence.IEEE transactions on pattern analysis and machine intelligence37, 1 (2014), 136–150

work page 2014
[12]

Omid Jafari, Parth Nagarkar, and Jonathan Montaño. 2020. mmlsh: A practical and efficient technique for processing approximate nearest neighbor queries on multimedia data. InInternational Conference on Similarity Search and Applications. Springer, 47–61

work page 2020
[13]

Suhas Jayaram Subramanya, Fnu Devvrit, Harsha Vardhan Simhadri, Ravishankar Krishnawamy, and Rohan Kadekodi. 2019. Diskann: Fast accurate billion-point nearest neighbor search on a single node.Advances in neural information pro- cessing Systems32 (2019)

work page 2019
[14]

Hervé Jégou, Matthijs Douze, and Cordelia Schmid. 2011. Product Quantization for Nearest Neighbor Search.IEEE Transactions on Pattern Analysis and Machine Intelligence33, 1 (2011), 117–128

work page 2011
[15]

Jeff Johnson, Matthijs Douze, and Hervé Jégou. 2019. Billion-scale similarity search with GPUs.IEEE transactions on big data7, 3 (2019), 535–547

work page 2019
[16]

Mingjie Li, Ying Zhang, Yifang Sun, Wei Wang, Ivor W Tsang, and Xuemin Lin. 2020. I/O efficient approximate nearest neighbour search based on learned functions. In2020 IEEE 36th international conference on data engineering (ICDE). IEEE, 289–300

work page 2020
[17]

Lin Long, Yichen He, Wentao Ye, Yiyuan Pan, Yuan Lin, Hang Li, Junbo Zhao, and Wei Li. 2025. Seeing, listening, remembering, and reasoning: A multimodal agent with long-term memory.arXiv preprint arXiv:2508.09736(2025)

work page arXiv 2025
[18]

Kejing Lu and Mineichi Kudo. 2020. R2LSH: A nearest neighbor search scheme based on two-dimensional projected spaces. In2020 IEEE 36th International Conference on Data Engineering (ICDE). IEEE, 1045–1056

work page 2020
[19]

Kejing Lu, Hongya Wang, Wei Wang, and Mineichi Kudo. 2020. VHP: approxi- mate nearest neighbor search via virtual hypersphere partitioning.Proceedings of the VLDB Endowment13, 9 (2020), 1443–1455

work page 2020
[20]

Yury Malkov, Alexander Ponomarenko, Andrey Logvinov, and Vladimir Krylov

work page
[21]

Approximate nearest neighbor algorithm based on navigable small world graphs.Information Systems45 (2014), 61–68

work page 2014
[22]

Yu A Malkov and Dmitry A Yashunin. 2018. Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs.IEEE transactions on pattern analysis and machine intelligence42, 4 (2018), 824–836

work page 2018
[23]

Malkov and D

Yu A. Malkov and D. A. Yashunin. 2020. Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs. IEEE Transactions on Pattern Analysis and Machine Intelligence42, 4 (2020), 824–

work page 2020
[24]

https://doi.org/10.1109/TPAMI.2018.2889473

work page doi:10.1109/tpami.2018.2889473 2018
[25]

Marius Muja and David G Lowe. 2014. Scalable nearest neighbor algorithms for high dimensional data.IEEE transactions on pattern analysis and machine intelligence36, 11 (2014), 2227–2240

work page 2014
[26]

OceanBase. 2025. OceanBase English Blog Article. https://en.oceanbase.com/ blog/23848834048. Accessed: 2025-01-24

work page arXiv 2025
[27]

Hiroyuki Ootomo, Akira Naruse, Corey Nolet, Ray Wang, Tamas Feher, and Yong Wang. 2024. CAGRA: Highly Parallel Graph Construction and Approximate Nearest Neighbor Search for GPUs. In2024 IEEE 40th International Conference on Data Engineering (ICDE). 4236–4247. https://doi.org/10.1109/ICDE60146.2024. 00323

work page doi:10.1109/icde60146.2024 2024
[28]

James Jie Pan, Jianguo Wang, and Guoliang Li. 2024. Vector database management techniques and systems. InCompanion of the 2024 International Conference on Management of Data. 597–604

work page 2024
[29]

Liana Patel, Peter Kraft, Carlos Guestrin, and Matei Zaharia. 2024. ACORN: Performant and Predicate-Agnostic Search Over Vector Embeddings and Struc- tured Data.Proc. ACM Manag. Data2, 3, Article 120 (May 2024), 27 pages. https://doi.org/10.1145/3654923

work page doi:10.1145/3654923 2024
[30]

Guru Pramod Rusum and Sunil Anasuri. 2024. Vector Databases in Modern Applications: Real-Time Search, Recommendations, and Retrieval-Augmented Generation (RAG).International Journal of AI, BigData, Computational and Management Studies5, 4 (2024), 124–136

work page 2024
[31]

Mohamed Sarwat, Raha Moraffah, Mohamed F Mokbel, and James L Avery. 2017. Database system support for personalized recommendation applications. In2017 IEEE 33rd International Conference on Data Engineering (ICDE). IEEE, 1320–1331

work page 2017
[32]

Krishna Srinivasan, Karthik Raman, Jianshu Chen, Ankur Bapna, Yuan Cao, and Orhan Firat. 2021. WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning. InProceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2443– 2449

work page 2021
[33]

OceanBase Team. 2025. OceanBase seekdb: AI-Native Hybrid Search Database. https://github.com/oceanbase/seekdb. Accessed: [Replace with actual access date, e.g., 2024-10-01]

work page 2025
[34]

Tencent AI Infra Team, WeChat Infrastructure. 2026. Exploring GPU-Accelerated Vector Search: Engineering Practice of NVIDIA Cagra in WeChat’s Large-Scale Recommendation System. Tencent News (Tencent Technical Engineering Official Account). https://news.qq.com/rain/a/20260320A07E2900 Online; accessed March 27, 2026

work page arXiv 2026
[35]

Jianguo Wang, Xiaomeng Yi, Rentong Guo, Hai Jin, Peng Xu, Shengjun Li, Xi- angyu Wang, Xiangzhou Guo, Chengming Li, Xiaohai Xu, et al. 2021. Milvus: A purpose-built vector data management system. InProceedings of the 2021 international conference on management of data. 2614–2627

work page 2021
[36]

Yuxiang Wang, Ziyuan He, Yongxin Tong, Zimu Zhou, and Yiman Zhong. 2025. Timestamp Approximate Nearest Neighbor Search over High-Dimensional Vec- tor Data. In2025 IEEE 41st International Conference on Data Engineering (ICDE). IEEE Computer Society, 3043–3055

work page 2025
[37]

Chuangxian Wei, Bin Wu, Sheng Wang, Renjie Lou, Chaoqun Zhan, Feifei Li, and Yuanzhe Cai. 2020. AnalyticDB-V: A Hybrid Analytical Engine Towards Query Fusion for Structured and Unstructured Data.Proc. VLDB Endow.13, 12 (2020), 3152–3165

work page 2020
[38]

Wujiang Xu, Zujie Liang, Kai Mei, Hang Gao, Juntao Tan, and Yongfeng Zhang

work page
[39]

A-mem: Agentic memory for llm agents.arXiv preprint arXiv:2502.12110 (2025)

work page internal anchor Pith review Pith/arXiv arXiv 2025
[40]

Xinyao Yi. 2024. A study of performance programming of cpu, gpu accelerated computers and simd architecture.arXiv preprint arXiv:2409.10661(2024)

work page arXiv 2024
[41]

Yuanhang Yu, Dong Wen, Ying Zhang, Lu Qin, Wenjie Zhang, and Xuemin Lin

work page
[42]

In2022 IEEE 38th International Conference on Data Engineering (ICDE)

GPU-accelerated Proximity Graph Approximate Nearest Neighbor Search and Construction. In2022 IEEE 38th International Conference on Data Engineering (ICDE). 552–564. https://doi.org/10.1109/ICDE53745.2022.00046

work page doi:10.1109/icde53745.2022.00046 2022
[43]

Chen Zhao, Wu Gao, Feiping Nie, and Huiyang Zhou. 2021. A survey of GPU multitasking methods supported by hardware architecture.IEEE Transactions on Parallel and Distributed Systems33, 6 (2021), 1451–1463

work page 2021
[44]

Weijie Zhao, Shulong Tan, and Ping Li. 2020. SONG: Approximate Nearest Neighbor Search on GPU. In2020 IEEE 36th International Conference on Data Engineering (ICDE). 1033–1044. https://doi.org/10.1109/ICDE48307.2020.00094

work page doi:10.1109/icde48307.2020.00094 2020
[45]

Xinkui Zhao, Haode Li, Yifan Zhang, Guanjie Cheng, and Yueshen Xu. 2025. TRAIL: Joint Inference and Refinement of Knowledge Graphs with Large Lan- guage Models.arXiv preprint arXiv:2508.04474(2025)

work page arXiv 2025
[46]

Xinkui Zhao, Qingyu Ma, Yifan Zhang, Hengxuan Lou, Guanjie Cheng, Shuiguang Deng, and Jianwei Yin. 2025. AME: An Efficient Heterogeneous Agentic Memory Engine for Smartphones.arXiv preprint arXiv:2511.19192(2025)

work page arXiv 2025
[47]

Dantong Zhu and Minjia Zhang. 2021. Understanding and Generalizing Mono- tonic Proximity Graphs for Approximate Nearest Neighbor Search.arXiv preprint arXiv:2107.13052(2021)

work page arXiv 2021
[48]

Chaoji Zuo, Miao Qiao, Wenchao Zhou, Feifei Li, and Dong Deng. 2024. SeRF: seg- ment graph for range-filtering approximate nearest neighbor search.Proceedings of the ACM on Management of Data2, 1 (2024), 1–26

work page 2024
[49]

Chaoji Zuo, Miao Qiao, Wenchao Zhou, Feifei Li, and Dong Deng. 2024. SeRF: Segment Graph for Range-Filtering Approximate Nearest Neighbor Search.Proc. ACM Manag. Data2, 1, Article 69 (March 2024), 26 pages. https://doi.org/10. 1145/3639324 15

work page 2024