pith. machine review for the scientific record.

arxiv: 2604.20401 · v1 · submitted 2026-04-22 · 💻 cs.CR · cs.AI

Recognition: unknown

Onyx: Cost-Efficient Disk-Oblivious ANN Search

Deevashwer Rathee, G. Edward Suh, Jean-Luc Watson, Raluca Ada Popa, Zirui Neil Zhao

Authors on Pith no claims yet

Pith reviewed 2026-05-10 00:18 UTC · model grok-4.3

classification 💻 cs.CR cs.AI
keywords: approximate nearest neighbor search, oblivious RAM, trusted execution environments, disk access patterns, cost-efficient search, ANN pruning, ORAM tree design

The pith

Swapping the optimization targets, so that the ANN layer cuts bandwidth and the ORAM layer cuts access count, yields up to 9.9× lower cost for disk-oblivious search.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that pairing ANN search with ORAM for SSDs becomes inefficient when ANN minimizes access count and ORAM minimizes bandwidth, because each layer is poorly matched to that task. It instead assigns bandwidth minimization to the ANN layer and access-count minimization to the ORAM layer. The ANN component uses a compact intermediate representation to drop most high-bandwidth accesses early while keeping recall intact. The ORAM component uses a locality-aware shallow tree to lower the number of accesses while still working with bandwidth-efficient ORAM schemes. If correct, this inversion makes private similarity search on untrusted external storage practical for large AI workloads that must protect query patterns.

Core claim

Onyx inverts prior ORAM-ANN designs by minimizing bandwidth consumption in the ANN layer and access count in the ORAM layer. Onyx-ANNS achieves the first goal with a compact intermediate representation that proactively prunes the majority of bandwidth-intensive accesses without meaningfully reducing recall. Onyx-ORAM achieves the second goal with a locality-aware shallow tree that reduces access count while remaining compatible with bandwidth-efficient ORAM primitives. The resulting system delivers 1.7-9.9× lower cost and 2.3-12.3× lower latency than the state-of-the-art oblivious ANN search system.
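The pruning idea in the core claim can be illustrated with a generic two-stage search: rank candidates using a small per-vector summary, then fetch full vectors only for the survivors. This is a minimal sketch, not the paper's construction. The hint used here (the first few coordinates of each vector), the function names, and the `keep` parameter are all assumptions made for illustration.

```python
import math
import random


def l2(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def make_hint(vec, hint_dims):
    # Hypothetical compact representation: keep only the first hint_dims
    # coordinates. The paper's actual representation is not described here.
    return vec[:hint_dims]


def search_with_pruning(query, full_vectors, hints, hint_dims, k, keep):
    """Return (top-k ids, number of full-vector reads performed)."""
    q_hint = make_hint(query, hint_dims)
    # Stage 1: rank all candidates using only the small hints (cheap reads).
    ranked = sorted(hints, key=lambda i: l2(q_hint, hints[i]))
    survivors = ranked[:keep]
    # Stage 2: read full vectors only for the survivors (expensive reads).
    full_reads = 0
    scored = []
    for i in survivors:
        full_reads += 1
        scored.append((l2(query, full_vectors[i]), i))
    scored.sort()
    return [i for _, i in scored[:k]], full_reads


if __name__ == "__main__":
    rng = random.Random(0)
    vectors = {i: [rng.random() for _ in range(16)] for i in range(200)}
    hints = {i: make_hint(v, 4) for i, v in vectors.items()}
    query = [rng.random() for _ in range(16)]
    ids, reads = search_with_pruning(query, vectors, hints, 4, k=5, keep=40)
    print("top-5 ids:", ids, "full-vector reads:", reads, "of", len(vectors))
```

In this toy setting, `keep` controls the trade-off: a smaller value saves more full-vector reads but risks discarding a true neighbor during stage 1, which is the recall risk the review highlights.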

What carries the argument

The inverted assignment of goals: bandwidth minimization performed by a compact pruning representation inside the ANN layer, paired with access-count minimization performed by a locality-aware shallow tree inside the ORAM layer.
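A back-of-the-envelope model shows why both resources matter at once. The sketch below assumes an SSD caps throughput independently by operations per second and by bytes per second, so the achievable query rate is the smaller of the two bounds. The device figures and per-query costs are hypothetical and are not taken from the paper.

```python
def max_qps(ssd_iops, ssd_bytes_per_sec, accesses_per_query, bytes_per_query):
    """Return (query-rate upper bound, name of the binding resource)."""
    iops_bound = ssd_iops / accesses_per_query
    bandwidth_bound = ssd_bytes_per_sec / bytes_per_query
    if iops_bound <= bandwidth_bound:
        return iops_bound, "IOPS"
    return bandwidth_bound, "bandwidth"


if __name__ == "__main__":
    # Hypothetical device: 1M IOPS and 4 GB/s.
    iops, bandwidth = 1_000_000, 4_000_000_000
    # Hypothetical design with few accesses but many bytes per query.
    print(max_qps(iops, bandwidth, accesses_per_query=500, bytes_per_query=4_000_000))
    # Hypothetical design with the same accesses but fewer bytes per query.
    print(max_qps(iops, bandwidth, accesses_per_query=500, bytes_per_query=1_000_000))
```

Under this model, lowering only one of the two per-query costs stops helping once the other resource becomes the binding limit, which is why assigning each reduction to the layer best able to deliver it is the crux of the argument.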

If this is right

  • Secure ANN search over external SSDs inside TEEs becomes viable at commercial cost and latency levels.
  • Cloud providers can host privacy-sensitive similarity search without exposing query patterns to the host OS.
  • ANN indexes can be restructured to trade early filtering accuracy for lower bandwidth while still meeting recall targets.
  • ORAM constructions gain headroom when paired with data layouts that already reduce logical access count.
  • Deployment of large-scale private vector databases on commodity hardware no longer requires prohibitive SSD over-provisioning.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same priority swap could be tested on other oblivious data structures such as range trees or graph indices where one layer tolerates approximation and the other benefits from locality.
  • Integration with specific TEEs like Intel SGX or AMD SEV might require only minor adjustments to the shallow-tree layout to preserve ORAM compatibility.
  • Real hardware traces from production SSDs could be used to measure whether the access-count savings persist when page sizes and queue depths vary.
  • If the pruning representation generalizes, it might reduce bandwidth in non-oblivious ANN systems as well, improving baseline performance before ORAM is added.

Load-bearing premise

The compact intermediate representation can discard most bandwidth-heavy accesses without lowering final recall, and the shallow locality-aware tree stays compatible with efficient ORAM under realistic SSD workloads.

What would settle it

An experiment on standard ANN benchmarks that shows recall falling below the target threshold when the pruning representation is applied, or a workload trace demonstrating that the reduced access count fails to lower end-to-end latency once ORAM protocol costs are included.
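The recall half of that test is straightforward to express. The sketch below uses the common recall@k definition (the fraction of the true k nearest neighbors that appear in the returned top k) and checks a mean against a target. The function names and the 0.95 target in the demo are illustrative assumptions, not values taken from the paper.

```python
def recall_at_k(retrieved, ground_truth, k):
    """Fraction of the true top-k neighbors present in the retrieved top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    found = set(retrieved[:k]) & set(ground_truth[:k])
    return len(found) / k


def meets_target(per_query_recalls, target):
    """True if the mean recall over all queries reaches the target."""
    mean = sum(per_query_recalls) / len(per_query_recalls)
    return mean >= target


if __name__ == "__main__":
    # Toy example: ids returned with pruning vs. exact nearest neighbors.
    with_pruning = [[1, 2, 3, 4], [9, 8, 7, 6]]
    exact = [[1, 2, 5, 6], [9, 8, 7, 6]]
    recalls = [recall_at_k(r, g, 4) for r, g in zip(with_pruning, exact)]
    print(recalls, meets_target(recalls, 0.95))
```

Running the same check with pruning disabled and enabled, over the same queries, would isolate how much recall the pruning step itself costs.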

Figures

Figures reproduced from arXiv: 2604.20401 by Deevashwer Rathee, G. Edward Suh, Jean-Luc Watson, Raluca Ada Popa, Zirui Neil Zhao.

Figure 1: Prior designs for oblivious ANNS and Onyx. Compass [131] suffers from poor performance due to high network overhead, and mirroring its design into a TEE yields a system that needs significant CPU and SSD resources to sustain reasonable performance. Onyx proposes co-designed primitives for ANNS and ORAM that jointly improve resource utilization, yielding high performance and low cost.
Figure 2: I/O overhead design space for ORAM schemes at 𝐵 = 512 bytes. Gray iso-throughput contours show the maximum I/O multipliers that our evaluation SSD (§ 6.1) can theoretically sustain at a given throughput. Arrows trace Onyx-ORAM's step-by-step improvements from RingORAM, annotated with the factor change in each multiplier. … which for 𝑑 = 8 is ≈ 1. As a result, this step yields a 3× access count reduction wit…
Figure 3: Onyx-ORAM vs. prior work for 20M blocks (1-SSD). LBM (local bucket metadata), FBR (full bucket reads), and ST (shallow tree) refer to steps in § 4.3. Onyx: Ring+LBM+FBR+ST-8; Ring: RingORAM; Path: locality-optimized PathORAM. … around 2K QPS because the high access count per logical operation saturates the SSD's IOPS. On the other hand, PathORAM is bandwidth-bound: its throughput degrades with increasing block…
Figure 4: (Top) Pareto frontier of throughput vs. latency of Onyx and baselines (1-SSD; top-left is better). Dots below the curves represent sub-optimal parameterizations; a single marker denotes a scheme whose one configuration is optimal for both latency and throughput. (Bottom) Throughput vs. recall of Onyx and baselines (1-SSD). Panels: (a) SIFT, (b) MS-MARCO, (c) WIKI, (d) DEEP.
Figure 5: Pareto frontier of cost-normalized throughput (Queries/$) vs. latency for Onyx and baselines, highlighting the most cost-effective configurations (vCPU and SSD resources allocated). Better configurations are towards the top-left. … blocks). This overhead is more noticeable on datasets with smaller hints (e.g., DEEP), where ORAM client state is a larger fraction of total memory usage. Compass-in-TEE has the h…
Figure 6: Importance of ORAM-ANN co-design: Pareto frontier of throughput vs. latency for Onyx and single-component (Onyx-ORAM/ANN) constructions. Better configurations are towards the top-left; single markers indicate a scheme's latency- and throughput-optimal configuration. … primary bottleneck: Onyx-ORAM + Disk approaches Onyx (10–20% lower performance), while replacing Onyx-ORAM with RingORAM or PathORAM degrades…
Figure 7: complements …
Figure 8: (Top) Pareto frontier of throughput vs. latency of Onyx and baselines (4-SSD; top-left is better). Dots below the curves represent sub-optimal parameterizations; a single marker denotes a scheme whose one configuration is optimal for both latency and throughput. (Bottom) Throughput vs. recall of Onyx and baselines (4-SSD). Axes: throughput (QPS), latency (ms). Panels: (a) SIFT, (b) MS-MARCO, (c) WIKI, (d) DEEP.
Figure 9: ORAM-ANN co-design ablation (4-SSD). Pareto frontier of throughput vs. latency for Onyx and single-component (Onyx-ORAM and Onyx-ANN) constructions. Better configurations are towards the top-left. … with large vectors (MS-MARCO 4.5×, WIKI 2.8×) where the bandwidth reduction is largest (3.8–5.0×, § 5.4), but Onyx-ANNS also improves on smaller-vector datasets (SIFT 1.2×, DEEP 1.1×). Without increasing memory f…
Figure 10: Onyx-ANNS ablation (1-SSD). Each curve varies the search queue depth (smallest QD value); other labeled values are pruning hint sizes. Pareto frontiers highlight the best-performing configurations for each approach. … FreshDiskANN Update Primitives. FreshDiskANN's dynamic updates are built from two primitives: GreedySearch and RobustPrune. At a high level, GreedySearch traverses the graph to find a candida…
Figure 11: Security game for disk-access privacy for a disk-resident ANN search system ANNS. … Hybrid $H_2$: replace ciphertexts with encryptions of zeros. Identical to $H_1$ except that every ciphertext written to disk encrypts a fixed string $0^B$ (where $B$ is the slot size) instead of the real payload (addrs, data). The challenger now maintains the entire ORAM tree on disk internally, so that it can follow the same access …

Original abstract

Approximate nearest neighbor (ANN) search in AI systems increasingly handles sensitive data on third-party infrastructure. Trusted execution environments (TEEs) offer protection, but cost-efficient deployments must rely on external SSDs, which leaks user queries through disk access patterns to the host. Oblivious RAM (ORAM) can hide these access patterns but at a high cost; when paired with existing disk-based ANN search techniques, it makes poor use of SSD resources, yielding high latency and poor cost-efficiency. The core challenge for efficient oblivious ANN search over SSDs is balancing both bandwidth and access count. The state-of-the-art ORAM-ANN design minimizes access count at the ANN level and bandwidth at the ORAM level, each trading-off the other, leaving the combined system with both resources overutilized. We propose inverting this design, minimizing bandwidth consumption in the ANN layer and access count in the ORAM layer, since each component is better suited for its new role: ANN's inherent approximation allows for more bandwidth efficiency, while ORAM has no fundamental lower bounds on access count (as opposed to bandwidth). To this end, we propose a cost-efficient approach, Onyx, with two new co-designed components: Onyx-ANNS introduces a compact intermediate representation that proactively prunes the majority of bandwidth-intensive accesses without hurting recall, and Onyx-ORAM proposes a locality-aware shallow tree design that reduces access count while remaining compatible with bandwidth-efficient ORAM techniques. Compared to the state-of-the-art oblivious ANN search system, Onyx achieves $1.7-9.9\times$ lower cost and $2.3-12.3\times$ lower latency.
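One way to see why a shallower tree lowers access count: in tree-based ORAMs such as Path ORAM, a logical access touches one bucket per level along a root-to-leaf path, so fewer levels mean fewer bucket accesses. The sketch below counts levels for a generic tree with a given fan-out. Treating the shallow tree as a plain fan-out-8 tree is an assumption made for illustration, and the leaf count is hypothetical.

```python
def tree_levels(num_leaves, fanout):
    """Number of levels (root included) needed to reach num_leaves leaves."""
    if num_leaves < 1 or fanout < 2:
        raise ValueError("need num_leaves >= 1 and fanout >= 2")
    levels = 1    # a tree with only the root has one level
    capacity = 1  # leaves addressable at the current depth
    while capacity < num_leaves:
        capacity *= fanout
        levels += 1
    return levels


if __name__ == "__main__":
    leaves = 2 ** 24  # hypothetical leaf count
    for fanout in (2, 8):
        print(f"fan-out {fanout}: {tree_levels(leaves, fanout)} buckets per path")
```

For this leaf count, the binary tree has 25 levels and the fan-out-8 tree has 9. The sketch ignores bucket sizes, eviction, and stash behavior, all of which affect the real bandwidth and security trade-offs.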

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents Onyx, a system for cost-efficient disk-oblivious approximate nearest neighbor (ANN) search over SSDs in trusted execution environments. It identifies that prior ORAM-ANN designs overutilize both bandwidth and access count by minimizing access count at the ANN level and bandwidth at the ORAM level. Onyx inverts the targets: Onyx-ANNS uses a compact intermediate representation to proactively prune the majority of bandwidth-intensive accesses without hurting recall, while Onyx-ORAM uses a locality-aware shallow tree to reduce access count while remaining compatible with bandwidth-efficient ORAM primitives. The paper reports empirical gains of 1.7-9.9× lower cost and 2.3-12.3× lower latency versus the state-of-the-art oblivious ANN search system.

Significance. If the empirical results and underlying assumptions hold under realistic workloads, this work would be significant for practical deployment of secure ANN search on untrusted third-party infrastructure. The co-design that leverages ANN approximation for bandwidth savings and ORAM flexibility for access-count reduction addresses a key resource trade-off in disk-oblivious systems and could enable more cost-effective TEE-based AI services.

major comments (2)
  1. The central performance claims depend on Onyx-ANNS's compact intermediate representation pruning the majority of bandwidth-intensive accesses while keeping recall essentially unchanged. The manuscript must supply concrete recall targets, workload descriptions, ablation results on pruning aggressiveness, and error bars to demonstrate that recall degradation is negligible rather than assumed.
  2. Onyx-ORAM's locality-aware shallow tree is claimed to cut access count without violating the bandwidth efficiency or security of the underlying ORAM primitive on realistic SSD hardware. The evaluation should include access-pattern analysis, SSD-specific microbenchmarks, and direct comparison showing that the shallow tree does not increase bandwidth consumption or introduce new leakage under the tested configurations.
minor comments (2)
  1. The abstract and introduction should explicitly name the state-of-the-art baseline system being compared against, rather than referring to it generically.
  2. Notation for the compact intermediate representation and shallow tree parameters should be defined consistently in the design sections with clear mappings to the claimed resource savings.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and will incorporate the requested details into the revised manuscript to better substantiate our claims.

  1. Referee: The central performance claims depend on Onyx-ANNS's compact intermediate representation pruning the majority of bandwidth-intensive accesses while keeping recall essentially unchanged. The manuscript must supply concrete recall targets, workload descriptions, ablation results on pruning aggressiveness, and error bars to demonstrate that recall degradation is negligible rather than assumed.

    Authors: We agree that these supporting details are necessary. In the revision we will add explicit recall targets (e.g., 0.95 and 0.99), full workload descriptions drawn from standard ANN benchmarks (SIFT, GloVe, etc.), ablation tables varying the pruning aggressiveness parameter, and error bars computed over repeated runs to confirm that recall remains essentially unchanged. revision: yes

  2. Referee: Onyx-ORAM's locality-aware shallow tree is claimed to cut access count without violating the bandwidth efficiency or security of the underlying ORAM primitive on realistic SSD hardware. The evaluation should include access-pattern analysis, SSD-specific microbenchmarks, and direct comparison showing that the shallow tree does not increase bandwidth consumption or introduce new leakage under the tested configurations.

    Authors: We will expand the evaluation with access-pattern traces for the shallow tree, SSD microbenchmarks on representative hardware, and side-by-side comparisons against the baseline ORAM that quantify bandwidth usage and confirm the absence of new leakage channels. These results will be added to the revised manuscript. revision: yes
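The error bars requested in the first comment could be computed as below: a minimal sketch that summarizes recall over repeated runs as mean, sample standard deviation, and standard error. The function name and the run values are hypothetical.

```python
import math
import statistics


def summarize_runs(recalls):
    """Return (mean, sample standard deviation, standard error) of recall."""
    mean = statistics.mean(recalls)
    # The sample standard deviation needs at least two runs.
    std = statistics.stdev(recalls) if len(recalls) > 1 else 0.0
    stderr = std / math.sqrt(len(recalls))
    return mean, std, stderr


if __name__ == "__main__":
    runs = [0.951, 0.948, 0.953, 0.950, 0.949]  # hypothetical recall per run
    mean, std, stderr = summarize_runs(runs)
    print(f"recall = {mean:.4f} ± {std:.4f} (std), standard error {stderr:.4f}")
```

Reporting the spread alongside the mean makes it possible to tell whether a small recall gap between pruned and unpruned search exceeds run-to-run noise.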

Circularity Check

0 steps flagged

No significant circularity detected

The paper presents an empirical systems design for disk-oblivious ANN search, introducing Onyx-ANNS (compact IR pruning) and Onyx-ORAM (locality-aware shallow tree) to invert prior minimization targets. Performance claims (1.7-9.9× cost, 2.3-12.3× latency reductions) are framed as results of experimental comparisons to state-of-the-art systems rather than any mathematical derivation, prediction, or fitted parameter. No equations, self-definitional steps, fitted-input predictions, or load-bearing self-citations appear in the text. The design rationale relies on layer-specific suitability arguments and external benchmarks, keeping the central claims independent of the inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based on the abstract alone, the design rests on standard assumptions from ANN search (approximation tolerance) and ORAM (security definitions) without introducing new free parameters, axioms, or invented entities that are explicitly quantified.

pith-pipeline@v0.9.0 · 5620 in / 1142 out tokens · 26844 ms · 2026-05-10T00:18:42.531710+00:00 · methodology
