pith. machine review for the scientific record.

arxiv: 2604.19494 · v1 · submitted 2026-04-21 · 💻 cs.DC · cs.OS

Recognition: unknown

DPC: A Distributed Page Cache over CXL

Arash Tavakkol, Giorgio Negro, Ji Zhang, Julien Eudine, Onur Mutlu, Shai Bergman, Zhe Yang


Pith reviewed 2026-05-10 01:23 UTC · model grok-4.3

classification 💻 cs.DC cs.OS
keywords: distributed page cache · CXL · single-copy invariant · remote memory mapping · file system cache · cluster DRAM · data sharing · coherence overhead

The pith

DPC keeps one DRAM copy of each file page across a cluster by using CXL remote mappings instead of replicas.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Modern distributed file systems keep uncoordinated local copies of hot file pages on every node. This creates massive redundancy that wastes aggregate DRAM and forces slow lock-based coherence protocols. DPC instead treats the whole cluster's memory as one cache budget: each page lives in exactly one owner node's DRAM, and other nodes reach it through CXL remote mappings. The design keeps standard file-system interfaces while removing both the duplication and the heavy synchronization. Evaluation on a multi-host CXL emulation shows that this single-copy approach produces large speedups on data-sharing workloads.

Core claim

DPC is an OS-level distributed page cache built on CXL 3.0 memory semantics that enforces a single-copy invariant at page granularity: each file page has exactly one owner node holding the sole resident DRAM copy, and other nodes access it via CXL-based remote mappings rather than creating replicas. The system is implemented end-to-end on a CXL emulation framework that models multi-host fabrics and preserves standard file-system interfaces and semantics. Across real-world and representative data-sharing workloads, DPC delivers speedups of up to 12.4X, with a geometric-mean speedup of 5.6X.
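
To make the single-copy invariant concrete, the sketch below shows one way a page-granularity ownership directory and its lookup could be organized. It is a minimal illustration under assumed names, fields, and a 4 KiB page size, not DPC's actual data structures.

```c
/* Hypothetical sketch of a page-granularity ownership directory.
 * Names, fields, and the 4 KiB page size are illustrative assumptions,
 * not DPC's actual data structures. */
#include <stdint.h>
#include <stddef.h>

enum dpc_page_state { PAGE_INVALID, PAGE_RESIDENT };

struct dpc_dir_entry {
    uint64_t file_id;            /* which file the page belongs to            */
    uint64_t page_index;         /* page offset within that file              */
    uint16_t owner_node;         /* the one node holding the sole DRAM copy   */
    enum dpc_page_state state;   /* invalid vs. resident on the owner         */
    uint64_t remote_off;         /* offset into the shared CXL window         */
};

/* Resolve a page: return a pointer into local DRAM if this node owns it,
 * otherwise a pointer into the CXL remote mapping. No replica is created.
 * (Local slot lookup is simplified for illustration.) */
static void *dpc_resolve(const struct dpc_dir_entry *e,
                         char *local_cache_base,
                         char *cxl_window_base,
                         uint16_t my_node)
{
    if (e->state == PAGE_INVALID)
        return NULL;                                   /* must be faulted in */
    if (e->owner_node == my_node)
        return local_cache_base + e->page_index * 4096; /* sole local copy   */
    return cxl_window_base + e->remote_off;             /* remote access     */
}
```

The point of the sketch is that resolution never allocates a local replica for a remotely owned page: reads and writes go through the CXL mapping, which is what removes both the duplication and the replica-coherence protocol.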

What carries the argument

The single-copy invariant at page granularity: one owner node holds the only DRAM-resident copy of each page, and every other node reaches it through CXL remote mappings.

Load-bearing premise

CXL remote mappings must deliver low enough latency and coherence cost that single-copy remote access beats local replication plus lock-based protocols.
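
A toy cost model makes the trade-off visible. All latency values below are assumptions chosen for illustration, not measurements from the paper; they only show the kind of regime in which single-copy remote access would win.

```c
/* Toy cost model contrasting (a) a replicated cache whose shared writes pay a
 * lock-based coherence round trip with (b) a single-copy design whose accesses
 * pay CXL remote-load latency. All latency values are assumptions for
 * illustration only, not measurements from the paper. */
#include <stdio.h>

int main(void)
{
    const double local_dram_ns = 100.0;    /* assumed local DRAM access      */
    const double cxl_remote_ns = 400.0;    /* assumed CXL remote load        */
    const double lock_rtt_ns   = 15000.0;  /* assumed lock/invalidation RTT  */

    for (double write_frac = 0.0; write_frac <= 0.201; write_frac += 0.02) {
        double replicated  = (1.0 - write_frac) * local_dram_ns
                           + write_frac * (local_dram_ns + lock_rtt_ns);
        double single_copy = cxl_remote_ns;    /* every access goes to owner */
        printf("write_frac=%.2f  replicated=%7.0f ns  single_copy=%5.0f ns\n",
               write_frac, replicated, single_copy);
    }
    return 0;
}
```

Under these invented numbers, the replicated design loses as soon as even a few percent of accesses trigger coherence actions, while the single-copy design pays a flat remote-access cost; whether real CXL 3.0 fabrics sit in that regime is exactly what the premise asserts.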

What would settle it

A measurement on real CXL 3.0 hardware showing that a conventional replicated cache with locks is faster than DPC on the same data-sharing workloads would falsify the performance claim.

Figures

Figures reproduced from arXiv: 2604.19494 by Arash Tavakkol, Giorgio Negro, Ji Zhang, Julien Eudine, Onur Mutlu, Shai Bergman, Zhe Yang.

Figure 1: Baseline architectures. Top: traditional cluster where …
Figure 2: Page-level directory state machine. For each page, a …
Figure 4: The components of the DPC Client.
Figure 5: Experimental setup for CXL emulation, with mem…
Figure 6: Read latency, bandwidth, and IOPS using fio with libaio. Panels: (a) latency, bs=4k, qd=1, jobs=1; (b) bandwidth, bs=128k, qd=32, jobs=8; (c) IOPS, bs=4k, qd=32, … Workloads: CM, CM-R, CH-R. Systems: Virtiofs, NFS, JuiceFS, DPC_SC, DPC.
Figure 7: Read latency, bandwidth, and IOPS using fio with mmap. Panels: (a) latency, bs=4k, qd=1, jobs=1; (b) bandwidth, bs=128k, qd=32, jobs=8; (c) IOPS, bs=4k, … Workloads: CM, CM-R, CH-R. Systems: Virtiofs, NFS, JuiceFS, DPC_SC, DPC.
Figure 8: Write latency, bandwidth, and IOPS using …
Figure 9: Write latency, bandwidth, and IOPS using …
Figure 10: Relative speedup of application benchmarks over single-node Virtiofs. For multi-node configurations, bars report …
Original abstract

Modern distributed file systems rely on uncoordinated, per node page caches that replicate hot data locally across the cluster. While ensuring fast local access, this architecture underutilizes aggregate cluster DRAM capacity through massive data redundancy and incurs prohibitive coherence overhead via heavyweight, lock-based protocols. In this paper, we focus on the design of a distributed page cache that treats the entire cluster's main memory as a single cache budget while preserving standard file-system interfaces and semantics. We present Distributed Page Cache (DPC), an OS-level, distributed page cache built on top of Compute Express Link (CXL) 3.0 memory semantics. DPC enforces a single-copy invariant at page granularity: each file page has exactly one owner node holding the sole resident DRAM copy, and other nodes access it via CXL-based remote mappings rather than creating replicas of the page. DPC is implemented end-to-end on a CXL-based emulation framework that models multi-host CXL 3.0 memory fabrics, enabling detailed evaluation in the absence of widespread hardware. Across real-world and representative data-sharing workloads, DPC delivers speedups of up to 12.4X, with a geometric-mean speedup of 5.6X.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper presents Distributed Page Cache (DPC), an OS-level system built on CXL 3.0 memory semantics that enforces a single-copy invariant for file pages across a cluster: each page has exactly one owner node holding the sole DRAM resident copy, with other nodes accessing it via remote mappings rather than local replicas. This replaces uncoordinated per-node caches and lock-based coherence in traditional distributed file systems. The design is implemented and evaluated end-to-end on a CXL-based multi-host emulation framework, with reported speedups of up to 12.4X (geometric mean 5.6X) across real-world and representative data-sharing workloads.

Significance. If the emulation results hold under real CXL 3.0 fabrics, DPC could meaningfully improve aggregate DRAM utilization and reduce coherence costs in distributed systems by eliminating page replication. The empirical evaluation on an emulation platform provides concrete evidence for the viability of single-copy remote access in this setting, which may inform future hardware-software co-design for CXL-enabled clusters.

major comments (2)
  1. [Evaluation] Evaluation section: the headline speedups (12.4X max, 5.6X geo-mean) are reported without details on baseline implementations (e.g., how lock-based coherence or replication is realized), workload selection criteria, number of runs, or error bars. This directly affects auditability of the central performance claim.
  2. [Emulation Framework] Emulation framework description: the single-copy invariant's advantage over local replication rests on the assumption that CXL 3.0 remote mappings deliver sufficiently low latency and coherence overhead; no validation of the emulator against real multi-host CXL hardware, sensitivity analysis on latency/bandwidth parameters, or comparison to alternative models is provided, which is load-bearing for the reported speedups.
minor comments (2)
  1. [Abstract] Abstract and introduction: the description of 'real-world and representative data-sharing workloads' could be expanded with concrete names or characteristics to allow readers to assess representativeness.
  2. [Design] The paper would benefit from a clearer statement of the exact file-system interfaces and semantics that are preserved under the single-copy model.
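
To illustrate what a preserved interface would mean in practice: under the single-copy model the application view should stay ordinary POSIX, with page placement (local DRAM versus a CXL remote mapping) decided below the file-system boundary. The sketch below assumes a hypothetical DPC mount point and is not specific to the paper's implementation.

```c
/* Minimal sketch of the unchanged application view: ordinary POSIX calls,
 * with page placement (local DRAM vs. CXL remote mapping) decided below the
 * file-system interface. The mount path is hypothetical. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/mnt/dpc/shared/data.bin", O_RDONLY); /* hypothetical mount */
    if (fd < 0) { perror("open"); return 1; }

    char buf[4096];
    ssize_t n = read(fd, buf, sizeof buf);                /* ordinary read     */
    if (n <= 0) { perror("read"); close(fd); return 1; }

    void *p = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0); /* ordinary mmap */
    if (p == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    printf("first byte via read: %d, via mmap: %d\n", buf[0], ((char *)p)[0]);
    munmap(p, 4096);
    close(fd);
    return 0;
}
```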

Simulated Authors' Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major point below and will revise the paper to improve clarity and auditability where possible.

point-by-point responses
  1. Referee: [Evaluation] Evaluation section: the headline speedups (12.4X max, 5.6X geo-mean) are reported without details on baseline implementations (e.g., how lock-based coherence or replication is realized), workload selection criteria, number of runs, or error bars. This directly affects auditability of the central performance claim.

    Authors: We agree that additional details are required for full auditability. In the revised manuscript we will expand the Evaluation section with explicit descriptions of how the lock-based coherence and replication baselines are implemented, the criteria used to select the real-world and representative data-sharing workloads, the number of runs performed for each experiment, and error bars on all reported results. revision: yes

  2. Referee: [Emulation Framework] Emulation framework description: the single-copy invariant's advantage over local replication rests on the assumption that CXL 3.0 remote mappings deliver sufficiently low latency and coherence overhead; no validation of the emulator against real multi-host CXL hardware, sensitivity analysis on latency/bandwidth parameters, or comparison to alternative models is provided, which is load-bearing for the reported speedups.

    Authors: We acknowledge that direct validation on real multi-host CXL 3.0 hardware is not possible at present. Our emulation framework is derived from the CXL 3.0 specification and prior published emulation studies. In the revision we will add a sensitivity analysis over realistic latency and bandwidth ranges together with a comparison to alternative emulation models to strengthen the supporting evidence for the reported speedups. revision: partial
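
For concreteness, a sensitivity sweep of the kind promised in this response could take the shape sketched below; the latency and bandwidth points and the crude per-page transfer model are assumptions for illustration, not the authors' emulator parameters.

```c
/* Illustrative sensitivity sweep over assumed CXL latency/bandwidth points.
 * The values and the crude per-page transfer model are assumptions, not the
 * authors' emulator parameters. */
#include <stddef.h>
#include <stdio.h>

int main(void)
{
    const double lat_ns[]  = { 250, 400, 600, 800, 1000 };  /* assumed points */
    const double bw_gbps[] = { 8, 16, 32 };                  /* assumed points */
    const double page_bytes = 4096.0;

    for (size_t i = 0; i < sizeof lat_ns / sizeof lat_ns[0]; i++) {
        for (size_t j = 0; j < sizeof bw_gbps / sizeof bw_gbps[0]; j++) {
            /* Crude per-page cost: fixed latency plus serialization time. */
            double xfer_ns = lat_ns[i] + page_bytes / (bw_gbps[j] * 1e9) * 1e9;
            printf("lat=%4.0f ns  bw=%2.0f GB/s  -> per-page ~%5.0f ns\n",
                   lat_ns[i], bw_gbps[j], xfer_ns);
        }
    }
    return 0;
}
```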

standing simulated objections (not resolved)
  • Direct validation of the emulation framework against real multi-host CXL 3.0 hardware, which is not yet commercially available.

Circularity Check

0 steps flagged

No circularity: empirical system evaluation with independent measurements

full rationale

The paper presents a system design for a distributed page cache using CXL 3.0, enforces a single-copy invariant via remote mappings, implements it on an emulation framework, and reports measured speedups (up to 12.4X, geo-mean 5.6X) across workloads. No mathematical derivations, equations, fitted parameters, or self-citation chains exist that reduce any claim to its own inputs by construction. Performance results rest on runtime benchmarks rather than self-referential definitions or renamings of prior results. This is a standard empirical systems paper whose central claims are falsifiable via external hardware measurements.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The design rests on standard assumptions about OS page-cache semantics and CXL memory semantics rather than new mathematical axioms or invented physical entities.

axioms (1)
  • domain assumption: CXL 3.0 memory semantics provide coherent, low-overhead remote page mappings that preserve file-system correctness
    Invoked when the paper states that remote mappings replace local replicas while keeping standard interfaces.
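
One plausible, but assumed, realization of this axiom on Linux is mapping a CXL-backed devdax region so that ordinary loads and stores reach the single resident copy; the device path and window size below are illustrative, and whether DPC uses devdax specifically is not established by this page.

```c
/* Hedged sketch: mapping a CXL-backed Linux devdax device so that plain
 * loads/stores reach remotely owned pages. The device path and window size
 * are illustrative; DPC's actual mapping mechanism is an assumption here. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define CXL_WINDOW_BYTES (1UL << 30)   /* assume a 1 GiB shared window */

int main(void)
{
    int fd = open("/dev/dax0.0", O_RDWR);          /* CXL-backed dax device */
    if (fd < 0) { perror("open /dev/dax0.0"); return 1; }

    void *win = mmap(NULL, CXL_WINDOW_BYTES, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);
    if (win == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    /* An ordinary load through the mapping; on a real fabric this would hit
     * the sole resident copy in shared CXL memory rather than a local replica. */
    volatile unsigned char *p = win;
    printf("first byte of shared window: %u\n", (unsigned)p[0]);

    munmap(win, CXL_WINDOW_BYTES);
    close(fd);
    return 0;
}
```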

pith-pipeline@v0.9.0 · 5526 in / 1338 out tokens · 33353 ms · 2026-05-10T01:23:22.811206+00:00 · methodology

discussion (0)

