pith. machine review for the scientific record.

arxiv: 2604.06682 · v2 · submitted 2026-04-08 · 💻 cs.DC · cs.OS

Recognition: 2 theorem links

· Lean Theorem

Nexus: Transparent I/O Offloading for High-Density Serverless Computing

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 18:31 UTC · model grok-4.3

classification 💻 cs.DC cs.OS
keywords serverless computing · I/O offloading · KVM hypervisor · high-density deployment · transparent offloading · virtual machines · FaaS performance

The pith

By offloading I/O processing to a shared host backend, Nexus cuts node-level CPU use by up to 44% and memory by up to 31% while preserving full ecosystem compatibility.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Serverless platforms rely on VMs to isolate many small functions, but this forces each VM to carry a full copy of the cloud SDK, RPC layer, and network stack. Nexus intercepts calls right at the function's API boundary and redirects them through zero-copy shared memory to a single always-on host service. The change frees VMs from duplicating I/O logic, cuts overall node resource use, and supports overlapping data movement with VM startup. Because the offload happens below the programming model, existing code and dependencies continue to work unchanged. The measured gains include higher function density per server and faster response times.
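
To make the interception concrete, here is a minimal, self-contained Python sketch under stated assumptions: a storage client exposing the same put/get surface the function already uses forwards each call to a host-resident backend instead of running the SDK, RPC, and TCP machinery inside the VM. The names (HostBackendChannel, StorageClient) are illustrative only, and an in-process object stands in for the zero-copy shared-memory channel; this is not Nexus's actual interface.

```python
import json


class HostBackendChannel:
    """Stand-in for the guest-to-host channel plus the always-on host backend.

    The real system would pass request descriptors through zero-copy shared
    memory and run the cloud SDK / RPC / TCP stack once per node, on the host.
    """

    def __init__(self):
        self._remote_store = {}  # host-side view of remote storage

    def call(self, op, key, payload=None):
        if op == "PUT":
            self._remote_store[key] = payload
            return None
        if op == "GET":
            return self._remote_store.get(key)
        raise ValueError(f"unsupported op: {op}")


class StorageClient:
    """Drop-in shim with the same surface the function code already calls,
    so application code and its dependencies stay unchanged."""

    def __init__(self, channel):
        self._channel = channel

    def put(self, key, obj):
        self._channel.call("PUT", key, json.dumps(obj).encode())

    def get(self, key):
        raw = self._channel.call("GET", key)
        return json.loads(raw) if raw is not None else None


# The function is oblivious to the offload: it still just calls put()/get().
storage = StorageClient(HostBackendChannel())
storage.put("invocation/123/output", {"status": "ok"})
print(storage.get("invocation/123/output"))
```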

Core claim

Nexus is a KVM-based hypervisor that transparently decouples compute from I/O by intercepting the communication fabric at the API boundary and offloading it to an always-on host shared backend via zero-copy shared memory. This removes the heavyweight communication fabric from the guest VM while preserving the conventional serverless programming model. The structural separation enables asynchronous I/O optimizations such as overlapping input payload prefetching with VM restoration from a snapshot and writing output payloads back to storage off the critical path.
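
A minimal sketch of the overlap this separation enables, assuming placeholder coroutines for snapshot restoration, input prefetching, and deferred output write-back; the sleeps are arbitrary stand-ins, not measurements from the paper.

```python
import asyncio


async def restore_vm_from_snapshot():
    await asyncio.sleep(0.05)  # stand-in for snapshot restoration work
    return "vm-ready"


async def prefetch_input_payload():
    await asyncio.sleep(0.03)  # stand-in for fetching the input payload
    return b"input-bytes"


async def write_output_async(result):
    await asyncio.sleep(0.04)  # host backend persists the output later
    return "written"


async def invoke():
    # Restore and prefetch run concurrently instead of back-to-back, so the
    # input is already resident when the VM is ready.
    vm_state, payload = await asyncio.gather(restore_vm_from_snapshot(),
                                             prefetch_input_payload())
    assert vm_state == "vm-ready"
    result = payload.upper()  # the function's compute step
    writer = asyncio.create_task(write_output_async(result))
    return result, writer     # respond before the write-back completes


async def main():
    result, writer = await invoke()
    print("responded with", result)
    print("write-back finished:", await writer)


asyncio.run(main())
```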

What carries the argument

API-boundary interception with zero-copy shared-memory offload to a host-resident I/O backend

If this is right

  • Node-level CPU consumption drops by up to 44% compared with the production baseline.
  • Memory consumption drops by up to 31%, which increases deployment density by 37% (a quick arithmetic check of this figure follows the list).
  • Warm-start latency falls 39% and cold-start latency falls 10%.
  • End-to-end response time stays within 20% of that of a WebAssembly-based but ecosystem-incompatible hypervisor.
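
As a quick arithmetic check of the density bullet above, the gain follows from the sustained-function counts quoted in the Figure 7 caption further down (Baseline sustains 320 deployed functions within the SLO, Nexus-TCP and Nexus-Async sustain 380, and full Nexus sustains 440):

```python
# Density gains implied by the sustained-function counts in the Figure 7 caption.
baseline = 320
for system, sustained in [("Nexus-TCP / Nexus-Async", 380), ("Nexus", 440)]:
    gain = (sustained - baseline) / baseline
    print(f"{system}: {sustained} functions -> +{gain:.1%} density vs. Baseline")
# Yields +18.8% and +37.5%, in line with the reported 18% and 37% gains.
```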

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The offload approach could be applied to other hypervisor-based serverless systems that currently duplicate network and RPC stacks inside every tenant.
  • Placing the shared I/O logic in one host component may simplify performance tuning and security monitoring across all functions.
  • Further gains might come from letting the host backend batch or cache requests that cross multiple functions.

Load-bearing premise

Intercepting the communication fabric at the API boundary can be performed transparently and securely inside a multi-tenant KVM environment without introducing new bottlenecks, compatibility breaks, or isolation violations.
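
As a loose illustration of what this premise presupposes, the sketch below keeps one shared-memory segment per tenant and rejects any access that strays outside the caller's own region. The registry and its checks are purely hypothetical; in the real system this separation would have to come from hypervisor-enforced memory mappings, not application-level bookkeeping.

```python
from multiprocessing import shared_memory

REGION_SIZE = 4096  # illustrative per-tenant region size


class TenantRegistry:
    """Hypothetical host-side bookkeeping: one private segment per tenant."""

    def __init__(self):
        self._regions = {}

    def register(self, tenant_id):
        seg = shared_memory.SharedMemory(create=True, size=REGION_SIZE)
        self._regions[tenant_id] = seg
        return seg.name

    def write(self, tenant_id, offset, data):
        seg = self._regions[tenant_id]  # lookups are keyed by tenant only
        if offset < 0 or offset + len(data) > REGION_SIZE:
            raise PermissionError("write outside the tenant's region")
        seg.buf[offset:offset + len(data)] = data

    def close(self):
        for seg in self._regions.values():
            seg.close()
            seg.unlink()


registry = TenantRegistry()
registry.register("tenant-a")
registry.register("tenant-b")
registry.write("tenant-a", 0, b"payload visible to tenant-a only")
try:
    registry.write("tenant-a", REGION_SIZE - 4, b"overflowing write")
except PermissionError as err:
    print("rejected:", err)
registry.close()
```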

What would settle it

Running the system on production workloads and observing either no meaningful drop in node CPU or memory use, or an increase in API compatibility failures or security incidents.

Figures

Figures reproduced from arXiv: 2604.06682 by Dmitrii Ustiugov, JooYoung Park, Jovan Stojkovic, Kevin Nguetchouang, Likun Zhang, Marco Cali, Riccardo Mancini.

Figure 1
Figure 1: Traditional serverless architecture overview. view at source ↗
Figure 3
Figure 3: Breakdown of memory footprint for each component during function execution, averaged across vSwarm workloads. view at source ↗
Figure 2
Figure 2: CPU cycles breakdown: (a) overall on a worker node across host and guest domains; (b) CPU cycles and (c) instructions for a synthetic single-PUT workload across different communication fabrics (TCP vs. SDKs); and (d) CPU cycles across bare-metal and virtualized environments. view at source ↗
Figure 4
Figure 4: Nexus serverless architecture overview. view at source ↗
Figure 5
Figure 5: Function execution lifecycle with RPC management and cloud storage access offloading. view at source ↗
Figure 6
Figure 6: End-to-end latency evaluation and resource utilization as deployment density scales. Each deployed function serves a trace sampled from the Azure Functions production dataset [53]. view at source ↗
Figure 7
Figure 7: Warm latency across vSwarm workloads, normalized to Baseline. Baseline sustains up to 320 deployed functions while meeting the target SLO; Nexus-TCP and Nexus-Async sustain 380 and Nexus sustains 440, corresponding to deployment density gains of 18% and 37%. view at source ↗
Figure 9
Figure 9: kvm_exit and kvm_vcpu_wakeup event rates across vSwarm workloads, normalized per invocation. Nexus reduces both rates compared to the baseline (1.0). view at source ↗
Figure 10
Figure 10: Per-function-instance memory footprint across vSwarm workloads, normalized to Baseline. Nexus reduces the per-VM memory footprint by consolidating the communication fabric out of the VM into Nexus's backend. view at source ↗
Figure 11
Figure 11: Worker node memory footprint breakdown across various deployment densities. Nexus amortizes the shared communication fabric on its backend, consistently reducing the footprint by 10-21%. view at source ↗
Figure 14
Figure 14: Execution time, per-invocation CPU cycles breakdown, and memory footprint of the AES encryption workload under the three studied systems: Baseline, Nexus, and Faasm. view at source ↗
read the original abstract

Serverless computing relies on extreme multi-tenancy to remain economically viable, driving providers to rely on virtual machines (VMs) that ensure strong isolation and seamless ecosystem compatibility with the FaaS programming model. However, current architectures tightly couple application processing logic with I/O processing, forcing every VM to duplicate a heavy communication fabric (cloud SDK, RPC, and TCP/IP). Our analysis reveals this duplication consumes over 25% of a function's memory footprint, and may double the CPU cycles in VMs compared to bare-metal execution. While prior systems attempt to solve this using WebAssembly or library OSes, they naively sacrifice ecosystem compatibility, forcing developers to migrate code and dependencies to new languages. We introduce Nexus, a serverless-native KVM-based hypervisor that transparently decouples compute from I/O. Nexus shifts the execution model by intercepting communication fabric at the API boundary and offloading it to an always-on host shared backend via zero-copy shared memory. This removes the heavyweight communication fabric from the guest VM, while preserving the conventional serverless programming model. By structurally separating these domains, Nexus unlocks asynchronous I/O optimizations: overlapping input payload prefetching with VM restoration from a snapshot and writing output payloads back to storage off the critical path. Compared to the production baseline, Nexus reduces overall node-level CPU and memory consumption by up to 44% and 31%, respectively, thus increasing deployment density by 37%. Also, Nexus reduces warm- and cold-start latency by 39% and 10%, respectively, bringing the response time within 20% of that of a WASM-based, ecosystem-incompatible hypervisor.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces Nexus, a serverless-native KVM-based hypervisor that transparently decouples compute from I/O by intercepting the communication fabric at the API boundary and offloading it to an always-on host shared backend via zero-copy shared memory. This removes the duplicated heavyweight communication fabric (cloud SDK, RPC, TCP/IP) from guest VMs while preserving the conventional FaaS programming model and VM isolation. Asynchronous I/O optimizations are enabled, such as overlapping input payload prefetching with VM restoration from snapshots. Compared to a production baseline, the paper reports up to 44% reduction in node-level CPU consumption, 31% in memory (increasing deployment density by 37%), and reductions in warm- and cold-start latency of 39% and 10%, with response times within 20% of a WASM-based but ecosystem-incompatible hypervisor.

Significance. If the empirical claims hold under rigorous validation, this would be a significant contribution to serverless systems by addressing a key source of overhead (over 25% memory and potentially doubled CPU cycles from duplicated fabrics) without forcing migration to incompatible runtimes like WASM or library OSes. The transparent API-boundary interception approach maintains broad ecosystem compatibility and strong isolation, which are load-bearing requirements for production multi-tenant deployments. The focus on structural separation of domains to unlock async optimizations is a practical strength with potential for high-density serverless platforms.

major comments (2)
  1. [§4 (Architecture)] The central mechanism relies on zero-copy shared memory between multiple guest VMs and the host backend for I/O offloading. However, the manuscript provides no security analysis, access control details for per-tenant regions, adversarial testing, or formal isolation argument to validate against cross-tenant leakage or violations in a multi-tenant KVM setup. This is load-bearing for the claim of preserving strong VM isolation while achieving transparency.
  2. [Abstract and §5 (Evaluation)] The reported improvements (44% CPU, 31% memory, 37% density, 39%/10% latency reductions) are presented against a production baseline without details on workloads, measurement methodology, statistical significance, error bars, or potential confounds. This undermines verification of the central performance and density claims.
minor comments (1)
  1. [Abstract] The abstract could more explicitly separate node-level aggregate savings from per-VM metrics to clarify the density calculation.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and positive assessment of the work's significance. We address the major comments point by point below. Both points can be addressed through revisions that add missing details and analysis without altering the core claims or results.

read point-by-point responses
  1. Referee: [§4 (Architecture)] The central mechanism relies on zero-copy shared memory between multiple guest VMs and the host backend for I/O offloading. However, the manuscript provides no security analysis, access control details for per-tenant regions, adversarial testing, or formal isolation argument to validate against cross-tenant leakage or violations in a multi-tenant KVM setup. This is load-bearing for the claim of preserving strong VM isolation while achieving transparency.

    Authors: We agree that explicit security analysis is necessary to substantiate the isolation claims. The design builds on KVM's established memory isolation and hypervisor-enforced access controls for shared memory mappings, with per-tenant regions allocated and protected at the host level to prevent cross-tenant access. However, the manuscript lacks a dedicated discussion of these mechanisms, potential adversarial models, or formal arguments. In revision, we will add a new subsection to §4 covering: (1) access control via KVM's EPT and memory mapping policies for per-tenant shared regions, (2) why zero-copy does not introduce leakage paths, (3) reference to KVM's proven isolation properties in multi-tenant settings, and (4) a high-level adversarial analysis. This will be textual and design-based rather than new empirical testing. revision: yes

  2. Referee: [Abstract and §5 (Evaluation)] The reported improvements (44% CPU, 31% memory, 37% density, 39%/10% latency reductions) are presented against a production baseline without details on workloads, measurement methodology, statistical significance, error bars, or potential confounds. This undermines verification of the central performance and density claims.

    Authors: We acknowledge that the evaluation section would benefit from greater methodological transparency to allow independent verification. The reported figures were obtained using standard serverless benchmarks and representative production workloads on a controlled testbed, with averages over repeated runs. In the revised manuscript, we will expand §5 to include: explicit workload descriptions and function characteristics; detailed measurement methodology (including tools, sampling intervals, and node-level aggregation); error bars with standard deviation or confidence intervals; results of statistical significance tests; and explicit discussion of potential confounds such as hardware variability, background system activity, and snapshot restoration overheads. These additions will be based on the existing experimental data and setup. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical system design and evaluation

full rationale

The paper describes a new KVM-based hypervisor architecture for offloading I/O in serverless functions and reports measured improvements in CPU, memory, density, and latency versus an external production baseline. No derivation chain, equations, fitted parameters, or predictions exist that could reduce to inputs by construction. All claims rest on direct experimental comparisons and architectural description, with no self-citations, self-definitions, or ansatzes invoked as load-bearing steps. The work is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

No free parameters, invented entities, or non-standard axioms are visible in the abstract. The work rests on the domain assumption that KVM isolation remains intact after API-level interception and that the host backend can scale without becoming a new bottleneck.

axioms (1)
  • Domain assumption: the KVM hypervisor provides sufficient isolation for multi-tenant serverless workloads after API interception.
    Implicit in the choice of a KVM-based design and the claim of preserved strong isolation.

pith-pipeline@v0.9.0 · 5622 in / 1244 out tokens · 80807 ms · 2026-05-10T18:31:50.564835+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.


Reference graph

Works this paper leans on

69 extracted references

  1. [1] [n. d.]. Understanding Lambda function scaling - AWS Documentation. Available at https://docs.aws.amazon.com/lambda/latest/dg/lambda-concurrency.html
  2. [2] 2021. Cloud Hypervisor. Available at https://www.cloudhypervisor.org/
  3. [3] 2023. A suite of representative serverless cloud-agnostic benchmarks. Available at https://github.com/vhive-serverless/vSwarm/
  4. [4] 2023. Cloudflare Workers. Available at https://workers.cloudflare.com
  5. [5] 2023. Google Cloud Run. Available at https://cloud.google.com/run
  6. [6] 2023. Istio considerations for large clusters. Available at https://www.istio.io/
  7. [7] 2023. State of serverless. Available at https://www.datadoghq.com/state-of-serverless/
  8. [8] 2023. The container Security Platform. Available at https://gvisor.dev/
  9. [9] 2024. What's stopping WebAssembly from widespread adoption. Available at https://thenewstack.io/whats-stopping-webassembly-from-widespread-adoption/
  10. [10] 2025. AWS Serverless Application Repository. Available at https://aws.amazon.com/serverless/serverlessrepo/
  11. [11] 2025. Azure Virtual Machines. Available at https://azure.microsoft.com/en-us/products/virtual-machines
  12. [12] 2025. Book of crosvm. Available at https://crosvm.dev/book/
  13. [13] 2025. Cloud Run jobs and second-generation execution environment now GA. Available at https://cloud.google.com/blog/products/serverless/cloud-run-jobs-and-second-generation-execution-environment-ga?hl=en
  14. [14] 2025. Issues related to Faasm python support. Available at https://github.com/faasm/faasm/issues/900 and https://github.com/faasm/faasm/issues/880
  15. [15] 2025. Knative. https://knative.dev/docs/
  16. [16] 2025. Kubernetes. Available at https://kubernetes.io
  17. [17] 2025. Linux Profiling with performance counters. Available at https://perfwiki.github.io/main/
  18. [18] 2025. Production Host Setup Recommendations. Available at https://github.com/firecracker-microvm/firecracker/blob/main/docs/prod-host-setup.md
  19. [19] 2025. WASI: Current State and Roadmap. Available at https://www.riotsecure.se/blog/wasi_current_state_and_roadmap
  20. [20] 2025. WebAssembly's unseen gap, why your code might not work. Available at https://medium.com/wasm/webassemblys-unseen-gap-why-our-code-might-not-work-1df65bb1301b
  21. [21] Alexandru Agache, Marc Brooker, Alexandra Iordache, Anthony Liguori, Rolf Neugebauer, Phil Piwonka, and Diana-Maria Popa. 2020. Firecracker: Lightweight Virtualization for Serverless Applications. In Proceedings of the 17th Symposium on Networked Systems Design and Implementation (NSDI). 419–434.
  22. [22] Amazon Web Services. 2026. Amazon ElastiCache. https://aws.amazon.com/elasticache/
  23. [23] Amazon Web Services. 2026. Amazon Simple Storage Service (Amazon S3). https://aws.amazon.com/s3/
  24. [24] Peter W. Deutsch, Yuheng Yang, Thomas Bourgeat, Jules Drean, Joel S. Emer, and Mengjia Yan. 2022. DAGguise: mitigating memory timing side channels. In Proceedings of the 27th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-XXVII). 329–343.
  25. [25] Dong Du, Tianyi Yu, Yubin Xia, Binyu Zang, Guanglu Yan, Chenggang Qin, Qixuan Wu, and Haibo Chen. 2020. Catalyzer: Sub-millisecond Startup for Serverless Computing with Initialization-less Booting. In Proceedings of the 25th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-XXV). 467–481.
  26. [26] Vojislav Dukic, Rodrigo Bruno, Ankit Singla, and Gustavo Alonso.
  27. [27] Photons: lambdas on a diet. In Proceedings of the 2020 ACM Symposium on Cloud Computing (SOCC). 45–59.
  28. [28] Armando Fox and Eric A. Brewer. 1999. Harvest, Yield and Scalable Tolerant Systems. In Proceedings of The 7th Workshop on Hot Topics in Operating Systems (HotOS-VII). 174–178.
  29. [29] Joshua Fried, Gohar Irfan Chaudhry, Enrique Saurez, Esha Choukse, Íñigo Goiri, Sameh Elnikety, Rodrigo Fonseca, and Adam Belay. 2024. Making Kernel Bypass Practical for the Cloud with Junction. In Proceedings of the 21st Symposium on Networked Systems Design and Implementation (NSDI). 55–73.
  30. [30] Google. [n. d.]. gRPC: A High-Performance, Open Source Universal RPC Framework. Available at https://grpc.io
  31. [31] Kaijie Guo, Dingji Li, Ben Luo, Yibin Shen, Kaihuan Peng, Ning Luo, Shengdong Dai, Chen Liang, Jianming Song, Hang Yang, Xiantao Zhang, and Zeyu Mi. 2024. VPRI: Efficient I/O Page Fault Handling via Software-Hardware Co-Design for IaaS Clouds. 541–557.
  32. [32] Amery Hung and Bobby Eshleman. 2023. VSOCK: From Convenience to Performant VirtIO Communication. In Linux Plumbers Conference (LPC). https://lpc.events/event/17/contributions/1626/
  33. [33] Intel Corporation. 2023. Intel 64 and IA-32 Architectures Software Developer Manuals, Volume 3A: System Programming Guide, Part 1. Intel Corporation. https://software.intel.com/content/www/us/en/develop/articles/intel-sdm.html
  34. [34] Sami Jaktholm. 2024. Sjakthol/Aws-Network-Benchmark. Available at https://github.com/sjakthol/aws-network-benchmark/blob/main/analysis/2024/results-lambda.ipynb
  35. [35] Zhipeng Jia and Emmett Witchel. 2021. Boki: Stateful Serverless Computing with Shared Logs. In Proceedings of the 28th ACM Symposium on Operating Systems Principles (SOSP). 691–707.
  36. [36] Zhipeng Jia and Emmett Witchel. 2021. Nightcore: efficient and scalable serverless computing for latency-sensitive, interactive microservices. In Proceedings of the 26th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-XXVI). 152–166.
  37. [37] Jongyul Kim, Insu Jang, Waleed Reda, Jaeseong Im, Marco Canini, Dejan Kostic, Youngjin Kwon, Simon Peter, and Emmett Witchel. 2021. LineFS: Efficient SmartNIC Offload of a Distributed File System with Pipeline Parallelism. In Proceedings of the 28th ACM Symposium on Operating Systems Principles (SOSP). 756–771.
  38. [38] Avi Kivity, Dor Laor, Glauber Costa, Pekka Enberg, Nadav Har'El, Don Marti, and Vlad Zolotarov. 2014. OSv - Optimizing the Operating System for Virtual Machines. In Proceedings of the 2014 USENIX Annual Technical Conference (ATC). 61–72.
  39. [39] Ana Klimovic, Yawen Wang, Patrick Stuedi, Animesh Trivedi, Jonas Pfefferle, and Christos Kozyrakis. 2018. Pocket: Elastic Ephemeral Storage for Serverless Analytics. In Proceedings of the 13th Symposium on Operating System Design and Implementation (OSDI). 427–444.
  40. [40] Swaroop Kotni, Ajay Nayak, Vinod Ganapathy, and Arkaprava Basu.
  41. [41] Faastlane: Accelerating Function-as-a-Service Workflows. In Proceedings of the 2021 USENIX Annual Technical Conference (ATC). 805–820.
  42. [42] Tom Kuchler, Pinghe Li, Yazhuo Zhang, Lazar Cvetkovic, Boris Goranov, Tobias Stocker, Leon Thomm, Simone Kalbermatter, Tim Notter, Andrea Lattuada, and Ana Klimovic. 2025. Unlocking True Elasticity for the Cloud-Native Era with Dandelion. In Proceedings of the 30th ACM Symposium on Operating Systems Principles (SOSP). 944–961.
  43. [43] Simon Kuenzer, Vlad-Andrei Badoiu, Hugo Lefeuvre, Sharan Santhanam, Alexander Jung, Gaulthier Gain, Cyril Soldani, Costin Lupu, Stefan Teodorescu, Costi Raducanu, Cristian Banu, Laurent Mathy, Razvan Deaconescu, Costin Raiciu, and Felipe Huici. 2021. Unikraft: fast, specialized unikernels the easy way. In Proceedings of the 2021 EuroSys Conference. 376–394.
  44. [44] Collin Lee, Seo Jin Park, Ankita Kejriwal, Satoshi Matsushita, and John K. Ousterhout. 2015. Implementing linearizability at large scale and low latency. In Proceedings of the 25th ACM Symposium on Operating Systems Principles (SOSP). 71–86.
  45. [45] Yuanlong Li, Atri Bhattacharyya, Madhur Kumar, Abhishek Bhattacharjee, Yoav Etsion, Babak Falsafi, Sanidhya Kashyap, and Mathias Payer. 2025. Single-Address-Space FaaS with Jord. In Proceedings of the 52nd International Symposium on Computer Architecture (ISCA). 694–707.
  46. [46] Yunzhuo Liu, Junchen Guo, Bo Jiang, Yang Song, Pengyu Zhang, Rong Wen, Biao Lyu, Shunmin Zhu, and Xinbing Wang. 2025. FastIOV: Fast Startup of Passthrough Network I/O Virtualization for Secure Containers. In Proceedings of the 2025 EuroSys Conference. 720–735.
  47. [47] Artemiy Margaritov, Dmitrii Ustiugov, Amna Shahab, and Boris Grot.
  48. [48] PTEMagnet: fine-grained physical memory reservation for faster page walks in public clouds. In Proceedings of the 26th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-XXVI). 211–223.
  49. [49] Microsoft. 2026. Azure Blob Storage. https://azure.microsoft.com/en-us/products/storage/blobs/ Accessed: 2026-03-11.
  50. [50] Microsoft Azure. [n. d.]. Azure Public Dataset: Azure LLM Inference Trace 2023. Available at https://github.com/Azure/AzurePublicDataset/blob/master/AzureLLMInferenceDataset2023.md
  51. [51] MinIO. [n. d.]. MinIO: High Performance Object Storage. Available at https://min.io/
  52. [52] Djob Mvondo, Mathieu Bacou, Kevin Nguetchouang, Lucien Ngale, Stéphane Pouget, Josiane Kouam, Renaud Lachaize, Jinho Hwang, Tim Wood, Daniel Hagimont, Noël De Palma, Bernabé Batchakui, and Alain Tchana. 2021. OFC: an opportunistic caching system for FaaS platforms. In Proceedings of the 2021 EuroSys Conference. 228–244.
  53. [53] Shixiong Qi, Songyu Zhang, K. K. Ramakrishnan, Diman Zad Tootaghaj, Hardik Soni, and Puneet Sharma. 2025. Palladium: A DPU-enabled Multi-Tenant Serverless Cloud over Zero-copy Multi-node RDMA Fabrics. In Proceedings of the ACM SIGCOMM 2025 Conference. 1257–1259.
  54. [54] Alessandro Randazzo and Ilenia Tinnirello. 2019. Kata Containers: An Emerging Architecture for Enabling MEC Services in Fast and Secure Way. In Sixth International Conference on Internet of Things: Systems, Management and Security. 209–214.
  55. [55] Francisco Romero, Gohar Irfan Chaudhry, Iñigo Goiri, Pragna Gopa, Paul Batum, Neeraja J. Yadwadkar, Rodrigo Fonseca, Christos Kozyrakis, and Ricardo Bianchini. 2021. Faa$T: A Transparent Auto-Scaling Cache for Serverless Applications. In Proceedings of the 2021 ACM Symposium on Cloud Computing (SOCC). 122–137.
  56. [56] Mohammad Shahrad, Rodrigo Fonseca, Iñigo Goiri, Gohar Irfan Chaudhry, Paul Batum, Jason Cooke, Eduardo Laureano, Colby Tresness, Mark Russinovich, and Ricardo Bianchini. 2020. Serverless in the Wild: Characterizing and Optimizing the Serverless Workload at a Large Cloud Provider. In Proceedings of the 2020 USENIX Annual Technical Conference (ATC). 205–218.
  57. [57] Yizhou Shan, Yutong Huang, Yilun Chen, and Yiying Zhang. 2018. LegoOS: A Disseminated, Distributed OS for Hardware Resource Disaggregation. In Proceedings of the 13th Symposium on Operating System Design and Implementation (OSDI). 69–87.
  58. [58] Zhiming Shen, Zhen Sun, Gur-Eyal Sela, Eugene Bagdasaryan, Christina Delimitrou, Robbert van Renesse, and Hakim Weatherspoon.
  59. [59] X-Containers: Breaking Down Barriers to Improve Performance and Isolation of Cloud-Native Containers. In Proceedings of the 24th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-XXIV). 121–135.
  60. [60] Simon Shillaker and Peter R. Pietzuch. 2020. Faasm: Lightweight Isolation for Efficient Stateful Serverless Computing. In Proceedings of the 2020 USENIX Annual Technical Conference (ATC). 419–433.
  61. [61] Vikram Sreekanti, Chenggang Wu, Xiayue Charles Lin, Johann Schleier-Smith, Joseph Gonzalez, Joseph M. Hellerstein, and Alexey Tumanov. 2020. Cloudburst: Stateful Functions-as-a-Service. Proc. VLDB Endow. 13, 11 (2020), 2438–2452.
  62. [62] Foteini Strati, Xianzhe Ma, and Ana Klimovic. 2024. Orion: Interference-aware, Fine-grained GPU Sharing for ML Applications. In Proceedings of the 2024 EuroSys Conference. 1075–1092.
  63. [63] Dmitrii Ustiugov, Dohyun Park, Lazar Cvetkovic, Mihajlo Djokic, Hongyu Hè, Boris Grot, and Ana Klimovic. 2023. Enabling In-Vitro Serverless Systems Research. In Proceedings of the 4th Workshop on Resource Disaggregation and Serverless. 1–7.
  64. [64] Dmitrii Ustiugov, Plamen Petrov, Marios Kogias, Edouard Bugnion, and Boris Grot. 2021. Benchmarking, analysis, and optimization of serverless function snapshots. In Proceedings of the 26th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-XXVI). 559–572.
  65. [65] Nicholas C. Wanninger, Joshua J. Bowden, Kirtankumar Shetty, Ayush Garg, and Kyle C. Hale. 2022. Isolating functions at the hardware limit with virtines. In Proceedings of the 2022 EuroSys Conference. 644–662.
  66. [66] Robert N. M. Watson, Jonathan Woodruff, Peter G. Neumann, Simon W. Moore, Jonathan Anderson, David Chisnall, Nirav H. Dave, Brooks Davis, Khilan Gudka, Ben Laurie, Steven J. Murdoch, Robert M. Norton, Michael Roe, Stacey D. Son, and Munraj Vadera. 2015. CHERI: A Hybrid Capability-System Architecture for Scalable Software Compartmentalization. In IE…
  67. [67] Jianing You, Kang Chen, Laiping Zhao, Yiming Li, Yichi Chen, Yuxuan Du, Yanjie Wang, Luhang Wen, Keyang Hu, and Keqiu Li. 2025. AlloyStack: A Library Operating System for Serverless Workflow Applications. In Proceedings of the 2025 EuroSys Conference. 921–937.
  68. [68] Hangchen Yu, Arthur Michener Peters, Amogh Akshintala, and Christopher J. Rossbach. 2020. AvA: Accelerated Virtualization of Accelerators. In Proceedings of the 25th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-XXV). 807–825.
  69. [69] Zirui Neil Zhao, Adam Morrison, Christopher W. Fletcher, and Josep Torrellas. 2024. Everywhere All at Once: Co-Location Attacks on Public Cloud FaaS. In ASPLOS (1). 133–149.