pith. machine review for the scientific record.

arxiv: 2604.06682 · v2 · submitted 2026-04-08 · 💻 cs.DC · cs.OS

Recognition: 2 theorem links

· Lean Theorem

Nexus: Transparent I/O Offloading for High-Density Serverless Computing

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 18:31 UTC · model grok-4.3

classification 💻 cs.DC cs.OS
keywords serverless computing · I/O offloading · KVM hypervisor · high-density deployment · transparent offloading · virtual machines · FaaS performance

The pith

By offloading I/O processing to a shared host backend, Nexus cuts node-level CPU use by up to 44% and memory by up to 31% while preserving full ecosystem compatibility.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Serverless platforms rely on VMs to isolate many small functions, but this forces each VM to carry a full copy of the cloud SDK, RPC layer, and network stack. Nexus intercepts calls right at the function's API boundary and redirects them through zero-copy shared memory to a single always-on host service. The change frees VMs from duplicating I/O logic, cuts overall node resource use, and supports overlapping data movement with VM startup. Because the offload happens below the programming model, existing code and dependencies continue to work unchanged. The measured gains include higher function density per server and faster response times.
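
To make the interception concrete, here is a minimal, self-contained Python sketch under stated assumptions: a storage client exposing the same put/get surface the function already uses forwards each call to a host-resident backend instead of running the SDK, RPC, and TCP machinery inside the VM. The names (HostBackendChannel, StorageClient) are illustrative only, and an in-process object stands in for the zero-copy shared-memory channel; this is not Nexus's actual interface.

```python
import json


class HostBackendChannel:
    """Stand-in for the guest-to-host channel plus the always-on host backend.

    The real system would pass request descriptors through zero-copy shared
    memory and run the cloud SDK / RPC / TCP stack once per node, on the host.
    """

    def __init__(self):
        self._remote_store = {}  # host-side view of remote storage

    def call(self, op, key, payload=None):
        if op == "PUT":
            self._remote_store[key] = payload
            return None
        if op == "GET":
            return self._remote_store.get(key)
        raise ValueError(f"unsupported op: {op}")


class StorageClient:
    """Drop-in shim with the same surface the function code already calls,
    so application code and its dependencies stay unchanged."""

    def __init__(self, channel):
        self._channel = channel

    def put(self, key, obj):
        self._channel.call("PUT", key, json.dumps(obj).encode())

    def get(self, key):
        raw = self._channel.call("GET", key)
        return json.loads(raw) if raw is not None else None


# The function is oblivious to the offload: it still just calls put()/get().
storage = StorageClient(HostBackendChannel())
storage.put("invocation/123/output", {"status": "ok"})
print(storage.get("invocation/123/output"))
```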

Core claim

Nexus is a KVM-based hypervisor that transparently decouples compute from I/O by intercepting the communication fabric at the API boundary and offloading it to an always-on host shared backend via zero-copy shared memory. This removes the heavyweight communication fabric from the guest VM while preserving the conventional serverless programming model. The structural separation enables asynchronous I/O optimizations such as overlapping input payload prefetching with VM restoration from a snapshot and writing output payloads back to storage off the critical path.
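
A minimal sketch of the overlap this separation enables, assuming placeholder coroutines for snapshot restoration, input prefetching, and deferred output write-back; the sleeps are arbitrary stand-ins, not measurements from the paper.

```python
import asyncio


async def restore_vm_from_snapshot():
    await asyncio.sleep(0.05)  # stand-in for snapshot restoration work
    return "vm-ready"


async def prefetch_input_payload():
    await asyncio.sleep(0.03)  # stand-in for fetching the input payload
    return b"input-bytes"


async def write_output_async(result):
    await asyncio.sleep(0.04)  # host backend persists the output later
    return "written"


async def invoke():
    # Restore and prefetch run concurrently instead of back-to-back, so the
    # input is already resident when the VM is ready.
    vm_state, payload = await asyncio.gather(restore_vm_from_snapshot(),
                                             prefetch_input_payload())
    assert vm_state == "vm-ready"
    result = payload.upper()  # the function's compute step
    writer = asyncio.create_task(write_output_async(result))
    return result, writer     # respond before the write-back completes


async def main():
    result, writer = await invoke()
    print("responded with", result)
    print("write-back finished:", await writer)


asyncio.run(main())
```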

What carries the argument

API-boundary interception with zero-copy shared-memory offload to a host-resident I/O backend

If this is right

  • Node-level CPU consumption drops by up to 44% compared with the production baseline.
  • Memory consumption drops by up to 31%, which increases deployment density by 37% (a quick arithmetic check of this figure follows the list).
  • Warm-start latency falls 39% and cold-start latency falls 10%.
  • End-to-end response time stays within 20% of that of a WebAssembly-based but ecosystem-incompatible hypervisor.
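
As a quick arithmetic check of the density bullet above, the gain follows from the sustained-function counts quoted in the Figure 7 caption further down (Baseline sustains 320 deployed functions within the SLO, Nexus-TCP and Nexus-Async sustain 380, and full Nexus sustains 440):

```python
# Density gains implied by the sustained-function counts in the Figure 7 caption.
baseline = 320
for system, sustained in [("Nexus-TCP / Nexus-Async", 380), ("Nexus", 440)]:
    gain = (sustained - baseline) / baseline
    print(f"{system}: {sustained} functions -> +{gain:.1%} density vs. Baseline")
# Yields +18.8% and +37.5%, in line with the reported 18% and 37% gains.
```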

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The offload approach could be applied to other hypervisor-based serverless systems that currently duplicate network and RPC stacks inside every tenant.
  • Placing the shared I/O logic in one host component may simplify performance tuning and security monitoring across all functions.
  • Further gains might come from letting the host backend batch or cache requests that cross multiple functions.

Load-bearing premise

Intercepting the communication fabric at the API boundary can be performed transparently and securely inside a multi-tenant KVM environment without introducing new bottlenecks, compatibility breaks, or isolation violations.
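
As a loose illustration of what this premise presupposes, the sketch below keeps one shared-memory segment per tenant and rejects any access that strays outside the caller's own region. The registry and its checks are purely hypothetical; in the real system this separation would have to come from hypervisor-enforced memory mappings, not application-level bookkeeping.

```python
from multiprocessing import shared_memory

REGION_SIZE = 4096  # illustrative per-tenant region size


class TenantRegistry:
    """Hypothetical host-side bookkeeping: one private segment per tenant."""

    def __init__(self):
        self._regions = {}

    def register(self, tenant_id):
        seg = shared_memory.SharedMemory(create=True, size=REGION_SIZE)
        self._regions[tenant_id] = seg
        return seg.name

    def write(self, tenant_id, offset, data):
        seg = self._regions[tenant_id]  # lookups are keyed by tenant only
        if offset < 0 or offset + len(data) > REGION_SIZE:
            raise PermissionError("write outside the tenant's region")
        seg.buf[offset:offset + len(data)] = data

    def close(self):
        for seg in self._regions.values():
            seg.close()
            seg.unlink()


registry = TenantRegistry()
registry.register("tenant-a")
registry.register("tenant-b")
registry.write("tenant-a", 0, b"payload visible to tenant-a only")
try:
    registry.write("tenant-a", REGION_SIZE - 4, b"overflowing write")
except PermissionError as err:
    print("rejected:", err)
registry.close()
```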

What would settle it

Running the system on production workloads and observing either no meaningful drop in node CPU or memory use, or an increase in API compatibility failures or security incidents.

Figures

Figures reproduced from arXiv: 2604.06682 by Dmitrii Ustiugov, JooYoung Park, Jovan Stojkovic, Kevin Nguetchouang, Likun Zhang, Marco Cali, Riccardo Mancini.

Figure 1
Figure 1: Traditional serverless architecture overview. view at source ↗
Figure 3
Figure 3: Breakdown of memory footprint for each component during function execution, averaged across vSwarm workloads. view at source ↗
Figure 2
Figure 2: CPU cycles breakdown: (a) overall on a worker node across host and guest domains; (b) CPU cycles and (c) instructions for a synthetic single-PUT workload across different communication fabrics (TCP vs. SDKs); and (d) CPU cycles across bare-metal and virtualized environments. view at source ↗
Figure 4
Figure 4: Nexus serverless architecture overview. view at source ↗
Figure 5
Figure 5: Function execution lifecycle with RPC management and cloud storage access offloading. view at source ↗
Figure 6
Figure 6: End-to-end latency evaluation and resource utilization as deployment density scales. Each deployed function serves a trace sampled from the Azure Functions production dataset [53]. view at source ↗
Figure 7
Figure 7: Warm latency across vSwarm workloads, normalized to Baseline. Baseline sustains up to 320 deployed functions while meeting the target SLO; Nexus-TCP and Nexus-Async sustain 380 and Nexus sustains 440, corresponding to deployment density gains of 18% and 37%. view at source ↗
Figure 9
Figure 9: kvm_exit and kvm_vcpu_wakeup event rates across vSwarm workloads, normalized per invocation. Nexus reduces both rates compared to the baseline (1.0). view at source ↗
Figure 10
Figure 10: Per-function-instance memory footprint across vSwarm workloads, normalized to Baseline. Nexus reduces the per-VM memory footprint by consolidating the communication fabric out of the VM into Nexus's backend. view at source ↗
Figure 11
Figure 11: Worker node memory footprint breakdown across various deployment densities. Nexus amortizes the shared communication fabric on its backend, consistently reducing the footprint by 10-21%. view at source ↗
Figure 14
Figure 14: Execution time, per-invocation CPU cycles breakdown, and memory footprint of the AES encryption workload under the three studied systems: Baseline, Nexus, and Faasm. view at source ↗
read the original abstract

Serverless computing relies on extreme multi-tenancy to remain economically viable, driving providers to rely on virtual machines (VMs) that ensure strong isolation and seamless ecosystem compatibility with the FaaS programming model. However, current architectures tightly couple application processing logic with I/O processing, forcing every VM to duplicate a heavy communication fabric (cloud SDK, RPC, and TCP/IP). Our analysis reveals this duplication consumes over 25% of a function's memory footprint, and may double the CPU cycles in VMs compared to bare-metal execution. While prior systems attempt to solve this using WebAssembly or library OSes, they naively sacrifice ecosystem compatibility, forcing developers to migrate code and dependencies to new languages. We introduce Nexus, a serverless-native KVM-based hypervisor that transparently decouples compute from I/O. Nexus shifts the execution model by intercepting communication fabric at the API boundary and offloading it to an always-on host shared backend via zero-copy shared memory. This removes the heavyweight communication fabric from the guest VM, while preserving the conventional serverless programming model. By structurally separating these domains, Nexus unlocks asynchronous I/O optimizations: overlapping input payload prefetching with VM restoration from a snapshot and writing output payloads back to storage off the critical path. Compared to the production baseline, Nexus reduces overall node-level CPU and memory consumption by up to 44% and 31%, respectively, thus increasing deployment density by 37%. Also, Nexus reduces warm- and cold-start latency by 39% and 10%, respectively, bringing the response time within 20% of that of a WASM-based, ecosystem-incompatible hypervisor.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces Nexus, a serverless-native KVM-based hypervisor that transparently decouples compute from I/O by intercepting the communication fabric at the API boundary and offloading it to an always-on host shared backend via zero-copy shared memory. This removes the duplicated heavyweight communication fabric (cloud SDK, RPC, TCP/IP) from guest VMs while preserving the conventional FaaS programming model and VM isolation. Asynchronous I/O optimizations are enabled, such as overlapping input payload prefetching with VM restoration from snapshots. Compared to a production baseline, the paper reports up to 44% reduction in node-level CPU consumption, 31% in memory (increasing deployment density by 37%), and reductions in warm- and cold-start latency of 39% and 10%, with response times within 20% of a WASM-based but ecosystem-incompatible hypervisor.

Significance. If the empirical claims hold under rigorous validation, this would be a significant contribution to serverless systems by addressing a key source of overhead (over 25% memory and potentially doubled CPU cycles from duplicated fabrics) without forcing migration to incompatible runtimes like WASM or library OSes. The transparent API-boundary interception approach maintains broad ecosystem compatibility and strong isolation, which are load-bearing requirements for production multi-tenant deployments. The focus on structural separation of domains to unlock async optimizations is a practical strength with potential for high-density serverless platforms.

major comments (2)
  1. [§4 (Architecture)] The central mechanism relies on zero-copy shared memory between multiple guest VMs and the host backend for I/O offloading. However, the manuscript provides no security analysis, access control details for per-tenant regions, adversarial testing, or formal isolation argument to validate against cross-tenant leakage or violations in a multi-tenant KVM setup. This is load-bearing for the claim of preserving strong VM isolation while achieving transparency.
  2. [Abstract and §5 (Evaluation)] The reported improvements (44% CPU, 31% memory, 37% density, 39%/10% latency reductions) are presented against a production baseline without details on workloads, measurement methodology, statistical significance, error bars, or potential confounds. This undermines verification of the central performance and density claims.
minor comments (1)
  1. [Abstract] The abstract could more explicitly separate node-level aggregate savings from per-VM metrics to clarify the density calculation.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and positive assessment of the work's significance. We address the major comments point by point below. Both points can be addressed through revisions that add missing details and analysis without altering the core claims or results.

read point-by-point responses
  1. Referee: [§4 (Architecture)] The central mechanism relies on zero-copy shared memory between multiple guest VMs and the host backend for I/O offloading. However, the manuscript provides no security analysis, access control details for per-tenant regions, adversarial testing, or formal isolation argument to validate against cross-tenant leakage or violations in a multi-tenant KVM setup. This is load-bearing for the claim of preserving strong VM isolation while achieving transparency.

    Authors: We agree that explicit security analysis is necessary to substantiate the isolation claims. The design builds on KVM's established memory isolation and hypervisor-enforced access controls for shared memory mappings, with per-tenant regions allocated and protected at the host level to prevent cross-tenant access. However, the manuscript lacks a dedicated discussion of these mechanisms, potential adversarial models, or formal arguments. In revision, we will add a new subsection to §4 covering: (1) access control via KVM's EPT and memory mapping policies for per-tenant shared regions, (2) why zero-copy does not introduce leakage paths, (3) reference to KVM's proven isolation properties in multi-tenant settings, and (4) a high-level adversarial analysis. This will be textual and design-based rather than new empirical testing. revision: yes

  2. Referee: [Abstract and §5 (Evaluation)] The reported improvements (44% CPU, 31% memory, 37% density, 39%/10% latency reductions) are presented against a production baseline without details on workloads, measurement methodology, statistical significance, error bars, or potential confounds. This undermines verification of the central performance and density claims.

    Authors: We acknowledge that the evaluation section would benefit from greater methodological transparency to allow independent verification. The reported figures were obtained using standard serverless benchmarks and representative production workloads on a controlled testbed, with averages over repeated runs. In the revised manuscript, we will expand §5 to include: explicit workload descriptions and function characteristics; detailed measurement methodology (including tools, sampling intervals, and node-level aggregation); error bars with standard deviation or confidence intervals; results of statistical significance tests; and explicit discussion of potential confounds such as hardware variability, background system activity, and snapshot restoration overheads. These additions will be based on the existing experimental data and setup. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical system design and evaluation

full rationale

The paper describes a new KVM-based hypervisor architecture for offloading I/O in serverless functions and reports measured improvements in CPU, memory, density, and latency versus an external production baseline. No derivation chain, equations, fitted parameters, or predictions exist that could reduce to inputs by construction. All claims rest on direct experimental comparisons and architectural description, with no self-citations, self-definitions, or ansatzes invoked as load-bearing steps. The work is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

No free parameters, invented entities, or non-standard axioms are visible in the abstract. The work rests on the domain assumption that KVM isolation remains intact after API-level interception and that the host backend can scale without becoming a new bottleneck.

axioms (1)
  • Domain assumption: the KVM hypervisor provides sufficient isolation for multi-tenant serverless workloads after API interception.
    Implicit in the choice of a KVM-based design and the claim of preserved strong isolation.

pith-pipeline@v0.9.0 · 5622 in / 1244 out tokens · 80807 ms · 2026-05-10T18:31:50.564835+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.


Reference graph

Works this paper leans on

69 extracted references

  1. [1] [n. d.]. Understanding Lambda function scaling - AWS Documentation. Available at https://docs.aws.amazon.com/lambda/latest/dg/lambda-concurrency.html
  2. [2] 2021. Cloud Hypervisor. Available at https://www.cloudhypervisor.org/
  3. [3] 2023. A suite of representative serverless cloud-agnostic benchmarks. Available at https://github.com/vhive-serverless/vSwarm/
  4. [4] 2023. Cloudflare Workers. Available at https://workers.cloudflare.com
  5. [5] 2023. Google Cloud Run. Available at https://cloud.google.com/run
  6. [6] 2023. Istio considerations for large clusters. Available at https://www.istio.io/
  7. [7] 2023. State of serverless. Available at https://www.datadoghq.com/state-of-serverless/
  8. [8] 2023. The container Security Platform. Available at https://gvisor.dev/
  9. [9] 2024. What's stopping WebAssembly from widespread adoption. Available at https://thenewstack.io/whats-stopping-webassembly-from-widespread-adoption/
  10. [10] 2025. AWS Serverless Application Repository. Available at https://aws.amazon.com/serverless/serverlessrepo/
  11. [11] 2025. Azure Virtual Machines. Available at https://azure.microsoft.com/en-us/products/virtual-machines
  12. [12] 2025. Book of crosvm. Available at https://crosvm.dev/book/
  13. [13] 2025. Cloud Run jobs and second-generation execution environment now GA. Available at https://cloud.google.com/blog/products/serverless/cloud-run-jobs-and-second-generation-execution-environment-ga?hl=en
  14. [14] 2025. Issues related to Faasm python support. Available at https://github.com/faasm/faasm/issues/900 and https://github.com/faasm/faasm/issues/880
  15. [15] 2025. Knative. https://knative.dev/docs/
  16. [16] 2025. Kubernetes. Available at https://kubernetes.io
  17. [17] 2025. Linux Profiling with performance counters. Available at https://perfwiki.github.io/main/
  18. [18] 2025. Production Host Setup Recommendations. Available at https://github.com/firecracker-microvm/firecracker/blob/main/docs/prod-host-setup.md
  19. [19] 2025. WASI: Current State and Roadmap. Available at https://www.riotsecure.se/blog/wasi_current_state_and_roadmap
  20. [20] 2025. WebAssembly's unseen gap, why your code might not work. Available at https://medium.com/wasm/webassemblys-unseen-gap-why-our-code-might-not-work-1df65bb1301b
  21. [21] Alexandru Agache, Marc Brooker, Alexandra Iordache, Anthony Liguori, Rolf Neugebauer, Phil Piwonka, and Diana-Maria Popa. 2020. Firecracker: Lightweight Virtualization for Serverless Applications. In Proceedings of the 17th Symposium on Networked Systems Design and Implementation (NSDI). 419–434.
  22. [22] Amazon Web Services. 2026. Amazon ElastiCache. https://aws.amazon.com/elasticache/
  23. [23] Amazon Web Services. 2026. Amazon Simple Storage Service (Amazon S3). https://aws.amazon.com/s3/
  24. [24] Peter W. Deutsch, Yuheng Yang, Thomas Bourgeat, Jules Drean, Joel S. Emer, and Mengjia Yan. 2022. DAGguise: mitigating memory timing side channels. In Proceedings of the 27th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-XXVII). 329–343.
  25. [25] Dong Du, Tianyi Yu, Yubin Xia, Binyu Zang, Guanglu Yan, Chenggang Qin, Qixuan Wu, and Haibo Chen. 2020. Catalyzer: Sub-millisecond Startup for Serverless Computing with Initialization-less Booting. In Proceedings of the 25th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-XXV). 467–481.
  26. [26] Vojislav Dukic, Rodrigo Bruno, Ankit Singla, and Gustavo Alonso.
  27. [27] Photons: lambdas on a diet. In Proceedings of the 2020 ACM Symposium on Cloud Computing (SOCC). 45–59.
  28. [28] Armando Fox and Eric A. Brewer. 1999. Harvest, Yield and Scalable Tolerant Systems. In Proceedings of The 7th Workshop on Hot Topics in Operating Systems (HotOS-VII). 174–178.
  29. [29] Joshua Fried, Gohar Irfan Chaudhry, Enrique Saurez, Esha Choukse, Íñigo Goiri, Sameh Elnikety, Rodrigo Fonseca, and Adam Belay. 2024. Making Kernel Bypass Practical for the Cloud with Junction. In Proceedings of the 21st Symposium on Networked Systems Design and Implementation (NSDI). 55–73.
  30. [30] Google. [n. d.]. gRPC: A High-Performance, Open Source Universal RPC Framework. Available at https://grpc.io
  31. [31] Kaijie Guo, Dingji Li, Ben Luo, Yibin Shen, Kaihuan Peng, Ning Luo, Shengdong Dai, Chen Liang, Jianming Song, Hang Yang, Xiantao Zhang, and Zeyu Mi. 2024. VPRI: Efficient I/O Page Fault Handling via Software-Hardware Co-Design for IaaS Clouds. 541–557.
  32. [32] Amery Hung and Bobby Eshleman. 2023. VSOCK: From Convenience to Performant VirtIO Communication. In Linux Plumbers Conference (LPC). https://lpc.events/event/17/contributions/1626/
  33. [33] Intel Corporation. 2023. Intel 64 and IA-32 Architectures Software Developer Manuals, Volume 3A: System Programming Guide, Part 1. Intel Corporation. https://software.intel.com/content/www/us/en/develop/articles/intel-sdm.html
  34. [34] Sami Jaktholm. 2024. Sjakthol/Aws-Network-Benchmark. Available at https://github.com/sjakthol/aws-network-benchmark/blob/main/analysis/2024/results-lambda.ipynb
  35. [35] Zhipeng Jia and Emmett Witchel. 2021. Boki: Stateful Serverless Computing with Shared Logs. In Proceedings of the 28th ACM Symposium on Operating Systems Principles (SOSP). 691–707.
  36. [36] Zhipeng Jia and Emmett Witchel. 2021. Nightcore: efficient and scalable serverless computing for latency-sensitive, interactive microservices. In Proceedings of the 26th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-XXVI). 152–166.
  37. [37] Jongyul Kim, Insu Jang, Waleed Reda, Jaeseong Im, Marco Canini, Dejan Kostic, Youngjin Kwon, Simon Peter, and Emmett Witchel. 2021. LineFS: Efficient SmartNIC Offload of a Distributed File System with Pipeline Parallelism. In Proceedings of the 28th ACM Symposium on Operating Systems Principles (SOSP). 756–771.
  38. [38] Avi Kivity, Dor Laor, Glauber Costa, Pekka Enberg, Nadav Har'El, Don Marti, and Vlad Zolotarov. 2014. OSv - Optimizing the Operating System for Virtual Machines. In Proceedings of the 2014 USENIX Annual Technical Conference (ATC). 61–72.
  39. [39] Ana Klimovic, Yawen Wang, Patrick Stuedi, Animesh Trivedi, Jonas Pfefferle, and Christos Kozyrakis. 2018. Pocket: Elastic Ephemeral Storage for Serverless Analytics. In Proceedings of the 13th Symposium on Operating System Design and Implementation (OSDI). 427–444.
  40. [40] Swaroop Kotni, Ajay Nayak, Vinod Ganapathy, and Arkaprava Basu.
  41. [41] Faastlane: Accelerating Function-as-a-Service Workflows. In Proceedings of the 2021 USENIX Annual Technical Conference (ATC). 805–820.
  42. [42] Tom Kuchler, Pinghe Li, Yazhuo Zhang, Lazar Cvetkovic, Boris Goranov, Tobias Stocker, Leon Thomm, Simone Kalbermatter, Tim Notter, Andrea Lattuada, and Ana Klimovic. 2025. Unlocking True Elasticity for the Cloud-Native Era with Dandelion. In Proceedings of the 30th ACM Symposium on Operating Systems Principles (SOSP). 944–961.
  43. [43] Simon Kuenzer, Vlad-Andrei Badoiu, Hugo Lefeuvre, Sharan Santhanam, Alexander Jung, Gaulthier Gain, Cyril Soldani, Costin Lupu, Stefan Teodorescu, Costi Raducanu, Cristian Banu, Laurent Mathy, Razvan Deaconescu, Costin Raiciu, and Felipe Huici. 2021. Unikraft: fast, specialized unikernels the easy way. In Proceedings of the 2021 EuroSys Conference. 376–394.
  44. [44] Collin Lee, Seo Jin Park, Ankita Kejriwal, Satoshi Matsushita, and John K. Ousterhout. 2015. Implementing linearizability at large scale and low latency. In Proceedings of the 25th ACM Symposium on Operating Systems Principles (SOSP). 71–86.
  45. [45] Yuanlong Li, Atri Bhattacharyya, Madhur Kumar, Abhishek Bhattacharjee, Yoav Etsion, Babak Falsafi, Sanidhya Kashyap, and Mathias Payer. 2025. Single-Address-Space FaaS with Jord. In Proceedings of the 52nd International Symposium on Computer Architecture (ISCA). 694–707.
  46. [46] Yunzhuo Liu, Junchen Guo, Bo Jiang, Yang Song, Pengyu Zhang, Rong Wen, Biao Lyu, Shunmin Zhu, and Xinbing Wang. 2025. FastIOV: Fast Startup of Passthrough Network I/O Virtualization for Secure Containers. In Proceedings of the 2025 EuroSys Conference. 720–735.
  47. [47] Artemiy Margaritov, Dmitrii Ustiugov, Amna Shahab, and Boris Grot.
  48. [48] PTEMagnet: fine-grained physical memory reservation for faster page walks in public clouds. In Proceedings of the 26th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-XXVI). 211–223.
  49. [49] Microsoft. 2026. Azure Blob Storage. https://azure.microsoft.com/en-us/products/storage/blobs/ Accessed: 2026-03-11.
  50. [50] Microsoft Azure. [n. d.]. Azure Public Dataset: Azure LLM Inference Trace 2023. Available at https://github.com/Azure/AzurePublicDataset/blob/master/AzureLLMInferenceDataset2023.md
  51. [51] MinIO. [n. d.]. MinIO: High Performance Object Storage. Available at https://min.io/
  52. [52] Djob Mvondo, Mathieu Bacou, Kevin Nguetchouang, Lucien Ngale, Stéphane Pouget, Josiane Kouam, Renaud Lachaize, Jinho Hwang, Tim Wood, Daniel Hagimont, Noël De Palma, Bernabé Batchakui, and Alain Tchana. 2021. OFC: an opportunistic caching system for FaaS platforms. In Proceedings of the 2021 EuroSys Conference. 228–244.
  53. [53] Shixiong Qi, Songyu Zhang, K. K. Ramakrishnan, Diman Zad Tootaghaj, Hardik Soni, and Puneet Sharma. 2025. Palladium: A DPU-enabled Multi-Tenant Serverless Cloud over Zero-copy Multi-node RDMA Fabrics. In Proceedings of the ACM SIGCOMM 2025 Conference. 1257–1259.
  54. [54] Alessandro Randazzo and Ilenia Tinnirello. 2019. Kata Containers: An Emerging Architecture for Enabling MEC Services in Fast and Secure Way. In Sixth International Conference on Internet of Things: Systems, Management and Security. 209–214.
  55. [55] Francisco Romero, Gohar Irfan Chaudhry, Iñigo Goiri, Pragna Gopa, Paul Batum, Neeraja J. Yadwadkar, Rodrigo Fonseca, Christos Kozyrakis, and Ricardo Bianchini. 2021. Faa$T: A Transparent Auto-Scaling Cache for Serverless Applications. In Proceedings of the 2021 ACM Symposium on Cloud Computing (SOCC). 122–137.
  56. [56] Mohammad Shahrad, Rodrigo Fonseca, Iñigo Goiri, Gohar Irfan Chaudhry, Paul Batum, Jason Cooke, Eduardo Laureano, Colby Tresness, Mark Russinovich, and Ricardo Bianchini. 2020. Serverless in the Wild: Characterizing and Optimizing the Serverless Workload at a Large Cloud Provider. In Proceedings of the 2020 USENIX Annual Technical Conference (ATC). 205–218.
  57. [57] Yizhou Shan, Yutong Huang, Yilun Chen, and Yiying Zhang. 2018. LegoOS: A Disseminated, Distributed OS for Hardware Resource Disaggregation. In Proceedings of the 13th Symposium on Operating System Design and Implementation (OSDI). 69–87.
  58. [58] Zhiming Shen, Zhen Sun, Gur-Eyal Sela, Eugene Bagdasaryan, Christina Delimitrou, Robbert van Renesse, and Hakim Weatherspoon.
  59. [59] X-Containers: Breaking Down Barriers to Improve Performance and Isolation of Cloud-Native Containers. In Proceedings of the 24th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-XXIV). 121–135.
  60. [60] Simon Shillaker and Peter R. Pietzuch. 2020. Faasm: Lightweight Isolation for Efficient Stateful Serverless Computing. In Proceedings of the 2020 USENIX Annual Technical Conference (ATC). 419–433.
  61. [61] Vikram Sreekanti, Chenggang Wu, Xiayue Charles Lin, Johann Schleier-Smith, Joseph Gonzalez, Joseph M. Hellerstein, and Alexey Tumanov. 2020. Cloudburst: Stateful Functions-as-a-Service. Proc. VLDB Endow. 13, 11 (2020), 2438–2452.
  62. [62] Foteini Strati, Xianzhe Ma, and Ana Klimovic. 2024. Orion: Interference-aware, Fine-grained GPU Sharing for ML Applications. In Proceedings of the 2024 EuroSys Conference. 1075–1092.
  63. [63] Dmitrii Ustiugov, Dohyun Park, Lazar Cvetkovic, Mihajlo Djokic, Hongyu Hè, Boris Grot, and Ana Klimovic. 2023. Enabling In-Vitro Serverless Systems Research. In Proceedings of the 4th Workshop on Resource Disaggregation and Serverless. 1–7.
  64. [64] Dmitrii Ustiugov, Plamen Petrov, Marios Kogias, Edouard Bugnion, and Boris Grot. 2021. Benchmarking, analysis, and optimization of serverless function snapshots. In Proceedings of the 26th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-XXVI). 559–572.
  65. [65] Nicholas C. Wanninger, Joshua J. Bowden, Kirtankumar Shetty, Ayush Garg, and Kyle C. Hale. 2022. Isolating functions at the hardware limit with virtines. In Proceedings of the 2022 EuroSys Conference. 644–662.
  66. [66] Robert N. M. Watson, Jonathan Woodruff, Peter G. Neumann, Simon W. Moore, Jonathan Anderson, David Chisnall, Nirav H. Dave, Brooks Davis, Khilan Gudka, Ben Laurie, Steven J. Murdoch, Robert M. Norton, Michael Roe, Stacey D. Son, and Munraj Vadera. 2015. CHERI: A Hybrid Capability-System Architecture for Scalable Software Compartmentalization. In IE…
  67. [67] Jianing You, Kang Chen, Laiping Zhao, Yiming Li, Yichi Chen, Yuxuan Du, Yanjie Wang, Luhang Wen, Keyang Hu, and Keqiu Li. 2025. AlloyStack: A Library Operating System for Serverless Workflow Applications. In Proceedings of the 2025 EuroSys Conference. 921–937.
  68. [68] Hangchen Yu, Arthur Michener Peters, Amogh Akshintala, and Christopher J. Rossbach. 2020. AvA: Accelerated Virtualization of Accelerators. In Proceedings of the 25th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-XXV). 807–825.
  69. [69] Zirui Neil Zhao, Adam Morrison, Christopher W. Fletcher, and Josep Torrellas. 2024. Everywhere All at Once: Co-Location Attacks on Public Cloud FaaS. In ASPLOS (1). 133–149.