pith. machine review for the scientific record.

arxiv: 2605.04310 · v1 · submitted 2026-05-05 · 💻 cs.DC

Recognition: unknown

ClusterLess: Deadline-Aware Serverless Workflow Orchestration on Federated Edge Clusters

Ilir Murturi, Mario Colosi, Massimo Villari, Radu Prodan, Reza Farahani, Schahram Dustdar, Stefan Nastic


Pith reviewed 2026-05-08 17:11 UTC · model grok-4.3

classification 💻 cs.DC
keywords: serverless · edge computing · workflow orchestration · federated clusters · deadline awareness · Kubernetes · distributed systems

The pith

ClusterLess orchestrates concurrent serverless workflows across federated edge Kubernetes clusters to meet strict end-to-end deadlines.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces ClusterLess to handle serverless workflows that must run concurrently across multiple edge clusters under tight end-to-end timing constraints. Single-cluster orchestration works locally but fails to coordinate placement and execution modes when workflows span federated sites. ClusterLess adds a leader-selected super-master layer on top of intra-cluster management to analyze dependencies and decide where each function runs. Experiments on a six-cluster, 64-node testbed across 18 workload configurations demonstrate faster completion and higher deadline success than four baseline approaches.

Core claim

ClusterLess manages the E2E lifecycle of workflow execution, including dependency analysis, execution mode selection, and resource-aware placement. It integrates structured intra-cluster orchestration with a leader-selected, super-master-driven inter-cluster coordination layer, determining where and how each workflow function should be executed across the federated edge clusters.
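The placement step can be illustrated with a small sketch. This is not the authors' code: it assumes a simplified model in which a function with CPU demand Cf and memory demand Mf is resource-feasible on a worker only if Cf ≤ Cn(t) and Mf ≤ Mn(t) (the feasibility condition the paper states near Figure 1), and it greedily assigns each function, in dependency order, to the first feasible worker. ClusterLess's actual selection is richer (cost-, bandwidth-, and deadline-aware); the helper names and data shapes here are illustrative.

```python
from graphlib import TopologicalSorter

# Hypothetical sketch of dependency analysis + resource-aware placement.
# Feasibility mirrors the paper's condition: C_f <= C_n(t) and M_f <= M_n(t).

def place_workflow(deps, demands, workers):
    """deps: {fn: set of predecessor fns}; demands: {fn: (cpu, mem)};
    workers: {node: [free_cpu, free_mem]}. Returns {fn: node}."""
    placement = {}
    # static_order() yields functions with all predecessors first.
    for fn in TopologicalSorter(deps).static_order():
        cpu, mem = demands[fn]
        # First worker with enough free CPU and memory (greedy stand-in
        # for ClusterLess's cost- and deadline-aware selection).
        node = next((n for n, (c, m) in workers.items()
                     if cpu <= c and mem <= m), None)
        if node is None:
            raise RuntimeError(f"no feasible worker for {fn}")
        workers[node][0] -= cpu
        workers[node][1] -= mem
        placement[fn] = node
    return placement

deps = {"extract": set(), "transform": {"extract"}, "load": {"transform"}}
demands = {"extract": (1.0, 256), "transform": (2.0, 512), "load": (0.5, 128)}
workers = {"z1": [2.0, 512], "z2": [4.0, 1024]}
print(place_workflow(deps, demands, workers))
# → {'extract': 'z1', 'transform': 'z2', 'load': 'z1'}
```

Note how "transform" spills over to z2 once z1 lacks free CPU: this is the single-cluster analogue of the offloading decision ClusterLess makes across clusters.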

What carries the argument

The leader-selected super-master inter-cluster coordination layer, which combines with local intra-cluster orchestration to select execution modes and place functions in a resource-aware way across federated clusters.
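The leader-selection idea can be sketched as follows. The paper fixes an admissible load threshold of l = 0.75 and hands the super-master role to another cluster's master once the current host cluster reaches that load (Figure 4); the specific rule below, keeping the current super-master while eligible and otherwise electing the least-loaded eligible master, is an assumption for illustration, not the authors' exact algorithm.

```python
# Hedged sketch of super-master leader selection among cluster masters.
# The threshold l = 0.75 matches the paper's admissible load; the
# least-loaded tie-breaking rule is an illustrative assumption.

ADMISSIBLE_LOAD = 0.75  # the paper's admissible load threshold l

def elect_super_master(cluster_loads, current=None):
    """cluster_loads: {cluster_id: load in [0, 1]}. Keep the current
    super-master while its cluster stays under the threshold; otherwise
    hand the role to the least-loaded eligible master."""
    if current is not None and cluster_loads.get(current, 1.0) < ADMISSIBLE_LOAD:
        return current
    eligible = {c: l for c, l in cluster_loads.items() if l < ADMISSIBLE_LOAD}
    if not eligible:
        # Every master is saturated: only intra-cluster orchestration runs.
        return None
    return min(eligible, key=eligible.get)

loads = {"C1": 0.78, "C2": 0.40, "C3": 0.55}
print(elect_super_master(loads, current="C1"))  # C1 over threshold → "C2"
```

This mirrors the behavior the paper reports for Figure 4: once C1 crosses the threshold, inter-cluster orchestration moves off its master so it does not compete with local orchestration for control-plane resources.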

If this is right

  • Workflow completion times drop by up to 40% relative to the four baseline methods.
  • Deadline satisfaction rises from below 50% to over 90% across the tested configurations.
  • Any remaining deadline violations stay limited to single-digit seconds.
  • The gains appear consistently for varying input sizes and deadline classes under concurrent load.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The hybrid local-plus-global decision structure may scale to other container-orchestration platforms beyond the OpenFaaS and Argo stack used here.
  • Leader election overhead could become noticeable in much larger federations or with frequent cluster membership changes.
  • Similar coordination patterns could apply to deadline-driven workflows in fog or multi-cloud serverless settings.
  • Dynamic addition of clusters or nodes would require explicit extensions to the current placement logic.

Load-bearing premise

The six-cluster 64-node testbed and the chosen concurrent workload patterns sufficiently represent real federated multi-edge environments under strict end-to-end deadlines.

What would settle it

Running the same concurrent workflows on a federation of 20 or more clusters with greater network latency variation and checking whether the reported gains in completion time and deadline satisfaction still hold.

Figures

Figures reproduced from arXiv: 2605.04310 by Ilir Murturi, Mario Colosi, Massimo Villari, Radu Prodan, Reza Farahani, Schahram Dustdar, Stefan Nastic.

Figure 1: ClusterLess system architecture.
Figure 2: Case study serverless workflows.
Figure 3: Workload composition and temporal arrival behavior across clusters for different arrival rates.
Figure 4: Super-master behavior under increasing cluster load.
Figure 6: Average workflow completion time per cluster under different arrival regimes.
Figure 7: Deadline violation behavior across strategies and arrival rates.
Figure 8: Deadline satisfaction across strategies and arrival rates under different workload constraints.
Figure 9: Offloading behavior and execution-mode selection under different arrival regimes.
Figure 10: Per-cluster CPU utilization in different methods.
Original abstract

The recent convergence of edge computing, serverless execution, and Kubernetes (K8s)-based container orchestration has enabled the processing of application workflows close to data sources. While effective within a single edge cluster, existing schemes do not generalize to federated multi-edge environments, where multiple workflows execute concurrently under strict end-to-end (E2E) deadline constraints. This paper introduces ClusterLess, a deadline-aware serverless workflow orchestration method for federated multi-edge K8s clusters. ClusterLess manages the E2E lifecycle of workflow execution, including dependency analysis, execution mode selection, and resource-aware placement. To this end, it integrates structured intra-cluster orchestration with a leader-selected, super-master-driven inter-cluster coordination layer, determining where and how each workflow function should be executed across the federated edge clusters. We implement ClusterLess using OpenFaaS as the serverless execution substrate and Argo for workflow management, and deploy it on a realistic testbed of six edge clusters comprising 64 heterogeneous edge nodes. Experimental results with concurrent serverless workflows, spanning 18 workload configurations across different input sizes and deadline classes, show that ClusterLess reduces workflow completion time by up to 40%, increases deadline satisfaction from below 50% to over 90%, and confines deadline violations to single-digit seconds compared to four baseline methods.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper introduces ClusterLess, a deadline-aware serverless workflow orchestration system for federated multi-edge Kubernetes clusters. It combines intra-cluster orchestration with a leader-selected super-master for inter-cluster coordination, handling dependency analysis, execution mode selection, and resource-aware placement. Implemented on OpenFaaS and Argo, it is evaluated on a realistic 6-cluster 64-node heterogeneous testbed across 18 workload configurations with varying input sizes and deadline classes, reporting up to 40% reduction in workflow completion time, deadline satisfaction increasing from below 50% to over 90%, and deadline violations confined to single-digit seconds compared to four baselines.

Significance. If the results hold, the work is significant for edge computing as it provides a practical, implemented solution for concurrent serverless workflows under strict E2E deadlines in federated settings, a gap not addressed by single-cluster schemes. The direct measurements on a heterogeneous testbed without fitted parameters or post-hoc exclusions, spanning multiple input sizes and deadline classes, offer concrete, reproducible evidence of gains over baselines. This strengthens the case for super-master coordination in real deployments.

major comments (1)
  1. [Evaluation: testbed and workload description] The central performance claims (40% completion-time reduction, >90% deadline satisfaction) rest on a 6-cluster/64-node testbed and 18 synthetic workloads. To support the broader assertion for real-world federated multi-edge environments, the paper must provide explicit analysis of how the setup models inter-cluster network variability, dynamic node heterogeneity, and complex concurrent workflow dependencies; without this, generalization beyond the controlled testbed remains a load-bearing concern.
minor comments (2)
  1. [Abstract] Inconsistent use of 'K8s' and 'Kubernetes'; standardize terminology for clarity.
  2. Consider adding a dedicated limitations or threats-to-validity subsection discussing testbed scale and workload representativeness.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the constructive feedback and the recommendation of minor revision. We address the major comment on the evaluation section below by expanding the manuscript with explicit analysis of the testbed modeling choices.

Point-by-point responses
  1. Referee: [Evaluation: testbed and workload description] The central performance claims (40% completion-time reduction, >90% deadline satisfaction) rest on a 6-cluster/64-node testbed and 18 synthetic workloads. To support the broader assertion for real-world federated multi-edge environments, the paper must provide explicit analysis of how the setup models inter-cluster network variability, dynamic node heterogeneity, and complex concurrent workflow dependencies; without this, generalization beyond the controlled testbed remains a load-bearing concern.

    Authors: We agree that additional explicit analysis would strengthen the paper's support for generalization. The original manuscript described the testbed as realistic and heterogeneous but did not dedicate space to detailing the modeling of the three aspects raised. In the revised manuscript, we have added a dedicated paragraph in the Evaluation section (under testbed description) that explicitly explains: (1) inter-cluster network variability is modeled via direct measurements of latency and bandwidth between the six clusters using standard tools on the physical federated setup; (2) dynamic node heterogeneity is captured by deploying on 64 real edge nodes with documented variations in CPU cores, memory, and network interfaces across clusters, without any synthetic fitting; and (3) complex concurrent workflow dependencies are handled by running all 18 workload configurations with simultaneous execution of multiple workflows, shared resource contention, and varying input sizes/deadline classes. These additions are based on the actual experimental configuration and do not change any reported results. We believe this directly addresses the concern while preserving the paper's focus.

    Revision: yes

Circularity Check

0 steps flagged

No circularity: empirical results from direct testbed implementation

full rationale

The paper introduces ClusterLess as an orchestration method integrating intra-cluster and inter-cluster coordination for serverless workflows, implemented on OpenFaaS and Argo, then evaluated via direct measurement on a 6-cluster 64-node testbed across 18 workload configurations. All performance claims (completion time, deadline satisfaction) derive from these concrete runs rather than any equations, fitted parameters, predictions, or self-referential derivations. No self-citation chains, ansatzes, or uniqueness theorems are invoked as load-bearing steps; the evaluation is self-contained and externally replicable.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The paper introduces an engineering system rather than a mathematical derivation, so the ledger contains domain assumptions about cluster coordination and workload representativeness rather than free parameters or invented physical entities.

axioms (2)
  • domain assumption Federated edge clusters can be coordinated via a leader-selected super-master layer with acceptable communication overhead.
    Invoked in the description of the inter-cluster coordination layer.
  • domain assumption The chosen workload patterns and deadline classes are representative of real concurrent serverless applications.
    Underlies the claim that results generalize beyond the 18 tested configurations.
invented entities (1)
  • ClusterLess orchestration system (no independent evidence)
    purpose: To manage E2E lifecycle including dependency analysis, mode selection, and placement across federated clusters.
    The novel proposed software artifact.

pith-pipeline@v0.9.0 · 5559 in / 1430 out tokens · 85748 ms · 2026-05-08T17:11:49.281654+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

25 extracted references · 2 canonical work pages

  1. [1] "Gartner." https://www.gartner.com/en/newsroom/press-releases/2023-10-30-gartner-says-50-percent-of-critical-enterprise-applications-will-reside-outside-of-centralized-public-cloud-locations-through-2027
  2. [2] A. Joosen et al., "How Does it Function? Characterizing Long-Term Trends in Production Serverless Workloads," in Proc. of the 2023 ACM Symp. on Cloud Computing, 2023.
  3. [3] R. Farahani et al., "Serverless Workflow Management on the Computing Continuum: A Mini-Survey," in 15th ACM/SPEC Intl. Conf. on Performance Engineering, 2024.
  4. [4] S. K. Mondal et al., "Kubernetes in IT Administration and Serverless Computing: An Empirical Study and Research Challenges," The Journal of Supercomputing, 2022.
  5. [5] M. S. Aslanpour et al., "FaasHouse: Sustainable Serverless Edge Computing through Energy-Aware Resource Scheduling," IEEE Trans. on Services Computing, 2024.
  6. [6] C. Carrión, "Kubernetes Scheduling: Taxonomy, Ongoing Issues and Challenges," ACM Computing Surveys, 2022.
  7. [7] R. Farahani and R. Prodan, "EnergyLess: An Energy-Aware Serverless Workflow Batch Orchestration on the Computing Continuum," in IEEE Intl. Conf. on Cloud Computing, IEEE, 2025.
  8. [8] H. Shafiei et al., "Serverless Computing: A Survey of Opportunities, Challenges, and Applications," ACM Computing Surveys, 2022.
  9. [9] S. S. Gill et al., "Modern Computing: Vision and Challenges," Telematics and Informatics Reports, 2024.
  10. [10] R. Farahani et al., "Heftless: A Bi-Objective Serverless Workflow Batch Orchestration on the Computing Continuum," in IEEE Intl. Conf. on Cluster Computing, IEEE, 2024.
  11. [11] M. Michalke et al., "Evaluating the Impact of Inter-cluster Communications in Edge Computing," in IEEE Network Operations and Management Symp., IEEE, 2025.
  12. [12] D. Bachar et al., "Optimizing Service Selection and Load Balancing in Multi-Cluster Microservice Systems with MCOSS," in 2023 IFIP Networking Conf., IEEE, 2023.
  13. [13] L. Poggiani et al., "Live Migration of Multi-Container Kubernetes Pods in Multi-Cluster Serverless Edge Systems," in Proc. of the 1st Workshop on Serverless at the Edge, 2024.
  14. [14] H. Park et al., "HEART: Heterogeneous-Aware Traffic Allocation in Multi-Replica Deployments on Kubernetes," in 2025 IEEE 18th Intl. Conf. on Cloud Computing, IEEE, 2025.
  15. [15] "Karmada." Accessed 2025-01-10.
  16. [16] D. Balla et al., "Open Source FaaS Performance Aspects," in 2020 43rd Intl. Conf. on Telecommunications and Signal Processing, IEEE, 2020.
  17. [17] P.-M. Lin and A. Glikson, "Mitigating Cold Starts in Serverless Platforms: A Pool-based Approach," arXiv preprint arXiv:1903.12221, 2019.
  18. [18] L. Cvetković et al., "Dirigent: Lightweight Serverless Orchestration," in Proc. of the ACM SIGOPS Symp. on Operating Systems Principles, 2024.
  19. [19] E. Simion et al., "Towards Seamless Serverless Computing Across an Edge-Cloud Continuum," in Proc. of the IEEE/ACM 16th Intl. Conf. on Utility and Cloud Computing, 2023.
  20. [20] P. G. López et al., "Triggerflow: Trigger-based Orchestration of Serverless Workflows," in Proc. of the 14th ACM Intl. Conf. on Distributed and Event-Based Systems, 2020.
  21. [21] J. Serenari et al., "GreenWhisk: Emission-Aware Computing for Serverless Platform," in IEEE Intl. Conf. on Cloud Engineering, IEEE, 2024.
  22. [22] D. Raca et al., "Beyond Throughput: A 4G LTE Dataset with Channel and Context Metrics," in Proc. of the 9th ACM Multimedia Systems Conference, 2018. Available at: https://zenodo.org/records/1219679
  23. [23] S. Eismann et al., "Predicting the Costs of Serverless Workflows," in Proc. of the 2020 ACM/SPEC International Conf. on Performance Engineering, 2020.
  24. [24] "Regression Tuning Workflow." Available at: https://github.com/jacopotagliabue/no-ops-machine-learning
  25. [25] L. Cherkasova and M. Gupta, "Analysis of Enterprise Media Server Workloads: Access Patterns, Locality, Content Evolution, and Rates of Change," IEEE/ACM Trans. on Networking, 2004.