pith. machine review for the scientific record.

arxiv: 2605.04310 · v1 · submitted 2026-05-05 · 💻 cs.DC

Recognition: unknown

ClusterLess: Deadline-Aware Serverless Workflow Orchestration on Federated Edge Clusters

Ilir Murturi, Mario Colosi, Massimo Villari, Radu Prodan, Reza Farahani, Schahram Dustdar, Stefan Nastic


Pith reviewed 2026-05-08 17:11 UTC · model grok-4.3

classification 💻 cs.DC
keywords: serverless · edge computing · workflow orchestration · federated clusters · deadline awareness · Kubernetes · distributed systems

The pith

ClusterLess orchestrates concurrent serverless workflows across federated edge Kubernetes clusters to meet strict end-to-end deadlines.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces ClusterLess to handle serverless workflows that must run concurrently across multiple edge clusters under tight end-to-end timing constraints. Single-cluster orchestration works locally but fails to coordinate placement and execution modes when workflows span federated sites. ClusterLess adds a leader-selected super-master layer on top of intra-cluster management to analyze dependencies and decide where each function runs. Experiments on a six-cluster, 64-node testbed across 18 workload configurations demonstrate faster completion and higher deadline success than four baseline approaches.

Core claim

ClusterLess manages the E2E lifecycle of workflow execution, including dependency analysis, execution mode selection, and resource-aware placement. It integrates structured intra-cluster orchestration with a leader-selected, super-master-driven inter-cluster coordination layer, determining where and how each workflow function should be executed across the federated edge clusters.
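The placement step can be illustrated with a small sketch. This is not the authors' code: it assumes a simplified model in which a function with CPU demand Cf and memory demand Mf is resource-feasible on a worker only if Cf ≤ Cn(t) and Mf ≤ Mn(t) (the feasibility condition the paper states near Figure 1), and it greedily assigns each function, in dependency order, to the first feasible worker. ClusterLess's actual selection is richer (cost-, bandwidth-, and deadline-aware); the helper names and data shapes here are illustrative.

```python
from graphlib import TopologicalSorter

# Hypothetical sketch of dependency analysis + resource-aware placement.
# Feasibility mirrors the paper's condition: C_f <= C_n(t) and M_f <= M_n(t).

def place_workflow(deps, demands, workers):
    """deps: {fn: set of predecessor fns}; demands: {fn: (cpu, mem)};
    workers: {node: [free_cpu, free_mem]}. Returns {fn: node}."""
    placement = {}
    # static_order() yields functions with all predecessors first.
    for fn in TopologicalSorter(deps).static_order():
        cpu, mem = demands[fn]
        # First worker with enough free CPU and memory (greedy stand-in
        # for ClusterLess's cost- and deadline-aware selection).
        node = next((n for n, (c, m) in workers.items()
                     if cpu <= c and mem <= m), None)
        if node is None:
            raise RuntimeError(f"no feasible worker for {fn}")
        workers[node][0] -= cpu
        workers[node][1] -= mem
        placement[fn] = node
    return placement

deps = {"extract": set(), "transform": {"extract"}, "load": {"transform"}}
demands = {"extract": (1.0, 256), "transform": (2.0, 512), "load": (0.5, 128)}
workers = {"z1": [2.0, 512], "z2": [4.0, 1024]}
print(place_workflow(deps, demands, workers))
# → {'extract': 'z1', 'transform': 'z2', 'load': 'z1'}
```

Note how "transform" spills over to z2 once z1 lacks free CPU: this is the single-cluster analogue of the offloading decision ClusterLess makes across clusters.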

What carries the argument

The leader-selected super-master inter-cluster coordination layer, which combines with local intra-cluster orchestration to select execution modes and place functions in a resource-aware way across federated clusters.
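The leader-selection idea can be sketched as follows. The paper fixes an admissible load threshold of l = 0.75 and hands the super-master role to another cluster's master once the current host cluster reaches that load (Figure 4); the specific rule below, keeping the current super-master while eligible and otherwise electing the least-loaded eligible master, is an assumption for illustration, not the authors' exact algorithm.

```python
# Hedged sketch of super-master leader selection among cluster masters.
# The threshold l = 0.75 matches the paper's admissible load; the
# least-loaded tie-breaking rule is an illustrative assumption.

ADMISSIBLE_LOAD = 0.75  # the paper's admissible load threshold l

def elect_super_master(cluster_loads, current=None):
    """cluster_loads: {cluster_id: load in [0, 1]}. Keep the current
    super-master while its cluster stays under the threshold; otherwise
    hand the role to the least-loaded eligible master."""
    if current is not None and cluster_loads.get(current, 1.0) < ADMISSIBLE_LOAD:
        return current
    eligible = {c: l for c, l in cluster_loads.items() if l < ADMISSIBLE_LOAD}
    if not eligible:
        # Every master is saturated: only intra-cluster orchestration runs.
        return None
    return min(eligible, key=eligible.get)

loads = {"C1": 0.78, "C2": 0.40, "C3": 0.55}
print(elect_super_master(loads, current="C1"))  # C1 over threshold → "C2"
```

This mirrors the behavior the paper reports for Figure 4: once C1 crosses the threshold, inter-cluster orchestration moves off its master so it does not compete with local orchestration for control-plane resources.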

If this is right

  • Workflow completion times drop by up to 40% relative to the four baseline methods.
  • Deadline satisfaction rises from below 50% to over 90% across the tested configurations.
  • Any remaining deadline violations stay limited to single-digit seconds.
  • The gains appear consistently for varying input sizes and deadline classes under concurrent load.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The hybrid local-plus-global decision structure may scale to other container-orchestration platforms beyond the OpenFaaS and Argo stack used here.
  • Leader election overhead could become noticeable in much larger federations or with frequent cluster membership changes.
  • Similar coordination patterns could apply to deadline-driven workflows in fog or multi-cloud serverless settings.
  • Dynamic addition of clusters or nodes would require explicit extensions to the current placement logic.

Load-bearing premise

The six-cluster 64-node testbed and the chosen concurrent workload patterns sufficiently represent real federated multi-edge environments under strict end-to-end deadlines.

What would settle it

Running the same concurrent workflows on a federation of 20 or more clusters with greater network latency variation and checking whether the reported gains in completion time and deadline satisfaction still hold.

Figures

Figures reproduced from arXiv: 2605.04310 by Ilir Murturi, Mario Colosi, Massimo Villari, Radu Prodan, Reza Farahani, Schahram Dustdar, Stefan Nastic.

Figure 1: ClusterLess system architecture.
Figure 2: Case study serverless workflows.
Figure 3: Workload composition and temporal arrival behavior across clusters for different arrival rates.
Figure 4: Super-master behavior under increasing cluster load.
Figure 6: Average workflow completion time per cluster under different arrival regimes.
Figure 7: Deadline violation behavior across strategies and arrival rates.
Figure 8: Deadline satisfaction across strategies and arrival rates under different workload constraints.
Figure 9: Offloading behavior and execution-mode selection under different arrival regimes.
Figure 10: Per-cluster CPU utilization in different methods.
Original abstract

The recent convergence of edge computing, serverless execution, and Kubernetes (K8s)-based container orchestration has enabled the processing of application workflows close to data sources. While effective within a single edge cluster, existing schemes do not generalize to federated multi-edge environments, where multiple workflows execute concurrently under strict end-to-end (E2E) deadline constraints. This paper introduces ClusterLess, a deadline-aware serverless workflow orchestration method for federated multi-edge K8s clusters. ClusterLess manages the E2E lifecycle of workflow execution, including dependency analysis, execution mode selection, and resource-aware placement. To this end, it integrates structured intra-cluster orchestration with a leader-selected, super-master-driven inter-cluster coordination layer, determining where and how each workflow function should be executed across the federated edge clusters. We implement ClusterLess using OpenFaaS as the serverless execution substrate and Argo for workflow management, and deploy it on a realistic testbed of six edge clusters comprising 64 heterogeneous edge nodes. Experimental results with concurrent serverless workflows, spanning 18 workload configurations across different input sizes and deadline classes, show that ClusterLess reduces workflow completion time by up to 40%, increases deadline satisfaction from below 50% to over 90%, and confines deadline violations to single-digit seconds compared to four baseline methods.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper introduces ClusterLess, a deadline-aware serverless workflow orchestration system for federated multi-edge Kubernetes clusters. It combines intra-cluster orchestration with a leader-selected super-master for inter-cluster coordination, handling dependency analysis, execution mode selection, and resource-aware placement. Implemented on OpenFaaS and Argo, it is evaluated on a realistic 6-cluster 64-node heterogeneous testbed across 18 workload configurations with varying input sizes and deadline classes, reporting up to 40% reduction in workflow completion time, deadline satisfaction increasing from below 50% to over 90%, and deadline violations confined to single-digit seconds compared to four baselines.

Significance. If the results hold, the work is significant for edge computing as it provides a practical, implemented solution for concurrent serverless workflows under strict E2E deadlines in federated settings, a gap not addressed by single-cluster schemes. The direct measurements on a heterogeneous testbed without fitted parameters or post-hoc exclusions, spanning multiple input sizes and deadline classes, offer concrete, reproducible evidence of gains over baselines. This strengthens the case for super-master coordination in real deployments.

major comments (1)
  1. [Evaluation: testbed and workload description] The central performance claims (40% completion-time reduction, >90% deadline satisfaction) rest on a 6-cluster/64-node testbed and 18 synthetic workloads. To support the broader assertion for real-world federated multi-edge environments, the paper must provide explicit analysis of how the setup models inter-cluster network variability, dynamic node heterogeneity, and complex concurrent workflow dependencies; without this, generalization beyond the controlled testbed remains a load-bearing concern.
minor comments (2)
  1. [Abstract] Inconsistent use of 'K8s' and 'Kubernetes'; standardize terminology for clarity.
  2. Consider adding a dedicated limitations or threats-to-validity subsection discussing testbed scale and workload representativeness.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the constructive feedback and the recommendation of minor revision. We address the major comment on the evaluation section below by expanding the manuscript with explicit analysis of the testbed modeling choices.

Point-by-point responses
  1. Referee: [Evaluation: testbed and workload description] The central performance claims (40% completion-time reduction, >90% deadline satisfaction) rest on a 6-cluster/64-node testbed and 18 synthetic workloads. To support the broader assertion for real-world federated multi-edge environments, the paper must provide explicit analysis of how the setup models inter-cluster network variability, dynamic node heterogeneity, and complex concurrent workflow dependencies; without this, generalization beyond the controlled testbed remains a load-bearing concern.

    Authors: We agree that additional explicit analysis would strengthen the paper's support for generalization. The original manuscript described the testbed as realistic and heterogeneous but did not dedicate space to detailing the modeling of the three aspects raised. In the revised manuscript, we have added a dedicated paragraph in the Evaluation section (under testbed description) that explicitly explains: (1) inter-cluster network variability is modeled via direct measurements of latency and bandwidth between the six clusters using standard tools on the physical federated setup; (2) dynamic node heterogeneity is captured by deploying on 64 real edge nodes with documented variations in CPU cores, memory, and network interfaces across clusters, without any synthetic fitting; and (3) complex concurrent workflow dependencies are handled by running all 18 workload configurations with simultaneous execution of multiple workflows, shared resource contention, and varying input sizes/deadline classes. These additions are based on the actual experimental configuration and do not change any reported results. We believe this directly addresses the concern while preserving the paper's focus.

    Revision: yes

Circularity Check

0 steps flagged

No circularity: empirical results from direct testbed implementation

full rationale

The paper introduces ClusterLess as an orchestration method integrating intra-cluster and inter-cluster coordination for serverless workflows, implemented on OpenFaaS and Argo, then evaluated via direct measurement on a 6-cluster 64-node testbed across 18 workload configurations. All performance claims (completion time, deadline satisfaction) derive from these concrete runs rather than any equations, fitted parameters, predictions, or self-referential derivations. No self-citation chains, ansatzes, or uniqueness theorems are invoked as load-bearing steps; the evaluation is self-contained and externally replicable.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The paper introduces an engineering system rather than a mathematical derivation, so the ledger contains domain assumptions about cluster coordination and workload representativeness rather than free parameters or invented physical entities.

axioms (2)
  • domain assumption Federated edge clusters can be coordinated via a leader-selected super-master layer with acceptable communication overhead.
    Invoked in the description of the inter-cluster coordination layer.
  • domain assumption The chosen workload patterns and deadline classes are representative of real concurrent serverless applications.
    Underlies the claim that results generalize beyond the 18 tested configurations.
invented entities (1)
  • ClusterLess orchestration system (no independent evidence)
    purpose: To manage E2E lifecycle including dependency analysis, mode selection, and placement across federated clusters.
    The novel proposed software artifact.

pith-pipeline@v0.9.0 · 5559 in / 1430 out tokens · 85748 ms · 2026-05-08T17:11:49.281654+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

25 extracted references · 2 canonical work pages

  1. [1] "Gartner." https://www.gartner.com/en/newsroom/press-releases/2023-10-30-gartner-says-50-percent-of-critical-enterprise-applications-will-reside-outside-of-centralized-public-cloud-locations-through-2027
  2. [2] A. Joosen et al., "How Does it Function? Characterizing Long-Term Trends in Production Serverless Workloads," in Proc. of the 2023 ACM Symp. on Cloud Computing, 2023.
  3. [3] R. Farahani et al., "Serverless Workflow Management on the Computing Continuum: A Mini-Survey," in 15th ACM/SPEC Intl. Conf. on Performance Engineering, 2024.
  4. [4] S. K. Mondal et al., "Kubernetes in IT Administration and Serverless Computing: An Empirical Study and Research Challenges," The Journal of Supercomputing, 2022.
  5. [5] M. S. Aslanpour et al., "FaasHouse: Sustainable Serverless Edge Computing through Energy-Aware Resource Scheduling," IEEE Trans. on Services Computing, 2024.
  6. [6] C. Carrión, "Kubernetes Scheduling: Taxonomy, Ongoing Issues and Challenges," ACM Computing Surveys, 2022.
  7. [7] R. Farahani and R. Prodan, "EnergyLess: An Energy-Aware Serverless Workflow Batch Orchestration on the Computing Continuum," in IEEE Intl. Conf. on Cloud Computing, IEEE, 2025.
  8. [8] H. Shafiei et al., "Serverless Computing: A Survey of Opportunities, Challenges, and Applications," ACM Computing Surveys, 2022.
  9. [9] S. S. Gill et al., "Modern Computing: Vision and Challenges," Telematics and Informatics Reports, 2024.
  10. [10] R. Farahani et al., "Heftless: A Bi-Objective Serverless Workflow Batch Orchestration on the Computing Continuum," in IEEE Intl. Conf. on Cluster Computing, IEEE, 2024.
  11. [11] M. Michalke et al., "Evaluating the Impact of Inter-cluster Communications in Edge Computing," in IEEE Network Operations and Management Symp., IEEE, 2025.
  12. [12] D. Bachar et al., "Optimizing Service Selection and Load Balancing in Multi-Cluster Microservice Systems with MCOSS," in 2023 IFIP Networking Conf., IEEE, 2023.
  13. [13] L. Poggiani et al., "Live Migration of Multi-Container Kubernetes Pods in Multi-Cluster Serverless Edge Systems," in Proc. of the 1st Workshop on Serverless at the Edge, 2024.
  14. [14] H. Park et al., "HEART: Heterogeneous-Aware Traffic Allocation in Multi-Replica Deployments on Kubernetes," in 2025 IEEE 18th Intl. Conf. on Cloud Computing, IEEE, 2025.
  15. [15] "Karmada." Accessed 2025-01-10.
  16. [16] D. Balla et al., "Open Source FaaS Performance Aspects," in 2020 43rd Intl. Conf. on Telecommunications and Signal Processing, IEEE, 2020.
  17. [17] P.-M. Lin and A. Glikson, "Mitigating Cold Starts in Serverless Platforms: A Pool-based Approach," arXiv preprint arXiv:1903.12221, 2019.
  18. [18] L. Cvetković et al., "Dirigent: Lightweight Serverless Orchestration," in Proc. of the ACM SIGOPS Symp. on Operating Systems Principles, 2024.
  19. [19] E. Simion et al., "Towards Seamless Serverless Computing Across an Edge-Cloud Continuum," in Proc. of the IEEE/ACM 16th Intl. Conf. on Utility and Cloud Computing, 2023.
  20. [20] P. G. López et al., "Triggerflow: Trigger-based Orchestration of Serverless Workflows," in Proc. of the 14th ACM Intl. Conf. on Distributed and Event-Based Systems, 2020.
  21. [21] J. Serenari et al., "GreenWhisk: Emission-Aware Computing for Serverless Platform," in IEEE Intl. Conf. on Cloud Engineering, IEEE, 2024.
  22. [22] D. Raca et al., "Beyond Throughput: A 4G LTE Dataset with Channel and Context Metrics," in Proc. of the 9th ACM Multimedia Systems Conference, 2018. Available at: https://zenodo.org/records/1219679
  23. [23] S. Eismann et al., "Predicting the Costs of Serverless Workflows," in Proc. of the 2020 ACM/SPEC International Conf. on Performance Engineering, 2020.
  24. [24] "Regression Tuning Workflow." Available at: https://github.com/jacopotagliabue/no-ops-machine-learning
  25. [25] L. Cherkasova and M. Gupta, "Analysis of Enterprise Media Server Workloads: Access Patterns, Locality, Content Evolution, and Rates of Change," IEEE/ACM Trans. on Networking, 2004.