Pith · machine review for the scientific record

arxiv: 2605.09787 · v1 · submitted 2026-05-10 · 💻 cs.DC · cs.PF

Recognition: no theorem link

Cloud Performance Decomposition for Long-Term Performance Engineering: A Case Study

Authors on Pith: no claims yet

Pith reviewed 2026-05-12 02:54 UTC · model grok-4.3

classification 💻 cs.DC cs.PF
keywords: cloud performance · time-series decomposition · performance prediction · serverless functions · resource allocation · seasonal patterns · AWS scaling

The pith

Two time-series decomposition techniques reveal obscured trends and seasonal cycles in cloud performance traces, enabling accurate predictions and improved resource allocation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a hybrid and a fully automatic time-series decomposition method to separate intertwined factors in cloud performance data. In a case study with 11 serverless functions, both methods consistently uncover trends and seasonal cycles like weekly and quarterly patterns. These components then support future performance prediction with mean absolute percentage errors of 1.8% and 2.1%, outperforming basic time-series and deep learning methods. The insights also inform resource allocation on AWS, cutting latency variability by more than 60% and maximum latency by 10%, with generalization shown on benchmarks.
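The paper's own hybrid and automatic algorithms are not spelled out in this summary. Purely as an illustration of the decomposition idea (and of the STL baseline the paper shows in Figure 2), the sketch below splits a synthetic 301-day latency trace with a weekly cycle into trend, seasonal, and residual parts using statsmodels; the trace, drift, and amplitudes are hypothetical.

```python
# Illustration only: STL decomposition of a synthetic daily latency trace.
# This is not the paper's hybrid or automatic method, just the general idea
# of separating trend, seasonal, and residual components.
import numpy as np
from statsmodels.tsa.seasonal import STL

rng = np.random.default_rng(0)
days = np.arange(301)                          # ~301 days, as in the paper's Fig. 1
trend = 120 + 0.05 * days                      # hypothetical slow drift (ms)
weekly = 10 * np.sin(2 * np.pi * days / 7)     # hypothetical weekly cycle
latency = trend + weekly + rng.normal(0, 3, days.size)

result = STL(latency, period=7, robust=True).fit()
print("trend (first 3 days):   ", np.round(result.trend[:3], 1))
print("seasonal (first 3 days):", np.round(result.seasonal[:3], 1))
print("residual std:           ", round(float(result.resid.std()), 2))
```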

Core claim

We propose two time-series decomposition techniques for cloud performance engineering: a hybrid/manual method and a fully automatic method. Through a case study of 11 serverless functions, we show that both approaches can successfully and consistently reveal trends and seasonal cycles, such as weekly and quarterly patterns, which are otherwise obscured. As an evaluation and application of the decomposition, we used the decomposed components to predict future performance, yielding mean absolute percentage error (MAPE) values of only 1.8% (hybrid) and 2.1% (automatic), significantly outperforming basic time-series methods and deep learning. We further show that decomposition insights can guide practical resource allocation.

What carries the argument

The hybrid/manual and fully automatic time-series decomposition techniques that isolate trend, seasonal, and residual components in performance traces.
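The automatic method's outputs are described as IMFs (Figure 10), which suggests an EMD-style procedure like the one illustrated in Figure 3. The paper's exact algorithm is not reproduced here; the sketch below shows one simplified sifting pass of classical EMD, interpolating envelopes through the local extrema and subtracting their mean. The guard thresholds and the input trace are assumptions.

```python
# One simplified EMD sifting pass (illustrative, not the paper's automatic
# decomposition): subtract the mean of cubic-spline envelopes fitted through
# the local maxima and minima of the signal.
import numpy as np
from scipy.signal import argrelextrema
from scipy.interpolate import CubicSpline

def sift_once(x):
    t = np.arange(x.size)
    maxima = argrelextrema(x, np.greater)[0]
    minima = argrelextrema(x, np.less)[0]
    if maxima.size < 4 or minima.size < 4:
        return x                                  # too few extrema to build envelopes
    upper = CubicSpline(maxima, x[maxima])(t)     # upper envelope
    lower = CubicSpline(minima, x[minima])(t)     # lower envelope
    return x - (upper + lower) / 2.0              # candidate IMF after one pass

rng = np.random.default_rng(1)
days = np.arange(301)
trace = 120 + 0.05 * days + 10 * np.sin(2 * np.pi * days / 7) + rng.normal(0, 3, days.size)
imf_candidate = sift_once(trace)
print("signal mean:", round(float(trace.mean()), 1),
      "-> candidate IMF mean:", round(float(imf_candidate.mean()), 2))
```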

Load-bearing premise

That the proposed decomposition techniques accurately isolate the underlying performance factors without introducing artifacts or missing intermittent patterns, and that the results from the 11-function case study and AWS generalize to diverse cloud deployments.

What would settle it

A cloud performance trace from another service or provider where the decomposition fails to identify known seasonal patterns, or where prediction errors exceed those of basic time-series methods, would challenge the central claim.
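One way to run such a check on data you control: inject a known weekly cycle into a synthetic trace, decompose it, and ask (a) whether the recovered seasonal component tracks the injected one and (b) whether a decomposition-based forecast actually beats a seasonal-naive baseline on MAPE. The sketch below is a hypothetical harness with STL standing in for the paper's methods; all data and thresholds are made up.

```python
# Hypothetical falsification harness: does a decomposition recover an injected
# weekly cycle, and does the resulting forecast beat a seasonal-naive baseline?
# STL stands in for the paper's hybrid/automatic methods here.
import numpy as np
from statsmodels.tsa.seasonal import STL

def mape(actual, pred):
    return float(np.mean(np.abs((actual - pred) / actual)) * 100)

rng = np.random.default_rng(2)
days = np.arange(364)
injected = 15 * np.sin(2 * np.pi * days / 7)                # known ground-truth seasonality
trace = 200 + 0.03 * days + injected + rng.normal(0, 4, days.size)

train, test = trace[:350], trace[350:]
fit = STL(train, period=7, robust=True).fit()

# (a) Recovery: correlation between the injected and the estimated seasonal component.
recovery = float(np.corrcoef(injected[:350], fit.seasonal)[0, 1])

# (b) Forecast: extrapolate the trend linearly and repeat the last estimated week.
slope = (fit.trend[-1] - fit.trend[-8]) / 7.0
horizon = np.arange(1, test.size + 1)
forecast = fit.trend[-1] + slope * horizon + np.resize(fit.seasonal[-7:], test.size)
naive = np.resize(train[-7:], test.size)                    # seasonal-naive baseline

print(f"seasonal recovery corr: {recovery:.2f}")
print(f"MAPE decomposition: {mape(test, forecast):.1f}%   seasonal-naive: {mape(test, naive):.1f}%")
```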

Figures

Figures reproduced from arXiv: 2605.09787 by Donald Lien, Lori Pollock, Shimul Debnath, Wei Wang, William Hart.

Figure 1: Latency trace of a cloud function over 301 days.
Figure 2: STL decomposition on the trace from Fig. 1.
Figure 3: An illustration of one iteration of EMD decomposition.
Figure 4: The invocation chain of the SAB functions.
Figure 5: Latency traces of SAB functions (the trace of …).
Figure 6: The overall workflow of manual decomposition.
Figure 7: Trend and seasonal variations decomposed from the trace in Fig. 1 using the hybrid decomposition technique.
Figure 8: ACF values of the residuals in Fig. 7(d).
Figure 9: Latency prediction based on the hybrid decomposition.
Figure 10: Components (IMFs) from the automatic decomposition, with comparisons with the hybrid/manual decomposition.
Figure 11: Prediction based on automatic decomposition for …
Figure 12: Hybrid and automatic decomposition results for four SAB applications. Each application has four figures representing …
Figure 13: Predicted daily latency of the remaining 10 SAB traces.
Figure 14: Two-week latency under normal unoptimized vs. decomposition-informed allocation. Gold-asterisk points indicate the …
original abstract

Cloud performance fluctuates due to factors such as resource contention and workload changes. These factors can be short-term, seasonal, or long-term. Their effects are often intertwined in performance traces, making performance management difficult. Prior work on cloud performance engineering used time-series decomposition to separate these factors. However, existing approaches rely on basic decomposition methods that may miss key variation patterns and fail on traces with complex or intermittent patterns, limiting their usefulness across diverse cloud deployments. To address this limitation, we propose two time-series decomposition techniques for cloud performance engineering: a hybrid/manual method and a fully automatic method. Through a case study of 11 serverless functions, we show that both approaches can successfully and consistently reveal trends and seasonal cycles, such as weekly and quarterly patterns, which are otherwise obscured. As an evaluation and application of the decomposition, we used the decomposed components to predict future performance, yielding mean absolute percentage error (MAPE) values of only 1.8% (hybrid) and 2.1% (automatic), significantly outperforming basic time-series methods and deep learning. We further show that decomposition insights can guide practical resource allocation. Using decomposition-informed scaling on AWS, we reduced latency variability by over 60% and maximum latency by 10%. Similar experiments on benchmarks on AWS confirmed that seasonal patterns and performance gains generalize beyond our case study. Notably, our findings demonstrate that even a single performance trace contains rich actionable information for guiding cloud management decisions.
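For reference, the MAPE figures quoted above follow the standard definition, with y_t the observed and ŷ_t the predicted latency over n held-out points:

```latex
\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right|
```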

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes two time-series decomposition techniques (a hybrid/manual method and a fully automatic method) to separate short-term, seasonal, and long-term factors in cloud performance traces, which are often intertwined due to resource contention and workload changes. In a case study of 11 serverless functions, both methods are shown to reveal obscured trends and seasonal cycles such as weekly and quarterly patterns. The decomposed components are then applied to predict future performance, achieving MAPE values of 1.8% (hybrid) and 2.1% (automatic) that outperform basic time-series methods and deep learning. The insights are further used to guide AWS resource scaling, reducing latency variability by over 60% and maximum latency by 10%, with generalization confirmed via additional AWS benchmarks.

Significance. If the decomposition techniques prove accurate in isolating performance factors without introducing artifacts, the work could meaningfully advance long-term performance engineering in cloud and serverless systems by extracting actionable information from individual traces for improved prediction and resource allocation. The empirical AWS results and reported scaling gains provide a practical demonstration of potential impact.

major comments (2)
  1. [Abstract and §5] Abstract and §5 (Evaluation): The reported MAPE values (1.8% hybrid, 2.1% automatic) and claims of outperforming baselines are presented without details on the decomposition algorithms, data preprocessing, or quantitative validation (e.g., tests on synthetic traces with known intermittent patterns or metrics confirming no artifacts). This is load-bearing for the central claim that the methods 'successfully and consistently reveal trends and seasonal cycles' and enable low-error prediction.
  2. [§4] §4 (Case Study): The generalization from 11 serverless functions and AWS benchmarks to 'diverse cloud deployments' with complex patterns rests on visual inspection and prediction MAPE alone; no quantitative assessment (such as recovery error on injected patterns or cross-validation against known events) is provided to confirm faithful isolation of factors.
minor comments (2)
  1. Figure captions and legends in the decomposition and scaling result plots could be expanded to explicitly label components (trend, seasonal, residual) and scaling policies for easier interpretation.
  2. [§5] The paper would benefit from a brief comparison table in §5 summarizing MAPE and variability metrics against all baselines (basic time-series, deep learning, and non-decomposition scaling) for direct readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and indicate where revisions will be made to improve clarity and rigor.

point-by-point responses
  1. Referee: [Abstract and §5] Abstract and §5 (Evaluation): The reported MAPE values (1.8% hybrid, 2.1% automatic) and claims of outperforming baselines are presented without details on the decomposition algorithms, data preprocessing, or quantitative validation (e.g., tests on synthetic traces with known intermittent patterns or metrics confirming no artifacts). This is load-bearing for the central claim that the methods 'successfully and consistently reveal trends and seasonal cycles' and enable low-error prediction.

    Authors: The decomposition algorithms are presented in Section 3, with the hybrid method combining manual pattern identification and automated component fitting, and the automatic method relying on statistical detection of seasonal and trend components. Data preprocessing steps, including normalization and outlier handling for the performance traces, are described briefly but will be expanded with pseudocode and parameter settings in the revised manuscript. Our validation relies on real AWS traces from 11 serverless functions plus additional benchmarks, where low MAPE and scaling gains serve as evidence of effective decomposition without introducing obvious artifacts. We acknowledge that synthetic traces with injected patterns would provide stronger quantitative confirmation of no artifacts; however, such controlled experiments are outside the current scope as our focus is on practical cloud workloads where ground truth factors are unavailable. We will add a dedicated subsection in §5 discussing preprocessing details, potential artifacts, and why the real-world results support the claims. revision: partial

  2. Referee: [§4] §4 (Case Study): The generalization from 11 serverless functions and AWS benchmarks to 'diverse cloud deployments' with complex patterns rests on visual inspection and prediction MAPE alone; no quantitative assessment (such as recovery error on injected patterns or cross-validation against known events) is provided to confirm faithful isolation of factors.

    Authors: Section 4 presents results from 11 serverless functions and confirms generalization via separate AWS benchmark experiments showing similar seasonal patterns and scaling benefits. Visual inspection of decomposed components combined with predictive accuracy (MAPE) and practical latency reductions (>60% variability, 10% max latency) serve as our primary evidence for faithful isolation, as real traces lack known ground-truth factors for metrics like recovery error. We agree this limits strong claims about all diverse deployments and will revise the text in §4 and the abstract to qualify the generalization scope, emphasize the serverless/AWS context, and add a limitations paragraph on the absence of injected-pattern validation. revision: yes
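The rebuttal's first response mentions normalization and outlier handling without naming a method. As a hedged illustration only, a rolling-median (Hampel-style) filter is one common choice for latency traces with spikes; the sketch below is an assumption about what such preprocessing could look like, not the authors' documented procedure.

```python
# Hypothetical preprocessing pass (not the paper's documented procedure):
# Hampel-style outlier replacement followed by min-max normalization.
import numpy as np

def hampel(x, window=7, n_sigmas=3.0):
    """Replace points deviating from the rolling median by more than
    n_sigmas * (scaled rolling MAD) with that rolling median."""
    y = x.copy()
    k = 1.4826  # relates the MAD to the standard deviation for Gaussian data
    for i in range(window, x.size - window):
        win = x[i - window:i + window + 1]
        med = np.median(win)
        mad = k * np.median(np.abs(win - med))
        if mad > 0 and abs(x[i] - med) > n_sigmas * mad:
            y[i] = med
    return y

def minmax(x):
    return (x - x.min()) / (x.max() - x.min())

rng = np.random.default_rng(3)
trace = 100 + rng.normal(0, 2, 200)
trace[[25, 90, 150]] += 40                      # injected latency spikes
filtered = hampel(trace)
normalized = minmax(filtered)                   # scaled to [0, 1]
print("points above 115 ms before:", int((trace > 115).sum()),
      "after filtering:", int((filtered > 115).sum()))
```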

Circularity Check

0 steps flagged

No circularity: empirical case study and out-of-sample evaluation are self-contained

full rationale

The paper proposes two decomposition techniques (hybrid and automatic) and applies them to 11 serverless function traces. It reports that the methods reveal trends/seasonal cycles, then uses the resulting components for future-performance forecasting evaluated by MAPE against held-out actual data (1.8% hybrid, 2.1% automatic), plus AWS scaling experiments that measure latency reduction. These steps rely on standard time-series methods, direct comparison to baselines, and external benchmarks rather than any self-definition, fitted-parameter renaming, or self-citation chain. The central claims are falsifiable via the reported quantitative metrics and do not reduce to their inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the standard assumption from time-series analysis that performance traces can be meaningfully decomposed into trend, seasonal, and residual components; no free parameters or invented entities are described in the abstract.

axioms (1)
  • domain assumption Time-series performance data can be decomposed into additive or multiplicative trend, seasonal, and residual components that capture distinct variation sources.
    Invoked implicitly when proposing the hybrid and automatic decomposition methods to reveal obscured patterns.
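In standard notation, with T_t the trend, S_t the seasonal, and R_t the residual component, this assumption reads:

```latex
y_t = T_t + S_t + R_t \qquad \text{(additive)} \qquad\qquad y_t = T_t \cdot S_t \cdot R_t \qquad \text{(multiplicative)}
```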

pith-pipeline@v0.9.0 · 5571 in / 1249 out tokens · 55118 ms · 2026-05-12T02:54:02.485121+00:00 · methodology

