Time Series Analysis in Machine Learning
Pith reviewed 2026-06-27 08:27 UTC · model grok-4.3
The pith
Machine learning techniques for time series build directly on classical statistical models like ARIMA.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The chapter argues that traditional statistical methods provide the necessary groundwork for modern machine learning approaches to time series. It covers both families of methods and adds domain examples, so that readers gain the theoretical understanding and practical context needed to apply them in research.
What carries the argument
The review is organized as a structured progression from basic time series concepts, through classical statistical models, to machine learning methods.
If this is right
- Readers gain the ability to apply ARIMA or exponential smoothing as baselines before moving to recurrent or transformer models for forecasting tasks.
- Feature-based regression and Gaussian processes supply interpretable alternatives when deep learning models prove too opaque for scientific data.
- State-space models integrate naturally with hidden Markov models for analyzing sequential observations in astronomy.
- Common principles across domains allow techniques tested in finance to transfer to weather or astrophysical time series with minimal adjustment.
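As a rough illustration of the baseline-first workflow in the first bullet, the sketch below implements one-step-ahead simple exponential smoothing in plain Python and scores it against a naive last-value forecast. The function names, the initialisation of the level to the first observation, the toy series, and the choice alpha = 0.5 are assumptions made for this example and are not taken from the chapter.

```python
from typing import List, Sequence


def ses_forecasts(series: Sequence[float], alpha: float) -> List[float]:
    """One-step-ahead simple exponential smoothing forecasts.

    forecasts[t] is the prediction for series[t] built from series[:t].
    The level is initialised to the first observation (an assumption),
    so forecasts[0] equals series[0].
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    level = series[0]
    forecasts = [level]
    for value in series[:-1]:
        # Blend the newest observation with the previous level.
        level = alpha * value + (1.0 - alpha) * level
        forecasts.append(level)
    return forecasts


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between observations and forecasts."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    series = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
    smoothed = ses_forecasts(series, alpha=0.5)
    naive = ses_forecasts(series, alpha=1.0)  # alpha = 1 reduces to "repeat the last value"
    print("SES MAE:  ", mean_absolute_error(series, smoothed))
    print("Naive MAE:", mean_absolute_error(series, naive))
```

Any more complex model, such as a recurrent or transformer forecaster, could be scored with the same error function on the same series, which keeps the comparison with the baseline like for like.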
Where Pith is reading between the lines
- The review's emphasis on classical foundations implies that new machine learning models for time series should routinely report performance against ARIMA or state-space baselines.
- Multi-domain examples suggest the methods are sufficiently general that standardized test suites could be developed to compare approaches across scientific fields.
- The coverage of transformers points toward extensions that combine attention mechanisms with explicit handling of non-stationarity for real-time cosmology data streams.
Load-bearing premise
The review accurately and comprehensively represents the selected classical and machine learning techniques without significant omissions or errors.
What would settle it
A reader locating a clear factual error in the description of how ARIMA models or transformers handle time series data would undermine the review's reliability as a pedagogical resource.
Original abstract
Time series analysis is a fundamental component of machine learning, especially in astrophysics and cosmology where temporal data abound. This chapter provides a pedagogical review of time series analysis techniques from a machine learning perspective. We cover the basic concepts of time series (stationarity, autocorrelation, seasonality), classical statistical models (autoregressive, moving average, ARIMA, exponential smoothing, state-space models), and modern machine learning approaches. In particular, we discuss how traditional statistical methods lay the groundwork, and then explore machine learning methods for time series, including feature-based regression, tree-based ensemble methods, hidden Markov models, Gaussian processes, and deep learning models (recurrent neural networks, convolutional networks, transformers). Throughout, we illustrate with examples drawn from multiple domains (e.g. astronomy, weather forecasting, finance) to emphasize common principles. The goal is to equip readers with both the theoretical understanding and practical context to apply machine learning techniques for time series analysis in their research.
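The basic concepts named in the abstract can be made concrete with a few lines of code. The sketch below, in plain Python, computes first differences (the "I" step of ARIMA), the sample autocorrelation at a given lag, and a one-step AR(1) forecast whose coefficient is the lag-1 autocorrelation (the Yule-Walker estimate for an AR(1) process). All function names and the toy data are hypothetical, and the chapter's own notation and estimators may differ.

```python
from typing import List, Sequence


def difference(series: Sequence[float]) -> List[float]:
    """First differences x[t] - x[t-1], often used to remove a trend."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


def autocorrelation(series: Sequence[float], lag: int) -> float:
    """Sample autocorrelation at the given lag (biased estimator)."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0.0:
        raise ValueError("series is constant; autocorrelation is undefined")
    num = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return num / denom


def ar1_forecast(series: Sequence[float]) -> float:
    """One-step AR(1) forecast: mean + phi * (last value - mean).

    phi is estimated by the lag-1 sample autocorrelation.
    """
    mean = sum(series) / len(series)
    phi = autocorrelation(series, 1)
    return mean + phi * (series[-1] - mean)


if __name__ == "__main__":
    # Hypothetical toy data: an upward trend, so the raw series is not stationary.
    trending = [1.0, 2.0, 3.0, 4.0, 5.0]
    print("lag-1 autocorrelation:", autocorrelation(trending, 1))
    print("first differences:    ", difference(trending))
    print("AR(1) forecast:       ", ar1_forecast(trending))
```

The AR(1) forecast is only meaningful for an approximately stationary series, so in practice the autocorrelation and forecast would be computed after differencing or detrending.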
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript is a pedagogical review chapter on time series analysis from a machine learning perspective. It covers basic concepts including stationarity, autocorrelation, and seasonality; classical statistical models such as autoregressive, moving average, ARIMA, exponential smoothing, and state-space models; and modern ML approaches including feature-based regression, tree-based ensembles, hidden Markov models, Gaussian processes, and deep learning models (RNNs, CNNs, transformers). Examples are drawn from astronomy, weather forecasting, and finance to illustrate common principles, with the goal of providing theoretical understanding and practical context for applying these techniques.
Significance. If the descriptions of the listed techniques are accurate and balanced, the review could serve as a useful introductory resource for astrophysics and cosmology researchers working with temporal data, by connecting classical statistical foundations to contemporary ML methods without introducing new derivations or claims.
minor comments (2)
- [Abstract] The abstract lists 'hidden Markov models' under modern ML approaches; confirm that the corresponding section distinguishes them clearly from classical state-space models to avoid potential overlap in presentation.
- Ensure that domain examples (astronomy, weather, finance) are distributed evenly across sections rather than clustered, to better emphasize the 'common principles' stated in the abstract.
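The distinction raised in the first comment can be sketched in code. Both model families run the same predict-then-update recursion, but a hidden Markov model carries a probability vector over a finite set of latent states, while a linear-Gaussian state-space model carries the mean and variance of a continuous latent state. The example below is a minimal illustration in plain Python. The function names, the random-walk state equation in the Kalman step, and all numerical values are assumptions for this sketch and are not drawn from the manuscript.

```python
from typing import List, Sequence, Tuple


def hmm_filter_step(
    belief: Sequence[float],
    transition: Sequence[Sequence[float]],
    likelihood: Sequence[float],
) -> List[float]:
    """One forward-filtering step of a discrete hidden Markov model.

    belief[i]        : P(state = i | observations so far)
    transition[i][j] : P(next state = j | current state = i)
    likelihood[j]    : P(new observation | next state = j)
    """
    k = len(belief)
    # Predict: push the belief through the transition matrix.
    predicted = [sum(belief[i] * transition[i][j] for i in range(k)) for j in range(k)]
    # Update: weight by the observation likelihood and renormalise.
    unnormalised = [predicted[j] * likelihood[j] for j in range(k)]
    total = sum(unnormalised)
    if total == 0.0:
        raise ValueError("observation has zero likelihood under every state")
    return [u / total for u in unnormalised]


def kalman_filter_step(
    mean: float, var: float, observation: float, q: float, r: float
) -> Tuple[float, float]:
    """One step of a scalar Kalman filter.

    Assumed model: x[t] = x[t-1] + w, w ~ N(0, q); y[t] = x[t] + v, v ~ N(0, r).
    """
    # Predict: a random-walk state keeps its mean and gains process variance.
    mean_pred = mean
    var_pred = var + q
    # Update: the Kalman gain trades off prior and observation uncertainty.
    gain = var_pred / (var_pred + r)
    new_mean = mean_pred + gain * (observation - mean_pred)
    new_var = (1.0 - gain) * var_pred
    return new_mean, new_var


if __name__ == "__main__":
    # Hypothetical two-state HMM (for example, "quiescent" versus "flaring").
    print(hmm_filter_step([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [0.8, 0.2]))
    # Hypothetical continuous state with unit prior and observation variance.
    print(kalman_filter_step(mean=0.0, var=1.0, observation=2.0, q=0.0, r=1.0))
```

Placing the two updates side by side is one way a section could show the shared structure while keeping the discrete and continuous cases clearly separate.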
Simulated Author's Rebuttal
We thank the referee for their positive assessment of the manuscript as a useful pedagogical review and for the recommendation to accept. We are glad that the coverage of classical and modern time series methods, along with domain examples, is viewed as providing appropriate theoretical and practical context for astrophysics researchers.
Circularity Check
No significant circularity: pedagogical review with no derivations or predictions
The manuscript is a review chapter that surveys existing time series concepts, classical models (ARIMA, state-space), and ML methods (RNNs, transformers, GPs) with domain examples. It claims no original derivations, fitted parameters, predictions, or uniqueness theorems. The central claim is simply that the listed topics are covered pedagogically; this is self-contained and carries no circularity burden under the defined criteria. There are no load-bearing self-citations, ansatzes, or renamed results.