
arxiv: 2606.13119 · v1 · pith:AOYVBU67new · submitted 2026-06-11 · 💻 cs.LG · cs.AI · cs.NE

MP3: Multi-Period Pattern Pre-training for Spatio-Temporal Forecasting

Pith reviewed 2026-06-27 07:15 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · cs.NE
keywords spatio-temporal forecasting · multi-period patterns · pre-training · graph neural networks · plug-and-play module · temporal cycles · causality modeling

The pith

A plug-in pre-trains models on multi-period patterns from long series to resolve cases where similar short inputs produce divergent forecasts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to show that spatio-temporal forecasting models often fail when short input windows hide the longer repeating cycles that shape future behavior. It introduces a pre-training step that learns those cycles separately in time, space, and across cycles, then attaches the learned patterns to existing graph-based forecasters. If the approach works, the same base models produce lower error on standard traffic, climate, and energy datasets without changing their core architecture. The claim matters because many real systems rely on forecasts that currently misread repeating weekly or daily structures as noise or coincidence.
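
The core difficulty can be illustrated with a toy example. The sketch below is not from the paper: it assumes two hypothetical hourly sensors that share a daily cycle but differ in a slower cycle (one weekly, one biweekly), and the names `sensor` and `max_gap` are invented for illustration. Over a 12-hour input window the two signals are nearly indistinguishable, while half a week later they have clearly separated.

```python
import math

DAY, WEEK = 24, 168  # hours; hypothetical hourly sampling


def sensor(t, slow_period, slow_amp=0.5):
    """Toy signal: a shared daily cycle plus one slower cycle."""
    daily = math.sin(2 * math.pi * t / DAY)
    slow = slow_amp * math.cos(2 * math.pi * t / slow_period)
    return daily + slow


def max_gap(start, length):
    """Largest absolute gap between the weekly and biweekly sensors."""
    return max(
        abs(sensor(t, WEEK) - sensor(t, 2 * WEEK))
        for t in range(start, start + length)
    )


window_gap = max_gap(0, 12)   # short input window
future_gap = max_gap(84, 12)  # same-length window half a week later

print(f"gap inside the input window: {window_gap:.3f}")
print(f"gap half a week later:       {future_gap:.3f}")
```

A model that sees only the 12-hour window has almost no signal to tell the two sensors apart, which is the situation the paper calls a temporal mirage.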

Core claim

MP3 learns multi-period patterns by first applying edge convolution across long series to separate distinct temporal cycles, then using a bottleneck projection plus global memory bank to capture varying spatial relations at each cycle length, and finally running a causality-enhanced Transformer to model how one cycle pattern influences another. Once pre-trained, the resulting representations are inserted into any existing spatio-temporal graph network as a plug-in module. Experiments across five different base models and five datasets show that this insertion produces consistent error reductions.

What carries the argument

The MP3 plug-in, whose three components (edge-convolution temporal modeling, bottleneck-plus-memory-bank spatial modeling, and causality-enhanced Transformer for cross-period interaction) together extract and store repeating cycle patterns from long input series.

If this is right

  • Existing graph forecasters gain 4.7 percent lower MAE and 5.0 percent lower RMSE on average when the MP3 plug-in is added.
  • The same plug-in works without retraining the base model from scratch and scales to a large urban dataset.
  • The learned cycle patterns remain useful across different base architectures, showing the pre-training is not tied to one specific network design.
  • Cross-period dependencies captured by the Transformer component improve handling of superimposed cycle effects that short windows alone cannot resolve.
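
For readers who want to check headline numbers of this kind against their own runs, here is a minimal sketch of the two metrics and a relative reduction. The helper names are hypothetical, and averaging per-pair percentages is an assumption: the abstract does not say how the 4.7% and 5.0% averages were aggregated.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(
        sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)
    )


def relative_reduction(base_err, new_err):
    """Percent by which new_err is lower than base_err."""
    return 100.0 * (base_err - new_err) / base_err


def average_reduction(pairs):
    """Mean of per-pair percent reductions (assumed aggregation rule)."""
    return sum(relative_reduction(b, n) for b, n in pairs) / len(pairs)


# Hypothetical (baseline error, baseline + plug-in error) pairs; not paper results.
example_pairs = [(20.0, 19.0), (10.0, 9.6)]
print(average_reduction(example_pairs))
```

Averaging percentages per (model, dataset) pair weights every pair equally; computing one reduction from pooled errors would instead favor datasets with larger error scales, so the two conventions can give different headline figures.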

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the cycle-pattern representations prove stable, they could be reused across cities or time periods without full retraining.
  • The same separation of temporal, spatial, and cross-cycle stages might apply to other sequence tasks where short contexts hide longer rhythms, such as energy load or epidemic curves.
  • A natural next test would be whether the memory bank can be updated online as new long series arrive rather than requiring a separate pre-training phase.

Load-bearing premise

That failures on similar short inputs arise mainly from missing longer cycle information rather than from other model limitations, and that the three new components supply exactly the missing information.

What would settle it

Attaching the pre-trained MP3 module to the five tested base models on the five datasets and observing no average error reduction or seeing gains disappear on the large-scale CA dataset.

Original abstract

Spatio-Temporal forecasting is crucial in diverse fields, such as transportation, climate, and energy. Urban spatio-temporal data exhibits temporal mirage: similar short-window inputs have divergent future trends, and vice versa. Existing spatio-temporal graph neural networks (STGNNs) cannot effectively identify such mirages. We argue that the core reason lies in the short-window inputs that have incomplete period observation, heterogeneous global spatial correlation, and cross-period superposition causality. To bridge this gap, we develop a novel Multi-Period Pattern Pre-training (MP3), a plug-and-play pre-training plugin for distinguishing temporal mirages. MP3 presents two core innovations: (1) The multi-period pattern learning is designed to learn multi-period patterns from long time series. Specifically, multi-period temporal modeling leverages edge convolution to identify different multi-period patterns. Multi-period spatial modeling uses a bottleneck project and a global memory bank to capture heterogeneous global spatial relations efficiently. Cross-period pattern interaction employs a causality-enhanced Transformer to capture dependencies across different period patterns. (2) This plugin can seamlessly integrate into existing STGNN backbones to strengthen their forecasting performance. The experiment on five STGNN baselines across five real-world datasets (including a large-scale dataset CA) verify the effectiveness, superior scalability and strong adaptability of MP3, which brings consistent and robust performance improvements across all evaluated baselines. On average, MP3 reduces the MAE 4.7% and the RMSE 5.0%. The code can be available at https://github.com/YAN-outlook/MP3.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes MP3, a plug-and-play pre-training plugin for existing spatio-temporal graph neural networks (STGNNs) to address 'temporal mirages' where similar short-window inputs yield divergent forecasts. It identifies three causes—incomplete period observation, heterogeneous global spatial correlation, and cross-period superposition causality—and introduces three corresponding components: multi-period temporal modeling via edge convolution, spatial modeling via bottleneck projection and global memory bank, and cross-period interaction via a causality-enhanced Transformer. The plugin integrates into STGNN backbones, and experiments across five baselines and five real-world datasets (including large-scale CA) report average MAE reductions of 4.7% and RMSE reductions of 5.0%, with claims of superior scalability and adaptability. Code is stated to be available.

Significance. If the empirical gains prove robust and mechanistically linked to the proposed components, MP3 could provide a general, reusable enhancement for STGNNs in domains like transportation and climate forecasting. The plug-and-play design and public code are positive features that would aid adoption and reproducibility if the central performance claims hold under scrutiny.

major comments (3)
  1. [Abstract / Experiments] Abstract and Experiments section: the central claim of consistent 4.7% MAE / 5.0% RMSE reductions across five baselines and five datasets is presented without any reported details on data splits, cross-validation protocol, statistical significance tests, error bars, or controls for post-hoc hyperparameter choices, leaving the empirical support for the performance delta only weakly grounded.
  2. [Method] Method section (description of the three components): the manuscript states that incomplete period observation, heterogeneous global spatial correlation, and cross-period superposition causality are the primary drivers of temporal mirages and maps each MP3 component directly to one driver, yet contains no targeted diagnostics, component-wise ablations holding capacity fixed, or tests showing that gains vanish when the claimed cause is absent; aggregate performance numbers alone cannot establish the causal mechanism.
  3. [Experiments] Experiments section: no analysis is provided on whether the observed improvements scale with the added parameters (convolution kernels, memory bank size, Transformer layers) or simply with longer context, which is required to rule out capacity or context-length explanations for the reported deltas.
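
The kind of evidence requested in major comment 1 can be sketched as follows. This is one possible protocol, not the authors': it assumes each configuration is trained with several random seeds, reports mean and sample standard deviation, and runs an exact paired sign-flip permutation test on the per-seed differences. The function names and all numbers are hypothetical.

```python
from itertools import product
from statistics import mean, stdev


def summarize(errors):
    """Return (mean, sample standard deviation) over repeated runs."""
    return mean(errors), stdev(errors)


def sign_flip_pvalue(base_errors, plugin_errors):
    """Exact two-sided paired sign-flip permutation test."""
    diffs = [b - p for b, p in zip(base_errors, plugin_errors)]
    observed = abs(mean(diffs))
    hits = total = 0
    # Under the null, each per-seed difference is equally likely to be +d or -d.
    for signs in product((1, -1), repeat=len(diffs)):
        flipped = abs(mean([s * d for s, d in zip(signs, diffs)]))
        hits += flipped >= observed - 1e-12  # tolerance for float round-off
        total += 1
    return hits / total


# Hypothetical MAE values from five seeds; not results from the paper.
base_mae = [20.1, 19.8, 20.4, 20.0, 20.3]
plugin_mae = [19.2, 19.0, 19.5, 19.1, 19.3]

p_value = sign_flip_pvalue(base_mae, plugin_mae)
print(summarize(base_mae), summarize(plugin_mae), p_value)
```

With five seeds the smallest two-sided p-value this test can produce is 2/32 = 0.0625, even when the plug-in wins on every seed, so a conventional 0.05 threshold needs at least six paired runs.
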
minor comments (2)
  1. [Abstract] The abstract mentions 'multi-period pattern learning' as one of two core innovations but the body text describes three components; clarifying the exact count and their grouping would improve readability.
  2. [Method] Notation for the memory bank and bottleneck projection should be introduced with explicit equations or pseudocode in the method section to avoid ambiguity when integrating with different backbones.
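
To make minor comment 2 concrete, the block below shows one form such notation could take. It is an illustrative guess at a generic bottleneck-plus-memory read, not the paper's formulation: every symbol, shape, and the residual connection are assumptions.

```latex
% Illustrative notation only; symbols and shapes are assumptions, not the paper's.
% H \in \mathbb{R}^{N \times d}: features of N nodes for one period length
% r \ll d: bottleneck width;  M \in \mathbb{R}^{K \times r}: memory bank with K slots
\begin{aligned}
Z &= H W_{\mathrm{down}}, & W_{\mathrm{down}} &\in \mathbb{R}^{d \times r} \\
A &= \operatorname{softmax}\!\left( Z M^{\top} / \sqrt{r} \right), & A &\in \mathbb{R}^{N \times K} \\
\tilde{H} &= H + (A M)\, W_{\mathrm{up}}, & W_{\mathrm{up}} &\in \mathbb{R}^{r \times d}
\end{aligned}
```

Written this way, each node attends to K memory slots rather than to all N other nodes, so the read costs on the order of N·K·r operations instead of growing with N². Stating shapes explicitly like this is what lets a reader check compatibility with a given backbone's hidden size d.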

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below and commit to revisions that will strengthen the empirical grounding and mechanistic analysis of MP3.

Point-by-point responses
  1. Referee: [Abstract / Experiments] Abstract and Experiments section: the central claim of consistent 4.7% MAE / 5.0% RMSE reductions across five baselines and five datasets is presented without any reported details on data splits, cross-validation protocol, statistical significance tests, error bars, or controls for post-hoc hyperparameter choices, leaving the empirical support for the performance delta only weakly grounded.

    Authors: We agree that additional experimental details are needed. In the revised manuscript we will expand the Experiments section to specify the data splits, cross-validation protocol, statistical significance tests, error bars on all reported metrics, and the hyperparameter search procedure. These additions will be included in both the main text and supplementary material. revision: yes

  2. Referee: [Method] Method section (description of the three components): the manuscript states that incomplete period observation, heterogeneous global spatial correlation, and cross-period superposition causality are the primary drivers of temporal mirages and maps each MP3 component directly to one driver, yet contains no targeted diagnostics, component-wise ablations holding capacity fixed, or tests showing that gains vanish when the claimed cause is absent; aggregate performance numbers alone cannot establish the causal mechanism.

    Authors: The three causes were identified through preliminary data analysis. We acknowledge that aggregate results alone are insufficient to establish causality. The revision will add component-wise ablations with matched parameter budgets and targeted diagnostics that isolate each driver, together with controls that remove the corresponding cause from the input data. revision: yes

  3. Referee: [Experiments] Experiments section: no analysis is provided on whether the observed improvements scale with the added parameters (convolution kernels, memory bank size, Transformer layers) or simply with longer context, which is required to rule out capacity or context-length explanations for the reported deltas.

    Authors: We will add experiments that compare MP3 against (i) baselines augmented with equivalent extra parameters and (ii) baselines given the same extended context length. These controls will be reported in the revised Experiments section to separate the contribution of MP3’s design from raw capacity or context effects. revision: yes
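
The controls promised in response 3 imply a simple attribution check, sketched here under assumptions. The arm names, the `residual_gain` helper, and the example numbers are all hypothetical; the idea is only that the plug-in's gain should be compared against the better of the two controls rather than against the unmodified baseline.

```python
def residual_gain(mae_by_arm):
    """Split the plug-in's gain into a part matched by controls and a remainder."""
    base = mae_by_arm["baseline"]
    # Best control: extra parameters alone, or longer context alone.
    best_control = min(
        mae_by_arm["baseline_plus_params"],
        mae_by_arm["baseline_long_context"],
    )
    plugin = mae_by_arm["baseline_plus_mp3"]

    total = 100.0 * (base - plugin) / base
    control = 100.0 * (base - best_control) / base
    return {
        "total_gain_pct": total,
        "control_gain_pct": control,
        "residual_gain_pct": total - control,
    }


# Hypothetical MAE values for one (model, dataset) pair; not paper results.
example = {
    "baseline": 20.0,
    "baseline_plus_params": 19.8,
    "baseline_long_context": 19.6,
    "baseline_plus_mp3": 19.0,
}
print(residual_gain(example))
```

A residual near zero would suggest the improvement comes from capacity or context length; a clearly positive residual is what would support the claim that the design itself contributes.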

Circularity Check

0 steps flagged

No circularity: claims rest on empirical validation of a proposed architecture

Full rationale

The paper introduces MP3 as a plug-and-play pre-training plugin with three explicitly designed components (edge-convolution temporal modeling, bottleneck+memory-bank spatial modeling, causality-enhanced Transformer) motivated by posited causes of temporal mirages. These are engineering choices and architectural decisions, not a derivation chain. No equations, predictions, or first-principles results are presented that reduce by construction to fitted inputs, self-definitions, or self-citation load-bearing uniqueness theorems. Performance improvements are reported via experiments on five baselines and five datasets; the central claim is therefore falsifiable by replication and does not collapse into tautology.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that temporal mirage arises from the three listed data properties and that the architectural components address them; the empirical results further depend on the choice of five baselines and five datasets plus numerous neural-network hyperparameters.

free parameters (1)
  • model hyperparameters including convolution kernels, memory bank size, and transformer layers
    These are chosen or fitted during pre-training and fine-tuning to achieve the reported gains.
axioms (1)
  • domain assumption: Urban spatio-temporal data exhibits temporal mirage: similar short-window inputs have divergent future trends and vice versa.
    Invoked in the first paragraph of the abstract as the core motivation.

pith-pipeline@v0.9.1-grok · 5829 in / 1316 out tokens · 42669 ms · 2026-06-27T07:15:49.908831+00:00 · methodology

