
arxiv: 2606.03184 · v1 · pith:OTEJAMSHnew · submitted 2026-06-02 · 💱 q-fin.CP · cs.LG · q-fin.ST

FinStressTS: A Parametric Synthetic Benchmark for Time-Series Forecasting in Finance

Pith reviewed 2026-06-28 07:39 UTC · model grok-4.3

classification 💱 q-fin.CP · cs.LG · q-fin.ST
keywords synthetic benchmark · financial time series · forecasting evaluation · volatility clustering · regime switching · probabilistic forecasting · model diagnostics · jump processes

The pith

FinStressTS uses controlled parametric environments to show that autoregressive and linear models often outperform Transformers in volatility, tail, and jump settings.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper creates a synthetic benchmark called FinStressTS built from six explicit mechanism families to generate thirty diagnostic environments. Researchers can then measure how different forecasting models respond when the data-generating process is known exactly rather than hidden in real market records. A reader would care because real financial series entangle low signal-to-noise ratios, regime shifts, and heavy tails, so failures cannot be traced to any single cause. The evaluations compare fifteen models on both point and probabilistic tasks and report that performance depends on the active mechanism, that distributional alignment affects calibration, and that neural models usually need more samples than simpler baselines.

Core claim

FinStressTS comprises thirty diagnostic environments around six mechanism families—volatility clustering, multi-scale persistence, heavy-tailed shocks, regime switching, self-exciting jumps, and zero-inflated processes—and shows that autoregressive and linear models remain competitive or superior in several volatility-, tail-, and jump-driven settings while parametric probabilistic models calibrate well in stationary regimes.

What carries the argument

The six parametric mechanism families that generate the thirty diagnostic environments, each isolating one structural cause so that model errors can be attributed to a known data-generating process.
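
To make "parametric mechanism family" concrete, here is a minimal sketch of one such generator, using a standard Gaussian GARCH(1,1) recursion as a stand-in for the volatility-clustering family. The function name `simulate_garch11` and the parameter values are illustrative assumptions; the paper's own parameterization is not given on this page.

```python
import math
import random


def simulate_garch11(n, omega=0.05, alpha=0.10, beta=0.85, seed=0):
    """Simulate n returns from a Gaussian GARCH(1,1) process.

    Parameter values are hypothetical defaults chosen for illustration,
    not settings taken from the paper.
    """
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be < 1 for a finite unconditional variance")
    rng = random.Random(seed)
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    returns = []
    for _ in range(n):
        r = math.sqrt(var) * rng.gauss(0.0, 1.0)
        returns.append(r)
        # Conditional variance for the next step depends on today's squared return.
        var = omega + alpha * r * r + beta * var
    return returns
```

Because the data-generating process is fully specified by `omega`, `alpha`, and `beta`, any forecasting error on such a series can be compared against the known conditional variance rather than against a single realized market path.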

If this is right

  • Autoregressive and linear models are highly competitive in volatility-, tail-, and jump-driven environments.
  • Parametric probabilistic models such as DeepAR calibrate well in stationary settings, while flexible models help when distributions become multimodal or sparse.
  • Neural models often require more data to match simple baselines, with larger gains mainly when learning latent regimes or complex distributions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Model selection pipelines in finance could become conditional on detected mechanism type rather than fixed across all assets.
  • The same parametric construction could be reused to generate stress-test suites for regulatory capital calculations under known tail and jump scenarios.
  • Learning curves measured on these environments offer a direct way to quantify how much additional data a new architecture needs before it surpasses a linear baseline.
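
The learning-curve idea in the last bullet can be sketched as a small helper that finds the smallest training-sample ratio at which a model's error drops to or below a baseline's. The function name `crossover_ratio` and the numbers in the usage example are hypothetical; they are not results from the paper.

```python
from typing import Optional, Sequence


def crossover_ratio(
    ratios: Sequence[float],
    model_err: Sequence[float],
    baseline_err: Sequence[float],
) -> Optional[float]:
    """Return the smallest training ratio where the model matches the baseline.

    Returns None if the model never reaches the baseline's error.
    """
    if not (len(ratios) == len(model_err) == len(baseline_err)):
        raise ValueError("all inputs must have the same length")
    # Walk the curve from the smallest training ratio upward.
    for ratio, m, b in sorted(zip(ratios, model_err, baseline_err)):
        if m <= b:
            return ratio
    return None


# Hypothetical errors (lower is better) at four training-sample ratios.
ratios = [0.1, 0.25, 0.5, 1.0]
neural = [0.90, 0.70, 0.50, 0.40]
linear = [0.60, 0.55, 0.52, 0.50]
first_match = crossover_ratio(ratios, neural, linear)  # 0.5 for these made-up numbers
```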

Load-bearing premise

The six parametric mechanism families accurately reproduce the latent structural causes present in real financial time series without adding correlations or artifacts absent from actual markets.

What would settle it

If the relative performance ordering of the fifteen models on real financial series differs systematically from their ordering on the matching FinStressTS environments, the claim that the benchmark isolates the relevant mechanisms would be undermined.
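
One way to operationalize this test is a rank correlation between model orderings on synthetic and real data. The sketch below computes Spearman's rho from two lists of per-model scores, assuming no tied scores; the helper names and any scores passed in are illustrative, not part of the paper.

```python
def ranks(scores):
    """Rank scores from 1 (lowest) to n (highest); assumes no ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(scores_a, scores_b):
    """Spearman rank correlation between two score lists (no ties assumed)."""
    n = len(scores_a)
    if n != len(scores_b) or n < 2:
        raise ValueError("need two equally long lists with at least 2 entries")
    ra, rb = ranks(scores_a), ranks(scores_b)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))
```

A rho near 1 between synthetic-environment scores and real-series scores would support the benchmark's diagnostic mapping; a rho near 0 or below would point toward the systematic disagreement described above.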

Figures

Figures reproduced from arXiv: 2606.03184 by Haonan Chen, Jiaze Sun, Kelvin J.L. Koa, Ke-Wei Huang, Ruiyang Ni, Yize Liu.

Figure 1: Illustrative examples of synthetic time series generated by each mechanism family under the Level 1 diagnostic …
Figure 2: Data-efficiency learning curves for 10 models across six synthetic mechanism families. Each subplot shows …
Figure 3: CRPS as a function of training sample ratio across the six synthetic cases for each probabilistic model. Each line corresponds to one case (Case 1–6). Lower is better.
… a fundamental mismatch between popular time-series inductive biases and the structural properties of financial data. Notably, many of these architectures were originally designed for long-horizon forecasting in strongly periodic domains (…
Original abstract

Financial forecasting is difficult due to low signal-to-noise ratios, latent factors, heavy tails, regime shifts, and jumps. Real-world benchmarks offer limited failure attribution: researchers can observe underperformance, but often cannot isolate why because mechanisms are unobservable and entangled. Real financial data reveal only one realized path, making it difficult to assess tail-risk calibration or data efficiency. We introduce FinStressTS, a mechanism-aware synthetic benchmark that links model behavior to controlled structural causes. FinStressTS comprises 30 diagnostic environments around six mechanism families: volatility clustering, multi-scale persistence, heavy-tailed shocks, regime switching, self-exciting jumps, and zero-inflated processes. We evaluate two tasks: point forecasting, using NMAE across five settings, and probabilistic forecasting, using CRPS under known data-generating mechanisms. We benchmark 15 models, from classical methods (HAR, VAR) to Transformer forecasters (PatchTST, iTransformer) and deep probabilistic architectures (DeepAR, TSFlow), and use learning curves to measure sample efficiency. Our evaluation reveals three insights. First, performance is mechanism-dependent: autoregressive and linear models are highly competitive, and often outperform Transformer-based models, in several volatility-, tail-, and jump-driven environments. Second, distributional alignment matters: parametric probabilistic models such as DeepAR calibrate well in stationary settings, while flexible models can help when distributions become multimodal or sparse. Third, neural models often require more data to match simple baselines, with larger gains mainly when learning latent regimes or complex distributions. FinStressTS provides an open framework for diagnosing failure modes and advancing risk-aware forecasting.
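
The two metrics named in the abstract can be sketched as follows. The CRPS estimator uses the standard sample-based identity CRPS(F, y) = E|X - y| - 0.5 E|X - X'|, with X and X' independent draws from the forecast distribution. The abstract does not say how NMAE is normalized, so dividing MAE by the mean absolute target is an assumption made here for illustration.

```python
from typing import Sequence


def nmae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """MAE divided by the mean absolute target (assumed normalization)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and equally long")
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    scale = sum(abs(t) for t in y_true) / len(y_true)
    if scale == 0.0:
        raise ValueError("targets are all zero; NMAE is undefined under this normalization")
    return mae / scale


def crps_from_samples(samples: Sequence[float], y: float) -> float:
    """Sample-based CRPS estimate: mean|X - y| - 0.5 * mean|X - X'|."""
    m = len(samples)
    if m == 0:
        raise ValueError("need at least one forecast sample")
    term1 = sum(abs(x - y) for x in samples) / m
    term2 = sum(abs(a - b) for a in samples for b in samples) / (m * m)
    return term1 - 0.5 * term2
```

With a single forecast sample the second term vanishes and the CRPS reduces to the absolute error, which is why CRPS is often read as a probabilistic generalization of MAE.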

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper introduces FinStressTS, a synthetic benchmark with 30 diagnostic environments built from six parametric mechanism families (volatility clustering, multi-scale persistence, heavy-tailed shocks, regime switching, self-exciting jumps, zero-inflated processes). It evaluates 15 models (HAR, VAR, PatchTST, iTransformer, DeepAR, TSFlow, etc.) on point forecasting via NMAE and probabilistic forecasting via CRPS under known DGPs, plus learning curves for sample efficiency, and reports three insights: mechanism-dependent performance (AR/linear models competitive in volatility/tail/jump settings), importance of distributional alignment for calibration, and higher data needs for neural models except in regime/complex-distribution cases.

Significance. If the environments isolate the claimed mechanisms, the benchmark supplies a controlled testbed that real financial series cannot, because the latter entangle latent factors and provide only one path. This would allow precise attribution of failure modes (e.g., poor tail calibration under jumps versus regime shifts) and targeted model improvement for risk-aware forecasting. The open framework and broad model coverage constitute a useful public resource for the field.

major comments (1)
  1. [six parametric mechanism families and 30 diagnostic environments] The claim that 'performance is mechanism-dependent' (abstract) and the three insights require that each of the six families operates in isolation within its 30 environments. Self-exciting jumps (Hawkes-type) generate clustered large increments that automatically raise the autocorrelation of squared returns, thereby injecting an unintended volatility-clustering signature into a 'jumps-only' environment. Regime-switching constructions can similarly induce spurious persistence or heavy tails. The manuscript must supply explicit orthogonality diagnostics (e.g., moment or spectral comparisons across families) to confirm the diagnostic mapping is valid; absent such checks the attribution of model rankings to specific structural causes is compromised.
minor comments (1)
  1. [abstract] The abstract states the benchmark design and high-level results but supplies no equations, parameter values, or validation statistics; readers must consult the full text for reproducibility details.
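
The cross-contamination raised in the major comment can be illustrated with a small simulation. The sketch below uses a discrete-time approximation of a self-exciting jump process (an assumption made for brevity, not the paper's construction) and compares the lag-1 autocorrelation of squared returns with and without self-excitation at the same long-run jump rate. All names and parameter values are hypothetical.

```python
import random


def simulate_jump_returns(n, base_prob, excite, decay, jump_size=5.0, seed=0):
    """Gaussian noise plus jumps whose probability rises after each jump."""
    rng = random.Random(seed)
    excitation = 0.0
    prev_jump = 0
    out = []
    for _ in range(n):
        # Past jumps raise the current jump probability, decaying geometrically.
        excitation = decay * excitation + excite * prev_jump
        prob = min(1.0, base_prob + excitation)
        prev_jump = 1 if rng.random() < prob else 0
        jump = prev_jump * rng.choice((-jump_size, jump_size))
        out.append(rng.gauss(0.0, 1.0) + jump)
    return out


def autocorr(x, lag):
    """Sample autocorrelation of x at the given positive lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return cov / var


# Same long-run jump rate (about 4% of steps), with and without self-excitation.
excited = simulate_jump_returns(20000, base_prob=0.02, excite=0.15, decay=0.7, seed=1)
plain = simulate_jump_returns(20000, base_prob=0.04, excite=0.0, decay=0.7, seed=1)
acf_excited = autocorr([r * r for r in excited], 1)
acf_plain = autocorr([r * r for r in plain], 1)
```

In this toy setting the self-excited series shows clearly positive autocorrelation in squared returns while the independent-jump series stays near zero, which is the volatility-clustering signature the referee warns about.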

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for highlighting the critical need to verify isolation of the six mechanism families. We agree that potential cross-contamination (e.g., Hawkes-induced volatility clustering) could weaken attribution of the reported insights, and we will strengthen the manuscript with the requested diagnostics.

Point-by-point responses
  1. Referee: [six parametric mechanism families and 30 diagnostic environments] The claim that 'performance is mechanism-dependent' (abstract) and the three insights require that each of the six families operates in isolation within its 30 environments. Self-exciting jumps (Hawkes-type) generate clustered large increments that automatically raise the autocorrelation of squared returns, thereby injecting an unintended volatility-clustering signature into a 'jumps-only' environment. Regime-switching constructions can similarly induce spurious persistence or heavy tails. The manuscript must supply explicit orthogonality diagnostics (e.g., moment or spectral comparisons across families) to confirm the diagnostic mapping is valid; absent such checks the attribution of model rankings to specific structural causes is compromised.

    Authors: We agree that the validity of mechanism-dependent performance claims rests on demonstrating that each family primarily isolates its intended structure. While the parametric constructions were chosen to emphasize one dominant feature per family (e.g., Hawkes intensity for jumps, Markov switching for regimes), we acknowledge that secondary signatures such as elevated squared-return autocorrelation in jump environments or induced kurtosis in regime environments may exist. In the revision we will add a new subsection (Section 3.3) containing explicit orthogonality checks: (i) autocorrelation functions of raw and squared series, (ii) kurtosis and tail-index estimates, and (iii) spectral density comparisons across all 30 environments. These diagnostics will quantify the degree of unintended overlap and, where necessary, adjust environment parameters to improve isolation. The three insights will be re-stated with reference to these checks. revision: yes
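
As a rough illustration of diagnostic (ii), the sketch below computes sample excess kurtosis and applies it to a two-state Markov-switching Gaussian series. The simulator, its volatility levels, and the switching probability are hypothetical choices made here, not the authors' environments.

```python
import random


def excess_kurtosis(x):
    """Sample excess kurtosis (0 for a Gaussian in the large-sample limit)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 * m2) - 3.0


def simulate_regime_switching(n, sigmas=(1.0, 3.0), stay_prob=0.95, seed=0):
    """Zero-mean Gaussian returns whose volatility follows a two-state Markov chain."""
    rng = random.Random(seed)
    state = 0
    out = []
    for _ in range(n):
        if rng.random() > stay_prob:
            state = 1 - state  # switch to the other volatility regime
        out.append(rng.gauss(0.0, sigmas[state]))
    return out


switching = simulate_regime_switching(20000, seed=2)
gauss_rng = random.Random(2)
gaussian = [gauss_rng.gauss(0.0, 1.0) for _ in range(20000)]
kurt_switching = excess_kurtosis(switching)
kurt_gaussian = excess_kurtosis(gaussian)
```

Even though every shock in the switching series is Gaussian, mixing two volatility levels yields clearly positive excess kurtosis, so a regime environment can carry a heavy-tail signature of the kind the rebuttal acknowledges.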

Circularity Check

0 steps flagged

No circularity: benchmark generation and empirical evaluation are independent of fitted results

full rationale

The paper constructs FinStressTS by specifying six parametric mechanism families (volatility clustering, regime switching, etc.) and 30 diagnostic environments, then runs standard point and probabilistic forecasting metrics (NMAE, CRPS) plus learning curves on 15 models. No equation or claim reduces a reported performance insight to a quantity defined by parameters fitted inside the same paper; the data-generating processes are fixed by construction before any model is applied, and the reported mechanism-dependent rankings are direct outputs of those evaluations rather than self-referential fits. No self-citation chain is invoked to justify uniqueness or an ansatz, and no known empirical pattern is merely renamed.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no explicit free parameters, axioms, or invented entities beyond the high-level description of the six mechanism families; the benchmark itself is the primary new artifact introduced by the paper.



KDD ’26, August 09–13, 2026, Jeju Island, Republic of Korea · Jiaze Sun et al.