hub Mixed citations

A Time Series is Worth 64 Words: Long-term Forecasting with Transformers

Mixed citation behavior. Most common role is background (64%).

72 Pith papers citing it
Background 64% of classified citations
abstract

We propose an efficient design of Transformer-based models for multivariate time series forecasting and self-supervised representation learning. It is based on two key components: (i) segmentation of time series into subseries-level patches, which serve as input tokens to the Transformer; and (ii) channel-independence, where each channel contains a single univariate time series and all series share the same embedding and Transformer weights. The patching design has a three-fold benefit: local semantic information is retained in the embedding; computation and memory usage of the attention maps are quadratically reduced for the same look-back window; and the model can attend to a longer history. Our channel-independent patch time series Transformer (PatchTST) significantly improves long-term forecasting accuracy compared with SOTA Transformer-based models. We also apply our model to self-supervised pre-training tasks and attain excellent fine-tuning performance, which outperforms supervised training on large datasets. Transferring masked pre-trained representations from one dataset to others also produces SOTA forecasting accuracy. Code is available at: https://github.com/yuqinie98/PatchTST.
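
The two components named in the abstract, patching and channel-independence, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `make_patches` is hypothetical, and the patch length of 16, stride of 8, and end-padding by repeating the last value are assumptions that the abstract does not state.

```python
import numpy as np


def make_patches(x, patch_len=16, stride=8):
    """Split a batch of multivariate series into per-channel patches.

    x: array of shape (batch, channels, length).
    Returns an array of shape (batch * channels, num_patches, patch_len).
    patch_len, stride, and the padding scheme are illustrative assumptions.
    """
    batch, channels, _ = x.shape
    # Assumed padding: repeat the last observation `stride` times at the end.
    pad = np.repeat(x[..., -1:], stride, axis=-1)
    x = np.concatenate([x, pad], axis=-1)
    # Every window of size patch_len along time, then keep every stride-th one.
    windows = np.lib.stride_tricks.sliding_window_view(x, patch_len, axis=-1)
    patches = windows[..., ::stride, :]
    # Channel-independence: fold channels into the batch axis so each
    # univariate series is processed alone with shared weights downstream.
    return patches.reshape(batch * channels, patches.shape[2], patch_len)


x = np.random.default_rng(0).normal(size=(2, 3, 512))
tokens = make_patches(x)
print(tokens.shape)
```

Under these assumptions, a look-back window of 512 steps becomes 64 tokens per channel, so each attention map holds 64 × 64 entries instead of 512 × 512, a 64-fold reduction that illustrates the quadratic saving the abstract describes.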

hub tools

citation-role summary

background 8 · baseline 2 · method 1

representative citing papers

Olivia: Harmonizing Time Series Foundation Models with Power Spectral Density

cs.LG · 2026-05-17 · unverdicted · novelty 7.0

Olivia harmonizes time series datasets via normalized power spectral density using a Harmonizer module and resonator-based HarmonicAttention, achieving state-of-the-art zero-shot, few-shot, and full-shot forecasting on TSLib, GIFT-Eval, and GluonTS benchmarks.

How Good Can Linear Models Be for Time-Series Forecasting?

cs.LG · 2026-06-25 · conditional · novelty 6.0

Optimized Ridge regression with series-specific preprocessing beats prior linear forecasters and exceeds Transformer, MLP, and CNN baselines on six of eight time-series benchmarks.

How Do Electrocardiogram Models Scale?

cs.LG · 2026-05-17 · conditional · novelty 6.0

Empirical scaling study of ECG models finds SSL scales robustly while ResNets show 1.3-2.5x better parameter efficiency and SSL up to 16x better data efficiency than supervised baselines on out-of-distribution tasks.

citing papers explorer

Showing 7 of 7 citing papers after filters.

  • From Observations to States: Latent Time Series Forecasting · cs.LG · 2026-01-30 · conditional · none · ref 10 · internal anchor

    LatentTSF improves time series forecasting accuracy and representation quality by shifting prediction from observation space to a learned latent state space via autoencoding.

  • Sundial: A Family of Highly Capable Time Series Foundation Models · cs.LG · 2025-02-02 · conditional · none · ref 16 · internal anchor

    Sundial uses TimeFlow Loss for native pre-training of Transformers on continuous time series from TimeBench, achieving SOTA point and probabilistic forecasting with millisecond inference.

  • How Good Can Linear Models Be for Time-Series Forecasting? · cs.LG · 2026-06-25 · conditional · none · ref 16 · internal anchor

    Optimized Ridge regression with series-specific preprocessing beats prior linear forecasters and exceeds Transformer, MLP, and CNN baselines on six of eight time-series benchmarks.

  • How Do Electrocardiogram Models Scale? · cs.LG · 2026-05-17 · conditional · none · ref 20 · internal anchor

    Empirical scaling study of ECG models finds SSL scales robustly while ResNets show 1.3-2.5x better parameter efficiency and SSL up to 16x better data efficiency than supervised baselines on out-of-distribution tasks.

  • AlphaCast: A Human Wisdom-LLM Intelligence Co-Reasoning Framework for Interactive Time Series Forecasting · cs.AI · 2025-11-12 · conditional · none · ref 16 · internal anchor

    AlphaCast is a training-free LLM framework that performs interactive multi-stage reasoning for time series forecasting by integrating feature extraction, knowledge bases, case libraries, and contextual pools.

  • ReNF: Rethinking the Design of Neural Long-Term Time Series Forecasters · cs.LG · 2025-09-30 · conditional · none · ref 12 · internal anchor

    ReNF proposes Boosted Direct Output (BDO) and parameter smoothing so a basic temporal MLP outperforms complex state-of-the-art models on long-term time series forecasting benchmarks by implicitly combining forecasts to reduce uncertainty.

  • Masked Training for Robust Arrhythmia Detection from Digitalized Multiple Layout ECG Images · cs.LG · 2025-08-06 · conditional · none · ref 26 · internal anchor

    PatchECG applies masked patch training and disordered attention to handle asynchronous and partially missing ECG signals from varied layouts, reaching average AUROC 0.835 on simulated conditions and 0.778 on real hospital images for atrial fibrillation.