
A Time Series is Worth 64 Words: Long-term Forecasting with Transformers

Yuqi Nie, Nam H. Nguyen, Phanwadee Sinthong, Jayant Kalagnanam · 2022 · cs.LG · arXiv 2211.14730

Mixed citation behavior. The most common role is background (64% of classified citations).

Cited by 70 Pith papers.


abstract

We propose an efficient design of Transformer-based models for multivariate time series forecasting and self-supervised representation learning. It is based on two key components: (i) segmentation of time series into subseries-level patches, which serve as input tokens to the Transformer; (ii) channel-independence, where each channel contains a single univariate time series and all series share the same embedding and Transformer weights. The patching design naturally has a three-fold benefit: local semantic information is retained in the embedding; computation and memory usage of the attention maps are quadratically reduced given the same look-back window; and the model can attend to a longer history. Our channel-independent patch time series Transformer (PatchTST) can improve long-term forecasting accuracy significantly compared with SOTA Transformer-based models. We also apply our model to self-supervised pre-training tasks and attain excellent fine-tuning performance, which outperforms supervised training on large datasets. Transferring masked pre-trained representations from one dataset to others also produces SOTA forecasting accuracy. Code is available at: https://github.com/yuqinie98/PatchTST.
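
As a rough illustration of the patching and channel-independence ideas described in the abstract, the Python sketch below splits each channel of a look-back window into overlapping patches and compares attention-map sizes. The function names (`make_patches`, `patch_channels`) and the values 512, 16, and 8 are assumptions chosen for illustration. This is not the authors' implementation, which lives in the repository linked in the abstract.

```python
from typing import List


def make_patches(series: List[float], patch_len: int, stride: int) -> List[List[float]]:
    """Split one univariate series into (possibly overlapping) patches.

    Illustrative helper, not taken from the PatchTST code base.
    No end padding is applied, so trailing values that do not fill
    a whole patch are dropped.
    """
    if patch_len <= 0 or stride <= 0:
        raise ValueError("patch_len and stride must be positive")
    if len(series) < patch_len:
        raise ValueError("series is shorter than one patch")
    n_patches = (len(series) - patch_len) // stride + 1
    return [series[i * stride : i * stride + patch_len] for i in range(n_patches)]


def patch_channels(window: List[List[float]], patch_len: int, stride: int) -> List[List[List[float]]]:
    """Channel independence: patch every channel separately as a univariate series.

    In a real model, the resulting patch sequences would all be fed through
    the same embedding and Transformer weights.
    """
    return [make_patches(channel, patch_len, stride) for channel in window]


# Assumed, illustrative settings: look-back window 512, patch length 16, stride 8.
look_back = 512
series = [float(t) for t in range(look_back)]
patches = make_patches(series, patch_len=16, stride=8)

# Two hypothetical channels (the second is just the first reversed).
window = [series, series[::-1]]
per_channel = patch_channels(window, patch_len=16, stride=8)

# Attention-map entries scale with the square of the token count.
n_tokens = len(patches)
point_attn = look_back ** 2   # one token per time step
patch_attn = n_tokens ** 2    # one token per patch

print(n_tokens, point_attn, patch_attn)
```

With these illustrative values the token count drops from 512 time steps to 63 patches, so the attention map shrinks by roughly the square of the stride, which is the quadratic reduction the abstract refers to.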


citation-role summary

background: 8 · baseline: 2 · method: 1

citation-polarity summary

background: 7 · baseline: 2 · unclear: 1 · use method: 1

representative citing papers

Prototype-Guided Classification Sub-Task Decoupling Framework: Enhancing Generalization and Interpretability for Multivariate Time Series

cs.LG · 2026-05-21 · unverdicted · novelty 7.0

PDFTime reformulates multivariate time series classification as a multi-stage prototype-based decision process, claiming SOTA results on UCR and UEA benchmarks.

Olivia: Harmonizing Time Series Foundation Models with Power Spectral Density

cs.LG · 2026-05-17 · unverdicted · novelty 7.0

Olivia harmonizes time series datasets via normalized power spectral density using a Harmonizer module and resonator-based HarmonicAttention, achieving state-of-the-art zero-shot, few-shot, and full-shot forecasting on TSLib, GIFT-Eval, and GluonTS benchmarks.

1GC-7RC: One Graphic Card -- Seven Research Challenges! How Good Are AI Agents at Doing Your Job?

cs.LG · 2026-05-16 · unverdicted · novelty 7.0

Introduces the 1GC-7RC benchmark to evaluate AI coding agents on seven diverse ML tasks under single-GPU time and access constraints.

Looped SSMs: Depth-Recurrence and Input Reshaping for Time Series Classification

cs.LG · 2026-05-15 · unverdicted · novelty 7.0

Looped SSMs with shared parameters across depth match or exceed standard SSMs with more parameters on time series classification, with additional gains from input reshaping techniques.

SeesawNet: Towards Non-stationary Time Series Forecasting with Balanced Modeling of Common and Specific Dependencies

cs.LG · 2026-05-14 · unverdicted · novelty 7.0

SeesawNet dynamically balances common and instance-specific dependencies via ASNA in temporal and channel dimensions, outperforming prior methods on non-stationary forecasting benchmarks.

What if Tomorrow is the World Cup Final? Counterfactual Time Series Forecasting with Textual Conditions

cs.LG · 2026-05-14 · unverdicted · novelty 7.0

Introduces the task of counterfactual time series forecasting with textual conditions plus a text-attribution mechanism that improves accuracy by distinguishing mutable from immutable factors.

FactoryBench: Evaluating Industrial Machine Understanding

cs.AI · 2026-05-08 · unverdicted · novelty 7.0

FactoryBench reveals that frontier LLMs achieve under 50% on structured causal questions and under 18% on decision-making in industrial robotic telemetry.

Hedging Memory Horizons for Non-Stationary Prediction via Online Aggregation

cs.LG · 2026-05-07 · unverdicted · novelty 7.0

MELO aggregates base predictors and their multi-scale EWLS adaptations using MLpol to achieve oracle inequalities against best fixed and time-varying predictors in non-stationary settings.

Does Synthetic Data Help? Empirical Evidence from Deep Learning Time Series Forecasters

cs.LG · 2026-05-07 · accept · novelty 7.0

Synthetic data augmentation helps channel-mixing time series models but degrades channel-independent ones, with reliable gains only from seasonal-trend generators and gradual schedules in low-resource settings.

Discrete Prototypical Memories for Federated Time Series Foundation Models

cs.LG · 2026-04-06 · unverdicted · novelty 7.0

FeDPM learns and aligns local discrete prototypical memories across domains to create a unified discrete latent space for LLM-based time series foundation models in a federated setting.

From Observations to States: Latent Time Series Forecasting

cs.LG · 2026-01-30 · conditional · novelty 7.0

LatentTSF improves time series forecasting accuracy and representation quality by shifting prediction from observation space to a learned latent state space via autoencoding.

Sundial: A Family of Highly Capable Time Series Foundation Models

cs.LG · 2025-02-02 · conditional · novelty 7.0

Sundial uses TimeFlow Loss for native pre-training of Transformers on continuous time series from TimeBench, achieving SOTA point and probabilistic forecasting with millisecond inference.

Deployment-Side Adaptiveness in Multi-Horizon Volatility Forecasting

cs.LG · 2026-06-26 · unverdicted · novelty 6.0

Validation-based selection of inference-time rollout rules for multi-output volatility forecasters yields low-cost improvements over default MIMO deployment and recovers much of ensemble benefit at lower cost.

How Good Can Linear Models Be for Time-Series Forecasting?

cs.LG · 2026-06-25 · conditional · novelty 6.0

Optimized Ridge regression with series-specific preprocessing beats prior linear forecasters and exceeds Transformer, MLP, and CNN baselines on six of eight time-series benchmarks.

Retrieval-Augmented Personalization with Foundation Models for Wearable Stress Detection

cs.LG · 2026-06-23 · unverdicted · novelty 6.0

Retrieval from out-of-domain foundation models enables personalization of a lightweight transformer for stress detection, yielding +3.92% accuracy and +4.76% F1 gains on WESAD without user labels.

Mechanical Field Networks: Structured Neural Dynamics for Multivariate Systems

cs.LG · 2026-06-08 · unverdicted · novelty 6.0

MF-Net learns a shared field state and mechanical transition rule from trajectories to deliver competitive forecasting and recoverable relation matrices on Lorenz-96 and real systems.

STaT: Resolving Shape Distortion in Non-Stationary Time Series via Tri-Modal Synergy

cs.LG · 2026-05-25 · unverdicted · novelty 6.0

STaT is a Symbolic-Temporal-Textual Alignment model that integrates three modalities to reduce shape distortion in non-stationary time series forecasting, reporting up to 8.9% gains in magnitude metrics and 8.5% less distortion on eight benchmarks.

How Do Electrocardiogram Models Scale?

cs.LG · 2026-05-17 · conditional · novelty 6.0

Empirical scaling study of ECG models finds SSL scales robustly while ResNets show 1.3-2.5x better parameter efficiency and SSL up to 16x better data efficiency than supervised baselines on out-of-distribution tasks.

Empowering VLMs for Few-Shot Multimodal Time Series Classification via Tailored Agentic Reasoning

cs.AI · 2026-05-10 · unverdicted · novelty 6.0 · 2 refs

MarsTSC is a VLM agentic system with generator, reflector, and modifier roles that iteratively refines a knowledge bank to improve few-shot multimodal time series classification and produce human-readable explanations.

What If We Let Forecasting Forget? A Sparse Bottleneck for Cross-Variable Dependencies

cs.LG · 2026-05-08 · unverdicted · novelty 6.0

MS-FLOW uses a capacity-limited sparse routing mechanism to model only critical inter-variable dependencies in time series data, achieving state-of-the-art accuracy on 12 benchmarks with fewer but more reliable connections.

Exploring the Potential of Probabilistic Transformer for Time Series Modeling: A Report on the ST-PT Framework

cs.LG · 2026-04-29 · unverdicted · novelty 6.0

ST-PT turns transformers into explicit factor graphs for time series, enabling structural injection of symbolic priors, per-sample conditional generation, and principled latent autoregressive forecasting via MFVI iterations.

CAARL: In-Context Learning for Interpretable Co-Evolving Time Series Forecasting

cs.LG · 2026-04-20 · unverdicted · novelty 6.0

CAARL decomposes co-evolving time series into autoregressive segments, builds a temporal dependency graph, serializes it into a narrative, and uses LLMs for interpretable forecasting via chain-of-thought reasoning.

M3R: Localized Rainfall Nowcasting with Meteorology-Informed MultiModal Attention

cs.LG · 2026-04-15 · unverdicted · novelty 6.0

M3R improves localized rainfall nowcasting by using weather station time series as queries in multimodal attention to selectively extract precipitation patterns from radar imagery.

A General Framework for Generative Self-supervised Learning in Non-invasive Estimation of Physiological Parameters Using Photoplethysmography

eess.SP · 2026-04-03 · unverdicted · novelty 6.0

TS2TC combines cross-temporal fusion generative anchor pretraining with dual-process transfer to achieve 2.49% lower RMSE than prior methods on PPG parameter estimation using only 10% labeled data.

citing papers explorer


Titans: Learning to Memorize at Test Time

cs.LG · 2024-12-31 · unverdicted · none · ref 78 · internal anchor

Titans combine attention for current context with a learnable neural memory for long-term history, achieving better performance and scaling to over 2M-token contexts on language, reasoning, genomics, and time-series tasks.

TSNN: A Non-parametric and Interpretable Framework for Traffic Time Series Forecasting

cs.LG · 2026-05-09 · unverdicted · none · ref 34 · internal anchor

TSNN matches time series entries to a training-derived memory bank to forecast traffic without any trainable parameters and achieves competitive accuracy on four real-world datasets.
