
A Time Series is Worth 64 Words: Long-term Forecasting with Transformers

Yuqi Nie, Nam H. Nguyen, Phanwadee Sinthong, Jayant Kalagnanam · 2022 · cs.LG · arXiv 2211.14730

Mixed citation behavior. The most common role is background (64% of classified citations).

72 Pith papers citing it

abstract

We propose an efficient design of Transformer-based models for multivariate time series forecasting and self-supervised representation learning. It is based on two key components: (i) segmentation of time series into subseries-level patches which are served as input tokens to Transformer; (ii) channel-independence where each channel contains a single univariate time series that shares the same embedding and Transformer weights across all the series. Patching design naturally has three-fold benefit: local semantic information is retained in the embedding; computation and memory usage of the attention maps are quadratically reduced given the same look-back window; and the model can attend longer history. Our channel-independent patch time series Transformer (PatchTST) can improve the long-term forecasting accuracy significantly when compared with that of SOTA Transformer-based models. We also apply our model to self-supervised pre-training tasks and attain excellent fine-tuning performance, which outperforms supervised training on large datasets. Transferring of masked pre-trained representation on one dataset to others also produces SOTA forecasting accuracy. Code is available at: https://github.com/yuqinie98/PatchTST.
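The two components named in the abstract, patching and channel-independence, can be illustrated with a minimal sketch in plain Python. The function names, the patch length of 16, the stride of 8, the 512-step look-back window, and the end-padding scheme are assumptions chosen for illustration; they are not taken from this page and the code is not a copy of the authors' implementation.

```python
def make_patches(series, patch_len=16, stride=8, pad_end=True):
    """Split one univariate series into overlapping patches (the input tokens)."""
    values = list(series)
    if pad_end:
        # Assumed padding scheme: repeat the last value `stride` times
        # so that the tail of the window is covered by a final patch.
        values = values + [values[-1]] * stride
    last_start = len(values) - patch_len
    return [values[i:i + patch_len] for i in range(0, last_start + 1, stride)]


def patch_channels(multivariate, **kwargs):
    """Channel-independence: every channel is patched on its own,
    using the same function (and, in a model, the same weights)."""
    return [make_patches(channel, **kwargs) for channel in multivariate]


def attention_entries(num_tokens):
    """Size of one self-attention map, which grows with the square of the token count."""
    return num_tokens * num_tokens


# Hypothetical two-channel input with a 512-step look-back window.
look_back = 512
channels = [list(range(look_back)), [0.5 * t for t in range(look_back)]]

patched = patch_channels(channels)
tokens_per_channel = len(patched[0])

print(tokens_per_channel)                      # 64 patches instead of 512 time steps
print(attention_entries(look_back))            # 262144 entries with one token per step
print(attention_entries(tokens_per_channel))   # 4096 entries with patch tokens
```

Under these assumed settings, each channel is represented by 64 tokens rather than 512, so one attention map shrinks by a factor of 64, the square of the stride. This is the sense in which the abstract describes the reduction as quadratic for a fixed look-back window.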

citation-role summary

background 8 · baseline 2 · method 1

citation-polarity summary

background 7 · baseline 2 · unclear 1 · use method 1

representative citing papers

Prototype-Guided Classification Sub-Task Decoupling Framework: Enhancing Generalization and Interpretability for Multivariate Time Series

cs.LG · 2026-05-21 · unverdicted · novelty 7.0

PDFTime reformulates multivariate time series classification as a multi-stage prototype-based decision process, claiming SOTA results on UCR and UEA benchmarks.

Olivia: Harmonizing Time Series Foundation Models with Power Spectral Density

cs.LG · 2026-05-17 · unverdicted · novelty 7.0

Olivia harmonizes time series datasets via normalized power spectral density using a Harmonizer module and resonator-based HarmonicAttention, achieving state-of-the-art zero-shot, few-shot, and full-shot forecasting on TSLib, GIFT-Eval, and GluonTS benchmarks.

1GC-7RC: One Graphic Card -- Seven Research Challenges! How Good Are AI Agents at Doing Your Job?

cs.LG · 2026-05-16 · unverdicted · novelty 7.0

Introduces the 1GC-7RC benchmark to evaluate AI coding agents on seven diverse ML tasks under single-GPU time and access constraints.

Looped SSMs: Depth-Recurrence and Input Reshaping for Time Series Classification

cs.LG · 2026-05-15 · unverdicted · novelty 7.0

Looped SSMs with shared parameters across depth match or exceed standard SSMs with more parameters on time series classification, with additional gains from input reshaping techniques.

NeuroAtlas: Benchmarking Foundation Models for Clinical EEG and Brain-Computer Interfaces

cs.LG · 2026-05-14 · unverdicted · novelty 7.0

NeuroAtlas benchmarks foundation models on 42 EEG datasets and reports that EEG-specific models do not consistently outperform generic time-series models, standard metrics miss clinical utility, and rankings vary by domain.

SeesawNet: Towards Non-stationary Time Series Forecasting with Balanced Modeling of Common and Specific Dependencies

cs.LG · 2026-05-14 · unverdicted · novelty 7.0

SeesawNet dynamically balances common and instance-specific dependencies via ASNA in temporal and channel dimensions, outperforming prior methods on non-stationary forecasting benchmarks.

What if Tomorrow is the World Cup Final? Counterfactual Time Series Forecasting with Textual Conditions

cs.LG · 2026-05-14 · unverdicted · novelty 7.0

Introduces the task of counterfactual time series forecasting with textual conditions plus a text-attribution mechanism that improves accuracy by distinguishing mutable from immutable factors.

FactoryBench: Evaluating Industrial Machine Understanding

cs.AI · 2026-05-08 · unverdicted · novelty 7.0

FactoryBench reveals that frontier LLMs achieve under 50% on structured causal questions and under 18% on decision-making in industrial robotic telemetry.

Hedging Memory Horizons for Non-Stationary Prediction via Online Aggregation

cs.LG · 2026-05-07 · unverdicted · novelty 7.0

MELO aggregates base predictors and their multi-scale EWLS adaptations using MLpol to achieve oracle inequalities against best fixed and time-varying predictors in non-stationary settings.

Does Synthetic Data Help? Empirical Evidence from Deep Learning Time Series Forecasters

cs.LG · 2026-05-07 · accept · novelty 7.0

Synthetic data augmentation helps channel-mixing time series models but degrades channel-independent ones, with reliable gains only from seasonal-trend generators and gradual schedules in low-resource settings.

Discrete Prototypical Memories for Federated Time Series Foundation Models

cs.LG · 2026-04-06 · unverdicted · novelty 7.0

FeDPM learns and aligns local discrete prototypical memories across domains to create a unified discrete latent space for LLM-based time series foundation models in a federated setting.

From Observations to States: Latent Time Series Forecasting

cs.LG · 2026-01-30 · conditional · novelty 7.0

LatentTSF improves time series forecasting accuracy and representation quality by shifting prediction from observation space to a learned latent state space via autoencoding.

Sundial: A Family of Highly Capable Time Series Foundation Models

cs.LG · 2025-02-02 · conditional · novelty 7.0

Sundial uses TimeFlow Loss for native pre-training of Transformers on continuous time series from TimeBench, achieving SOTA point and probabilistic forecasting with millisecond inference.

Deployment-Side Adaptiveness in Multi-Horizon Volatility Forecasting

cs.LG · 2026-06-26 · unverdicted · novelty 6.0

Validation-based selection of inference-time rollout rules for multi-output volatility forecasters yields low-cost improvements over default MIMO deployment and recovers much of ensemble benefit at lower cost.

How Good Can Linear Models Be for Time-Series Forecasting?

cs.LG · 2026-06-25 · conditional · novelty 6.0

Optimized Ridge regression with series-specific preprocessing beats prior linear forecasters and exceeds Transformer, MLP, and CNN baselines on six of eight time-series benchmarks.

Retrieval-Augmented Personalization with Foundation Models for Wearable Stress Detection

cs.LG · 2026-06-23 · unverdicted · novelty 6.0

Retrieval from out-of-domain foundation models enables personalization of a lightweight transformer for stress detection, yielding +3.92% accuracy and +4.76% F1 gains on WESAD without user labels.

Mechanical Field Networks: Structured Neural Dynamics for Multivariate Systems

cs.LG · 2026-06-08 · unverdicted · novelty 6.0

MF-Net learns a shared field state and mechanical transition rule from trajectories to deliver competitive forecasting and recoverable relation matrices on Lorenz-96 and real systems.

STaT: Resolving Shape Distortion in Non-Stationary Time Series via Tri-Modal Synergy

cs.LG · 2026-05-25 · unverdicted · novelty 6.0

STaT is a Symbolic-Temporal-Textual Alignment model that integrates three modalities to reduce shape distortion in non-stationary time series forecasting, reporting up to 8.9% gains in magnitude metrics and 8.5% less distortion on eight benchmarks.

ChronoVAE-HOPE: Beyond Attention -- A Next-Generation VAE Foundation Model for Specialized Time Series Classification

cs.LG · 2026-05-21 · unverdicted · novelty 6.0 · 2 refs

ChronoVAE-HOPE proposes a VAE foundation model for time series classification that replaces attention with a HOPE Block dual-memory system and uses disentangled trend-seasonal latent representations, pre-trained on Monash and evaluated on UCR datasets.

DAD4TS: Data-Augmentation-Oriented Diffusion Model for Time-Series Forecasting with Small-Scale Data

cs.LG · 2026-05-18 · unverdicted · novelty 6.0 · 2 refs

DAD4TS trains a diffusion-based generator jointly with a forecaster under RL control and geometric projections to produce augmentation samples that boost accuracy on small-scale time-series data, with validation reported on five of six real-world datasets.

How Do Electrocardiogram Models Scale?

cs.LG · 2026-05-17 · conditional · novelty 6.0

Empirical scaling study of ECG models finds SSL scales robustly while ResNets show 1.3-2.5x better parameter efficiency and SSL up to 16x better data efficiency than supervised baselines on out-of-distribution tasks.

cs.LG · 2026-05-11 · unverdicted · novelty 6.0 · 2 refs

TOA augments attention with learnable sequence-space operators and stochastic regularization to enable signed temporal mixing, yielding gains on forecasting and related benchmarks when added to PatchTST and iTransformer.

Empowering VLMs for Few-Shot Multimodal Time Series Classification via Tailored Agentic Reasoning

cs.AI · 2026-05-10 · unverdicted · novelty 6.0 · 2 refs

MarsTSC is a VLM agentic system with generator, reflector, and modifier roles that iteratively refines a knowledge bank to improve few-shot multimodal time series classification and produce human-readable explanations.

What If We Let Forecasting Forget? A Sparse Bottleneck for Cross-Variable Dependencies

cs.LG · 2026-05-08 · unverdicted · novelty 6.0

MS-FLOW uses a capacity-limited sparse routing mechanism to model only critical inter-variable dependencies in time series data, achieving state-of-the-art accuracy on 12 benchmarks with fewer but more reliable connections.

citing papers explorer

Showing 7 of 7 citing papers after filters.

From Observations to States: Latent Time Series Forecasting

cs.LG · 2026-01-30 · conditional · none · ref 10 · internal anchor

LatentTSF improves time series forecasting accuracy and representation quality by shifting prediction from observation space to a learned latent state space via autoencoding.

Sundial: A Family of Highly Capable Time Series Foundation Models

cs.LG · 2025-02-02 · conditional · none · ref 16 · internal anchor

Sundial uses TimeFlow Loss for native pre-training of Transformers on continuous time series from TimeBench, achieving SOTA point and probabilistic forecasting with millisecond inference.

How Good Can Linear Models Be for Time-Series Forecasting?

cs.LG · 2026-06-25 · conditional · none · ref 16 · internal anchor

Optimized Ridge regression with series-specific preprocessing beats prior linear forecasters and exceeds Transformer, MLP, and CNN baselines on six of eight time-series benchmarks.

How Do Electrocardiogram Models Scale?

cs.LG · 2026-05-17 · conditional · none · ref 20 · internal anchor

Empirical scaling study of ECG models finds SSL scales robustly while ResNets show 1.3-2.5x better parameter efficiency and SSL up to 16x better data efficiency than supervised baselines on out-of-distribution tasks.

AlphaCast: A Human Wisdom-LLM Intelligence Co-Reasoning Framework for Interactive Time Series Forecasting

cs.AI · 2025-11-12 · conditional · none · ref 16 · internal anchor

AlphaCast is a training-free LLM framework that performs interactive multi-stage reasoning for time series forecasting by integrating feature extraction, knowledge bases, case libraries, and contextual pools.

ReNF: Rethinking the Design of Neural Long-Term Time Series Forecasters

cs.LG · 2025-09-30 · conditional · none · ref 12 · internal anchor

ReNF proposes Boosted Direct Output (BDO) and parameter smoothing so a basic temporal MLP outperforms complex state-of-the-art models on long-term time series forecasting benchmarks by implicitly combining forecasts to reduce uncertainty.

Masked Training for Robust Arrhythmia Detection from Digitalized Multiple Layout ECG Images

cs.LG · 2025-08-06 · conditional · none · ref 26 · internal anchor

PatchECG applies masked patch training and disordered attention to handle asynchronous and partially missing ECG signals from varied layouts, reaching average AUROC 0.835 on simulated conditions and 0.778 on real hospital images for atrial fibrillation.
