PDFTime reformulates multivariate time series classification as a multi-stage prototype-based decision process, claiming SOTA results on UCR and UEA benchmarks.
hub Mixed citations
A Time Series is Worth 64 Words: Long-term Forecasting with Transformers
Mixed citation behavior. Most common role is background (64%).
abstract
We propose an efficient design of Transformer-based models for multivariate time series forecasting and self-supervised representation learning. It is based on two key components: (i) segmentation of time series into subseries-level patches which serve as input tokens to the Transformer; (ii) channel-independence, where each channel contains a single univariate time series that shares the same embedding and Transformer weights across all the series. The patching design naturally has a three-fold benefit: local semantic information is retained in the embedding; computation and memory usage of the attention maps are quadratically reduced given the same look-back window; and the model can attend to a longer history. Our channel-independent patch time series Transformer (PatchTST) can improve the long-term forecasting accuracy significantly when compared with that of SOTA Transformer-based models. We also apply our model to self-supervised pre-training tasks and attain excellent fine-tuning performance, which outperforms supervised training on large datasets. Transferring masked pre-trained representations from one dataset to others also produces SOTA forecasting accuracy. Code is available at: https://github.com/yuqinie98/PatchTST.
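The two components named in the abstract can be illustrated with a minimal pure-Python sketch. It is not taken from the PatchTST repository: the function names, the look-back window of 336, the patch length of 16, and the stride of 8 are all assumptions chosen for illustration, and the sketch omits any end-padding or embedding step the actual model may apply. It only shows how patching shrinks the token count, and with it the size of the attention map, while every channel is processed by the same routine.

```python
from typing import List


def make_patches(series: List[float], patch_len: int, stride: int) -> List[List[float]]:
    """Split one univariate series into (possibly overlapping) patches.

    Hypothetical helper: no padding is applied, so the number of patches is
    floor((len(series) - patch_len) / stride) + 1.
    """
    if patch_len <= 0 or stride <= 0:
        raise ValueError("patch_len and stride must be positive")
    if len(series) < patch_len:
        raise ValueError("series is shorter than one patch")
    last_start = len(series) - patch_len
    return [series[s:s + patch_len] for s in range(0, last_start + 1, stride)]


def patch_channels(x: List[List[float]], patch_len: int, stride: int) -> List[List[List[float]]]:
    """Channel independence: each channel is patched on its own, using the
    same function (standing in for shared embedding and Transformer weights)."""
    return [make_patches(channel, patch_len, stride) for channel in x]


def attention_map_entries(num_tokens: int) -> int:
    """A full self-attention map has one entry per pair of tokens."""
    return num_tokens * num_tokens


# Assumed toy input: 3 channels, look-back window of 336 time steps.
look_back = 336
x = [[float(t + 1000 * c) for t in range(look_back)] for c in range(3)]

tokens = patch_channels(x, patch_len=16, stride=8)
n_patches = len(tokens[0])

print(n_patches)                         # 41 patches per channel
print(attention_map_entries(look_back))  # 112896 entries with point-wise tokens
print(attention_map_entries(n_patches))  # 1681 entries with patch tokens
```

With these assumed settings the token count per channel drops from 336 to 41, so the attention map shrinks by roughly the square of the stride, which is the quadratic reduction the abstract refers to.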
representative citing papers
Olivia harmonizes time series datasets via normalized power spectral density using a Harmonizer module and resonator-based HarmonicAttention, achieving state-of-the-art zero-shot, few-shot, and full-shot forecasting on TSLib, GIFT-Eval, and GluonTS benchmarks.
Introduces the 1GC-7RC benchmark to evaluate AI coding agents on seven diverse ML tasks under single-GPU time and access constraints.
Looped SSMs with shared parameters across depth match or exceed standard SSMs with more parameters on time series classification, with additional gains from input reshaping techniques.
NeuroAtlas benchmarks foundation models on 42 EEG datasets and reports that EEG-specific models do not consistently outperform generic time-series models, standard metrics miss clinical utility, and rankings vary by domain.
SeesawNet dynamically balances common and instance-specific dependencies via ASNA in temporal and channel dimensions, outperforming prior methods on non-stationary forecasting benchmarks.
Introduces the task of counterfactual time series forecasting with textual conditions plus a text-attribution mechanism that improves accuracy by distinguishing mutable from immutable factors.
FactoryBench reveals that frontier LLMs achieve under 50% on structured causal questions and under 18% on decision-making in industrial robotic telemetry.
MELO aggregates base predictors and their multi-scale EWLS adaptations using MLpol to achieve oracle inequalities against best fixed and time-varying predictors in non-stationary settings.
Synthetic data augmentation helps channel-mixing time series models but degrades channel-independent ones, with reliable gains only from seasonal-trend generators and gradual schedules in low-resource settings.
FeDPM learns and aligns local discrete prototypical memories across domains to create a unified discrete latent space for LLM-based time series foundation models in a federated setting.
LatentTSF improves time series forecasting accuracy and representation quality by shifting prediction from observation space to a learned latent state space via autoencoding.
Sundial uses TimeFlow Loss for native pre-training of Transformers on continuous time series from TimeBench, achieving SOTA point and probabilistic forecasting with millisecond inference.
Validation-based selection of inference-time rollout rules for multi-output volatility forecasters yields low-cost improvements over default MIMO deployment and recovers much of ensemble benefit at lower cost.
Optimized Ridge regression with series-specific preprocessing beats prior linear forecasters and exceeds Transformer, MLP, and CNN baselines on six of eight time-series benchmarks.
Retrieval from out-of-domain foundation models enables personalization of a lightweight transformer for stress detection, yielding +3.92% accuracy and +4.76% F1 gains on WESAD without user labels.
MF-Net learns a shared field state and mechanical transition rule from trajectories to deliver competitive forecasting and recoverable relation matrices on Lorenz-96 and real systems.
STaT is a Symbolic-Temporal-Textual Alignment model that integrates three modalities to reduce shape distortion in non-stationary time series forecasting, reporting up to 8.9% gains in magnitude metrics and 8.5% less distortion on eight benchmarks.
ChronoVAE-HOPE proposes a VAE foundation model for time series classification that replaces attention with a HOPE Block dual-memory system and uses disentangled trend-seasonal latent representations, pre-trained on Monash and evaluated on UCR datasets.
DAD4TS trains a diffusion-based generator jointly with a forecaster under RL control and geometric projections to produce augmentation samples that boost accuracy on small-scale time-series data, with validation reported on five of six real-world datasets.
Empirical scaling study of ECG models finds SSL scales robustly while ResNets show 1.3-2.5x better parameter efficiency and SSL up to 16x better data efficiency than supervised baselines on out-of-distribution tasks.
TOA augments attention with learnable sequence-space operators and stochastic regularization to enable signed temporal mixing, yielding gains on forecasting and related benchmarks when added to PatchTST and iTransformer.
MarsTSC is a VLM agentic system with generator, reflector, and modifier roles that iteratively refines a knowledge bank to improve few-shot multimodal time series classification and produce human-readable explanations.
MS-FLOW uses a capacity-limited sparse routing mechanism to model only critical inter-variable dependencies in time series data, achieving state-of-the-art accuracy on 12 benchmarks with fewer but more reliable connections.
citing papers explorer
- From Observations to States: Latent Time Series Forecasting
LatentTSF improves time series forecasting accuracy and representation quality by shifting prediction from observation space to a learned latent state space via autoencoding.
- Sundial: A Family of Highly Capable Time Series Foundation Models
Sundial uses TimeFlow Loss for native pre-training of Transformers on continuous time series from TimeBench, achieving SOTA point and probabilistic forecasting with millisecond inference.
- How Good Can Linear Models Be for Time-Series Forecasting?
Optimized Ridge regression with series-specific preprocessing beats prior linear forecasters and exceeds Transformer, MLP, and CNN baselines on six of eight time-series benchmarks.
- How Do Electrocardiogram Models Scale?
Empirical scaling study of ECG models finds SSL scales robustly while ResNets show 1.3-2.5x better parameter efficiency and SSL up to 16x better data efficiency than supervised baselines on out-of-distribution tasks.
- AlphaCast: A Human Wisdom-LLM Intelligence Co-Reasoning Framework for Interactive Time Series Forecasting
AlphaCast is a training-free LLM framework that performs interactive multi-stage reasoning for time series forecasting by integrating feature extraction, knowledge bases, case libraries, and contextual pools.
- ReNF: Rethinking the Design of Neural Long-Term Time Series Forecasters
ReNF proposes Boosted Direct Output (BDO) and parameter smoothing so a basic temporal MLP outperforms complex state-of-the-art models on long-term time series forecasting benchmarks by implicitly combining forecasts to reduce uncertainty.
- Masked Training for Robust Arrhythmia Detection from Digitalized Multiple Layout ECG Images
PatchECG applies masked patch training and disordered attention to handle asynchronous and partially missing ECG signals from varied layouts, reaching average AUROC 0.835 on simulated conditions and 0.778 on real hospital images for atrial fibrillation.