A decoder-only foundation model for time-series forecasting
Mixed citation behavior. Most common role is background (60%).
abstract
Motivated by recent advances in large language models for Natural Language Processing (NLP), we design a time-series foundation model for forecasting whose out-of-the-box zero-shot performance on a variety of public datasets comes close to the accuracy of state-of-the-art supervised forecasting models for each individual dataset. Our model is based on pretraining a patched-decoder style attention model on a large time-series corpus, and can work well across different forecasting history lengths, prediction lengths and temporal granularities.
representative citing papers
A three-phase DRL framework for personalized portfolio management using a ticker-free encoder pretrained with a time series foundation model, an objective-conditioned MoE actor-critic, and inference-time LoRA adaptation from brokerage data.
GNSS-FM is a self-supervised foundation model for GNSS displacement time series that outperforms task-specific baselines on 90-day forecasting and seismic step localization after pretraining on global station data.
A hybrid of TimesFM and ridge regression on covariates forecasts 1-MeV electron flux with an average R² of 0.9 on out-of-sample 2024 data, outperforming linear regression, CNN, LSTM and Transformer models.
SurF applies the Time Rescaling Theorem as a learnable bijection to create a single generative model for forecasting irregular multivariate event streams that outperforms or matches baselines on six benchmarks.
TimeClaw is an exploratory execution learning system that turns multiple valid tool-use paths into hierarchical distilled experience for improved time-series reasoning without test-time adaptation.
FactoryBench reveals that frontier LLMs achieve under 50% on structured causal questions and under 18% on decision-making in industrial robotic telemetry.
Time series foundation models match the performance of specialized models for day-ahead load forecasting while providing explanations that match domain knowledge on weather and calendar effects.
Chronos pretrains transformer models on tokenized time series to deliver strong zero-shot forecasting across diverse domains.
Aionoscope shows that time-series representations recover coarse signal types reliably but expose dense latent states like phase and amplitude much less reliably, with best dense-probe R² at 0.689 versus oracle 0.999.
A JEPA-based model with domain-informed multi-view self-distillation learns light-curve representations that outperform hand-crafted features on 15 of 16 StarEmbed metrics and adapts competitively to other irregular time-series datasets.
Tyan-WP is a pretrained wind power foundation model that outperforms site-specific TSMs and generic LTSMs in zero-shot ultra-short-term probabilistic forecasting on U.S. and U.K. sites via static embeddings and a PAMF module.
GeoGNN is a two-tower GNN that learns geographic cell embeddings from adjacency graphs and matches them to temporal representations via dot-product similarity plus classification, improving geolocalization accuracy by ~27% on electricity datasets.
GITCO delivers +1.95% average MASE reduction on TimesFM 2.5 across 53 datasets by gated inference-time suppression of anomalous patches, capturing 89.9% of the improvement upper bound.
Attentive Neural Processes outperform Gaussian Processes and neural networks on light curve interpolation quality, feature recovery, calibration, and speed for 15 transient classes under realistic Rubin cadences.
ChronoVAE-HOPE proposes a VAE foundation model for time series classification that replaces attention with a HOPE Block dual-memory system and uses disentangled trend-seasonal latent representations, pre-trained on Monash and evaluated on UCR datasets.
MILM fine-tunes LLMs on XML-encoded multimodal irregular time series via a two-stage process that exploits informative sampling patterns to achieve top performance on EHR classification datasets.
RareCP improves interval efficiency for time series conformal prediction by retrieving and weighting regime-specific calibration examples while adapting to drift and maintaining coverage.
S4 models exhibit stable time-continuity unlike sensitive S6 models, with task continuity predicting performance and enabling temporal subsampling for better efficiency.
Foundation models outperform dataset-specific machine learning in energy time series forecasting across 54 datasets in 9 categories.
BLF achieves state-of-the-art binary forecasting on ForecastBench by using linguistic belief states updated in tool-use loops, hierarchical multi-trial logit averaging, and hierarchical Platt scaling calibration.
LASS-ODE-Power is a pretrained model that predicts power-system dynamic trajectories across regimes in a zero-shot manner after large-scale ODE pretraining and targeted fine-tuning.
MICA adapts infini compressive attention to the channel dimension, enabling scalable cross-channel dependencies in Transformers and cutting forecast error by 5.4% on average versus channel-independent baselines.
DynLMC creates synthetic time series data with dynamic inter-channel correlations that improve zero-shot forecasting in foundation models across multiple benchmarks.
citing papers explorer
- PolyBench: Benchmarking LLM Forecasting and Trading Capabilities on Live Prediction Market Data
  Only two of seven LLMs produce positive returns on live Polymarket data, with MiMo-V2-Flash at 17.6% CWR and Gemini-3-Flash at 6.2% CWR while the other five lose money.
- TimeClaw: A Time-Series AI Agent with Exploratory Execution Learning
  TimeClaw is an exploratory execution learning system that turns multiple valid tool-use paths into hierarchical distilled experience for improved time-series reasoning without test-time adaptation.
- FactoryBench: Evaluating Industrial Machine Understanding
  FactoryBench reveals that frontier LLMs achieve under 50% on structured causal questions and under 18% on decision-making in industrial robotic telemetry.
- Explainable Load Forecasting with Covariate-Informed Time Series Foundation Models
  Time series foundation models match the performance of specialized models for day-ahead load forecasting while providing explanations that match domain knowledge on weather and calendar effects.
- A Quantum Inspired Variational Kernel and Explainable AI Framework for Cross Region Solar and Wind Energy Forecasting
  A hybrid classical-plus-quantum-inspired framework for cross-region renewable energy forecasting matches top baselines within 1% accuracy and separates calm versus stormy conditions with a 15-fold higher Fisher discriminant ratio than a tuned radial basis kernel.
- Degradation-aware Predictive Energy Management for Fuel Cell-Battery Ship Power System with Data-driven Load Forecasting
  A degradation-aware predictive controller for hybrid ship power systems reduces hydrogen consumption by up to 5.8% and fuel cell degradation by up to 36.4% versus a filter-based benchmark on real harbor tug data.