pith. machine review for the scientific record.

arxiv: 2506.10630 · v2 · submitted 2025-06-12 · 💻 cs.LG · cs.AI

Recognition: unknown

Time Series Forecasting as Reasoning: A Slow-Thinking Approach with Reinforced LLMs

Yitong Zhou, Yucong Luo, Mingyue Cheng, Qi Liu, Jiahao Wang, Daoyu Wang, Enhong Chen

classification 💻 cs.LG cs.AI
keywords: reasoning · time series · forecasting · llms · thinking · ability · approach
original abstract

To advance time series forecasting (TSF), various methods have been proposed to improve prediction accuracy, evolving from statistical techniques to data-driven deep learning architectures. Despite their effectiveness, most existing methods still adhere to a fast-thinking paradigm: they extract historical patterns and map them directly to future values as their core modeling philosophy, lacking an explicit thinking process that incorporates intermediate time series reasoning. Meanwhile, emerging slow-thinking LLMs (e.g., OpenAI-o1) have shown remarkable multi-step reasoning capabilities, offering an alternative way to overcome these issues. However, prompt engineering alone has several limitations, including high computational cost, privacy risks, and limited capacity for in-depth, domain-specific time series reasoning. To address these limitations, a more promising approach is to train LLMs to develop slow-thinking capabilities and acquire strong time series reasoning skills. For this purpose, we propose Time-R1, a two-stage reinforcement fine-tuning framework designed to enhance the multi-step reasoning ability of LLMs for time series forecasting. Specifically, the first stage conducts supervised fine-tuning for warm-up adaptation, while the second stage employs reinforcement learning to improve the model's generalization ability. In particular, we design a fine-grained multi-objective reward tailored to time series forecasting, and then introduce GRIP (group-based relative importance for policy optimization), which leverages non-uniform sampling to further encourage and optimize the model's exploration of effective reasoning paths. Experiments demonstrate that Time-R1 significantly improves forecasting performance across diverse datasets.
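The abstract sketches GRIP only at a high level (group-based relative importance with non-uniform sampling over reasoning paths). A minimal, hypothetical sketch of the two pieces one might expect, assuming a GRPO-style group-relative baseline and reward-weighted softmax sampling; the exact weighting scheme and reward composition in Time-R1 are not specified here, so all function names and formulas below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def group_relative_advantages(rewards):
    """Group-relative baseline (assumption: GRPO-style normalization).

    Each sampled reasoning path's advantage is its reward minus the
    group mean, scaled by the group standard deviation, so paths are
    compared only against other paths for the same series.
    """
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-8)

def nonuniform_sample(rewards, k, temperature=1.0, rng=None):
    """One plausible reading of 'non-uniform sampling': draw k paths
    with probability increasing in reward (softmax over rewards),
    biasing policy updates toward more effective reasoning paths."""
    rng = rng if rng is not None else np.random.default_rng(0)
    r = np.asarray(rewards, dtype=float)
    logits = r / temperature
    p = np.exp(logits - logits.max())  # subtract max for stability
    p /= p.sum()
    return rng.choice(len(r), size=k, replace=False, p=p)
```

For example, for a group of four sampled forecasts with rewards `[0.2, 0.9, 0.4, 0.7]`, the advantages are zero-mean within the group, and high-reward paths are sampled far more often than low-reward ones.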

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. CastFlow: Learning Role-Specialized Agentic Workflows for Time Series Forecasting

    cs.LG 2026-04 unverdicted novelty 7.0

    CastFlow introduces a role-specialized agentic workflow with memory retrieval and multi-view toolkit for iterative ensemble time series forecasting, using two-stage SFT+RLVR training on a domain-specific LLM to outper...

  2. LLaTiSA: Towards Difficulty-Stratified Time Series Reasoning from Visual Perception to Semantics

    cs.AI 2026-04 unverdicted novelty 7.0

    LLaTiSA is a vision-language model trained on a new 83k-sample hierarchical time series reasoning dataset that shows superior performance and out-of-distribution generalization on stratified TSR tasks.

  3. GeoDecider: A Coarse-to-Fine Agentic Workflow for Explainable Lithology Classification

    cs.AI 2026-05 unverdicted novelty 6.0

    GeoDecider introduces a coarse-to-fine agentic workflow using LLMs for explainable lithology classification from well logs, combining a base classifier, tool-augmented reasoning, and geological refinement to outperfor...

  4. GeoMind: An Agentic Workflow for Lithology Classification with Reasoned Tool Invocation

    cs.AI 2026-04 unverdicted novelty 6.0

    GeoMind applies an agentic workflow with tool-augmented modules and process supervision to outperform static models on lithology classification from well logs while producing traceable decisions.

  5. TimeRFT: Stimulating Generalizable Time Series Forecasting for TSFMs via Reinforcement Finetuning

    eess.SP 2026-04 unverdicted novelty 5.0

    TimeRFT applies reinforcement learning with multi-faceted step-wise rewards and informative sample selection to improve generalization and accuracy in TSFM adaptation beyond supervised fine-tuning.