Plan before Solving: Problem-Aware Strategy Routing for Mathematical Reasoning with LLMs

Feng Tian; Jian Zhang; Jie Ma; Jun Liu; Lingling Zhang; Shihao Qi; Tongliang Liu; Ziang Yin

arxiv: 2509.24377 · v1 · pith:44R26G4Gnew · submitted 2025-09-29 · 💻 cs.AI

Shihao Qi, Jie Ma, Ziang Yin, Lingling Zhang, Jian Zhang, Jun Liu, Feng Tian, Tongliang Liu

classification 💻 cs.AI

keywords: reasoning, mathematical, strategy, routing, across, prism, adaptive, approach

Abstract

Existing methods usually rely on a fixed strategy, such as natural language reasoning, code-augmented reasoning, tool-integrated reasoning, or ensemble-based reasoning, to guide Large Language Models (LLMs) in mathematical reasoning. Our analysis reveals that a single strategy cannot adapt to problem-specific requirements and thus overlooks the trade-off between effectiveness and efficiency. To address these issues, we propose Planning and Routing through Instance-Specific Modeling (PRISM), a novel framework that decouples mathematical reasoning into two stages: strategy planning and targeted execution. Specifically, we first curate a multi-strategy preference dataset, MathStrat, that captures correctness, process quality, and computational efficiency for each problem-strategy pair. Then, we train a lightweight Strategy Adapter on this dataset to obtain confidence distributions over the four reasoning strategies listed above. At inference time, an adaptive routing policy tailors the reasoning approach to the predictor's confidence: single-strategy execution for high-confidence predictions, dual-strategy verification for competitive scenarios, and comprehensive multi-strategy exploration for uncertain cases. Extensive experiments across five mathematical reasoning benchmarks demonstrate that PRISM consistently outperforms individual strategies and ensemble baselines, achieving improvements ranging from 0.9% to 7.6% across different base models. The adaptive routing approach shows particularly strong benefits for mathematical reasoning tasks across diverse model architectures. Our code is released at https://github.com/reml-group/PRISM.
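
The three-way routing described in the abstract can be illustrated with a minimal sketch. This is not the PRISM implementation: the function name `route`, the strategy labels, the two thresholds, and the rule used to decide what counts as a "competitive" case are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import Dict, List

# Hypothetical thresholds; the abstract does not state the values PRISM uses.
HIGH_CONFIDENCE = 0.70  # top strategy alone is trusted at or above this
PAIR_MASS = 0.75        # top two strategies together count as "competitive"


def route(confidences: Dict[str, float],
          high: float = HIGH_CONFIDENCE,
          pair_mass: float = PAIR_MASS) -> List[str]:
    """Return the strategies to execute, ordered by descending confidence.

    `confidences` maps a strategy name to the adapter's predicted confidence.
    The decision rule below is an assumed stand-in for the paper's policy.
    """
    if len(confidences) < 2:
        raise ValueError("need at least two strategies to route between")

    ranked = sorted(confidences, key=lambda name: confidences[name], reverse=True)
    top, second = ranked[0], ranked[1]

    # High-confidence prediction: run the single best strategy.
    if confidences[top] >= high:
        return [top]

    # Competitive scenario: two strategies hold most of the probability mass,
    # so run both and let one verify the other.
    if confidences[top] + confidences[second] >= pair_mass:
        return [top, second]

    # Uncertain case: explore every strategy.
    return ranked
```

Under these assumed thresholds, a distribution such as 0.85 / 0.05 / 0.05 / 0.05 selects one strategy, 0.45 / 0.40 / 0.10 / 0.05 selects the top two, and a nearly flat 0.30 / 0.28 / 0.22 / 0.20 selects all four. Running more strategies costs more compute, which is the effectiveness-efficiency trade-off the abstract refers to.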

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Beyond Accuracy: Evaluating Strategy Diversity in LLM Mathematical Reasoning
cs.AI 2026-05 unverdicted novelty 7.0

Frontier LLMs achieve 95-100% accuracy on AMC/AIME problems but recover far fewer distinct valid strategies than human references, while collectively generating 50 novel strategies.

When Do LLMs Reason? A Dynamical Systems View via Entropy Phase Transitions
cs.LG 2026-05 unverdicted novelty 5.0

Early entropy dynamics during LLM decoding mark when explicit reasoning becomes beneficial, enabling the training-free EDRM router that selects strategies per instance and yields 41-55% token savings with accuracy gai...