Post-Reasoning boosts LLM accuracy by reversing the usual answer-after-reasoning order, delivering mean relative gains of 17.37% across 117 model-benchmark pairs with zero extra cost.
Exploring the System 1 Thinking Capability of Large Reasoning Models
3 Pith papers cite this work. Polarity classification is still indexing.
abstract
This paper explores the system 1 thinking capability of Large Reasoning Models (LRMs), the intuitive ability to respond efficiently with minimal token usage. While existing LRMs rely on long-chain reasoning and excel at complex tasks, their system 1 thinking ability remains largely underexplored. This capability is essential as it reflects models' difficulty awareness and reasoning efficiency, both critical for real-world applications. We propose S1-Bench, a multi-domain, multilingual benchmark comprising model-simple system 1 questions. Our investigation of 28 LRMs reveals under-accuracy and inefficiency on system 1 problems. We find existing efficient reasoning methods either generalize poorly to simple questions or sacrifice performance for efficiency. Further exploration uncovers LRMs' early difficulty awareness accompanied by lower confidence, and shows that problem difficulty is implicitly encoded in hidden states.
citation-role summary
citation-polarity summary
roles
background 1polarities
background 1representative citing papers
CoRD uses collaborative multi-teacher step-wise decoding with perplexity-guided beam search to generate higher-quality Long-CoT data that lets smaller models reach near-teacher performance with less supervision.
A survey organizing techniques to achieve efficient reasoning in LLMs by shortening chain-of-thought outputs.
citing papers explorer
-
Post Reasoning: Improving the Performance of Non-Thinking Models at No Cost
Post-Reasoning boosts LLM accuracy by reversing the usual answer-after-reasoning order, delivering mean relative gains of 17.37% across 117 model-benchmark pairs with zero extra cost.
-
Distilling Long-CoT Reasoning through Collaborative Step-wise Multi-Teacher Decoding
CoRD uses collaborative multi-teacher step-wise decoding with perplexity-guided beam search to generate higher-quality Long-CoT data that lets smaller models reach near-teacher performance with less supervision.
-
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
A survey organizing techniques to achieve efficient reasoning in LLMs by shortening chain-of-thought outputs.