Large Language Models have Intrinsic Self-Correction Ability
Large language models (LLMs) have attracted significant attention for their exceptional abilities in various natural language processing tasks, but they suffer from hallucinations that degrade performance. One promising way to improve LLM performance is to ask the model to revise its answer after generation, a technique known as self-correction. Of the two types of self-correction, intrinsic self-correction is considered a promising direction because it does not use external knowledge. However, recent works doubt that LLMs can perform intrinsic self-correction. In this paper, we present a novel perspective on the intrinsic self-correction capabilities of LLMs through theoretical analyses and empirical experiments. In addition, we identify two critical factors for successful self-correction: zero temperature and fair prompts. Leveraging these factors, we demonstrate that intrinsic self-correction ability is exhibited across multiple existing LLMs. Our findings offer insights into the fundamental theories underlying the self-correction behavior of LLMs and highlight the importance of unbiased prompts and zero-temperature settings in harnessing their full potential.
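The two factors named in the abstract, zero temperature and a fair prompt, can be illustrated with a minimal sketch. Everything here is an assumption for illustration and not the paper's implementation: the `Generate` callable stands in for any LLM client, `FAIR_REVIEW_PROMPT` is one possible wording of a review prompt that does not presuppose an error, and `toy_model` is a deterministic stub used only so the example runs.

```python
from typing import Callable, List

# Hypothetical model interface: (prompt, temperature) -> completion text.
Generate = Callable[[str, float], str]

# Assumed wording of a "fair" review prompt: it does not presuppose an error,
# so the model is not nudged toward changing a correct answer.
FAIR_REVIEW_PROMPT = (
    "Review your previous answer. If it is correct, repeat it unchanged. "
    "If it contains an error, give the corrected answer."
)


def self_correct(generate: Generate, question: str, rounds: int = 1) -> List[str]:
    """Return the initial answer followed by one revised answer per round.

    Every call uses temperature 0.0, and no external knowledge or feedback
    is added to the prompt (intrinsic self-correction).
    """
    answers = [generate(question, 0.0)]
    for _ in range(rounds):
        prompt = (
            f"Question: {question}\n"
            f"Previous answer: {answers[-1]}\n"
            f"{FAIR_REVIEW_PROMPT}"
        )
        answers.append(generate(prompt, 0.0))
    return answers


def toy_model(prompt: str, temperature: float) -> str:
    """Deterministic stand-in for an LLM, for demonstration only."""
    if "Previous answer: 5" in prompt:
        return "4"  # the stub revises its wrong answer
    if "Previous answer: 4" in prompt:
        return "4"  # the stub keeps a correct answer
    return "5"  # deliberately wrong first attempt


history = self_correct(toy_model, "What is 2 + 2?", rounds=2)
print(history)
```

To try this against a real model, replace `toy_model` with a wrapper around an actual client that forwards the prompt and the temperature value.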
Forward citations
Cited by 5 Pith papers
- How LLMs Detect and Correct Their Own Errors: The Role of Internal Confidence Signals
  LLMs implement a second-order confidence architecture where the PANL activation encodes both error likelihood and the ability to correct it, beyond verbal confidence or log-probabilities.
- The Illusion of Insight in Reasoning Models
  Mid-reasoning shifts in reasoning models are rare symptoms of unstable inference that seldom improve accuracy and do not reflect intrinsic self-correction.
- Unlocking Zero-Shot Geospatial Reasoning via Indirect Rewards
  Geo-R1 uses indirect proxy rewards from cross-view alignment with geolocation metadata to drive reinforcement learning, enabling zero-shot geospatial reasoning that transfers across 25+ tasks and sometimes exceeds sup...
- Evidence-Supported Credit Risk Report Generation Using News-Centric Financial Knowledge Graphs
  FinKG-News constructs news-centric financial knowledge graphs to support in-context learning for credit risk report generation across three dimensions, claiming 19-34% quality gains and fewer hallucinations than baselines.
- The Future of Facts: Tracing the Factual Generation-Verification Gap
  Empirical tracing across model families shows verification precedes and outlasts generation for facts, with updates producing simultaneous verification of old and new answers.