When can LLMs actually correct their own mistakes? A critical survey of self-correction of LLMs

Ryo Kamoi, Yusen Zhang, Nan Zhang, Jiawei Han, Rui Zhang · 2024 · DOI 10.1162/tacl_a_00713

6 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

RubricRefine: Improving Tool-Use Agent Reliability with Training-Free Pre-Execution Refinement

cs.LG · 2026-05-10 · unverdicted · novelty 7.0 · 3 refs

RubricRefine is a training-free pre-execution method that creates rubrics to score and fix inter-tool contract violations in agent code, reaching 0.86 average on M3ToolEval across seven models with zero executions and lower latency.

Guiding LLM-based Loop Invariant Synthesis via Feedback on Local Reasoning Errors

cs.PL · 2026-05-18 · unverdicted · novelty 6.0

LORIS detects local reasoning errors in LLM-generated proofs for loop invariants by translating natural-language steps to first-order logic implications and using invalid implications to refine the invariants, achieving 93.1% success on 460 C programs.

Weighted Rules under the Stable Model Semantics

cs.AI · 2026-05-10 · unverdicted · novelty 6.0

Weighted rules extend stable model semantics to support probabilistic reasoning, model ranking, and statistical inference in answer set programs.

ReflectCAP: Detailed Image Captioning with Reflective Memory

cs.AI · 2026-04-14 · unverdicted · novelty 6.0

ReflectCAP distills model-specific hallucination and oversight patterns into Structured Reflection Notes that steer LVLMs toward more factual and complete image captions, reaching the Pareto frontier on factuality-coverage trade-offs.

Correction and Corruption: A Two-Rate View of Error Flow in LLM Protocols

cs.LG · 2026-04-20 · unverdicted · novelty 5.0

A two-rate measurement (correction c and corruption γ) for LLM protocol steps predicts accuracy changes from paired correctness bits and flags three failure modes including mixture shift on GSM8K.

sciwrite-lint: Verification Infrastructure for the Age of Science Vibe-Writing

cs.DL · 2026-04-09

citing papers explorer

RubricRefine: Improving Tool-Use Agent Reliability with Training-Free Pre-Execution Refinement · cs.LG · 2026-05-10 · unverdicted · none · ref 13 · 3 links
Guiding LLM-based Loop Invariant Synthesis via Feedback on Local Reasoning Errors · cs.PL · 2026-05-18 · unverdicted · none · ref 21
Weighted Rules under the Stable Model Semantics · cs.AI · 2026-05-10 · unverdicted · none · ref 64
ReflectCAP: Detailed Image Captioning with Reflective Memory · cs.AI · 2026-04-14 · unverdicted · none · ref 13
Correction and Corruption: A Two-Rate View of Error Flow in LLM Protocols · cs.LG · 2026-04-20 · unverdicted · none · ref 7
sciwrite-lint: Verification Infrastructure for the Age of Science Vibe-Writing · cs.DL · 2026-04-09 · unreviewed · ref 19