Learning to Edit Knowledge via Instruction-based Chain-of-Thought Prompting

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

Large language models (LLMs) can effectively handle outdated information through knowledge editing. However, current approaches face two key limitations: (I) Poor generalization: most approaches rigidly inject new knowledge without ensuring that the model can use it effectively to solve practical problems. (II) Narrow scope: current methods focus primarily on structured fact triples, overlooking the diverse unstructured forms of factual information (e.g., news, articles) prevalent in real-world contexts. To address these challenges, we propose a new paradigm: teaching LLMs to edit knowledge via Chain-of-Thought (CoT) reasoning (CoT2Edit). We first use language model agents to generate CoTs for both structured and unstructured edited data, building high-quality instruction data. The model is then trained to reason over edited knowledge through supervised fine-tuning (SFT) and Group Relative Policy Optimization (GRPO). At inference time, we integrate Retrieval-Augmented Generation (RAG) to dynamically retrieve relevant edited facts for real-time knowledge editing. Experimental results demonstrate that our method achieves strong generalization across six diverse knowledge editing scenarios with just a single round of training on three open-source language models. The code is available at https://github.com/FredJDean/CoT2Edit.
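
The inference-time step described in the abstract (retrieve relevant edited facts, then reason over them) can be illustrated with a minimal sketch. This is not the CoT2Edit implementation: the function names (`retrieve_edits`, `build_prompt`), the Jaccard token-overlap scorer, the `min_score` threshold, the prompt wording, and the example facts are all assumptions made for illustration. A real system would more likely use a dense retriever.

```python
import re


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def jaccard(a, b):
    """Token-set overlap between two strings (a stand-in for a real retriever)."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def retrieve_edits(question, edit_memory, top_k=1, min_score=0.2):
    """Return up to top_k edited facts whose overlap with the question
    reaches min_score. The threshold is a hypothetical choice."""
    scored = sorted(
        ((jaccard(question, fact), fact) for fact in edit_memory),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [fact for score, fact in scored[:top_k] if score >= min_score]


def build_prompt(question, edit_memory, top_k=1):
    """Prepend retrieved edits to the question and ask for step-by-step
    reasoning. The prompt wording is illustrative only."""
    facts = retrieve_edits(question, edit_memory, top_k=top_k)
    lines = []
    if facts:
        lines.append("Updated facts:")
        lines.extend(f"- {fact}" for fact in facts)
    lines.append(f"Question: {question}")
    lines.append("Think step by step using the updated facts, then answer.")
    return "\n".join(lines)


# Hypothetical edit memory; these facts are made up for the example.
edit_memory = [
    "The capital of Freedonia is now Zubrowka City.",
    "Alice Example became CEO of Acme Corp in 2025.",
]

print(build_prompt("Who is the CEO of Acme Corp?", edit_memory))
```

Under these assumptions, a question with no sufficiently similar edit yields a prompt with no "Updated facts" section, so the model falls back on its own knowledge.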

fields

cs.CV 1

years

2026 1

verdicts

UNVERDICTED 1

representative citing papers

CRANE: Knowledge Editing for Reasoning MLLMs

cs.CV · 2026-06-08 · unverdicted · novelty 7.0

CRANE uses dual-library retrieval plus two-phase training (SFT then GRPO with cognitive routing reward) to reach 96.9% grounded success on conflict edits in reasoning MLLMs.

citing papers explorer

Showing 1 of 1 citing paper.

  • CRANE: Knowledge Editing for Reasoning MLLMs · cs.CV · 2026-06-08 · unverdicted · none · ref 22 · internal anchor
