TextReg: Mitigating Prompt Distributional Overfitting via Regularized Text-Space Optimization

2026 · cs.CL · arXiv 2605.21318

2 Pith papers cite this work. Polarity classification is still indexing.

abstract

Large language models (LLMs) are highly sensitive to the prompts used to specify task objectives and behavioral constraints. Many recent prompt optimization methods iteratively rewrite prompts using LLM-generated feedback, but the resulting prompts often become longer, accumulate narrow sample-specific rules, and generalize poorly beyond the training distribution. We study this failure mode as prompt distributional overfitting and argue that it reflects a lack of representation control in discrete text-space optimization. We formalize this view through representational inefficiency, a dual-factor measure that decomposes prompt inefficiency into capacity cost and scope narrowness, attributing prompt distributional overfitting to their coupled growth during optimization. We propose TextReg, a regularization framework that realizes a soft-penalty objective through regularized textual gradients, combining Dual-Evidence Gradient Purification, Semantic Edit Regularization, and Regularization-Guided Prompt Update. Across multiple reasoning benchmarks, TextReg substantially improves out-of-distribution (OOD) generalization, with accuracy gains of up to +11.8% over TextGrad and +16.5% over REVOLVE.
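To make the idea of a soft-penalty objective concrete, here is a minimal sketch of selecting among candidate prompts by task loss plus two penalties that loosely mirror the abstract's "capacity cost" and "scope narrowness". It is an illustration only, not the TextReg method: the `Candidate` fields, the word-budget proxy, the penalty weights, and the function names are all assumptions made for this example, and the abstract does not specify how either factor is actually measured.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    prompt: str
    task_loss: float         # assumed given, e.g. 1 - accuracy on a training batch
    scope_narrowness: float  # hypothetical score in [0, 1] for sample-specific rules


def capacity_cost(prompt: str, budget_words: int = 50) -> float:
    """Toy proxy (an assumption): words beyond a budget, relative to the budget."""
    n_words = len(prompt.split())
    return max(0, n_words - budget_words) / budget_words


def regularized_score(c: Candidate, lam_capacity: float = 0.1,
                      lam_scope: float = 0.1) -> float:
    """Task loss plus soft penalties; lower is better."""
    return (c.task_loss
            + lam_capacity * capacity_cost(c.prompt)
            + lam_scope * c.scope_narrowness)


def select_prompt(candidates, lam_capacity: float = 0.1, lam_scope: float = 0.1):
    """Return the candidate with the lowest regularized score."""
    return min(candidates,
               key=lambda c: regularized_score(c, lam_capacity, lam_scope))


# A short general prompt versus a long, rule-heavy one with a lower training loss.
short = Candidate("Solve the problem step by step.", task_loss=0.20,
                  scope_narrowness=0.1)
long_ = Candidate(" ".join(["rule"] * 150), task_loss=0.15,
                  scope_narrowness=0.9)

best_regularized = select_prompt([short, long_])
best_unregularized = select_prompt([short, long_], lam_capacity=0.0, lam_scope=0.0)
```

With both weights at zero the longer prompt wins on training loss alone; with the penalties switched on, the shorter and more general prompt is preferred. That trade-off is the behavior a soft penalty is meant to encode, whatever concrete measures are used.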

representative citing papers

Easier to Mislead Than to Correct: Harmful and Beneficial Revision in LLM Conformity

cs.CL · 2026-06-01 · unverdicted · novelty 6.0

Peer agreement misleads initially correct LLMs more than it corrects initially wrong ones, with authority labels biasing choices independently of accuracy and reasoning prompts failing to mitigate the asymmetry.

Why Prompt Optimization Works, and Why It Sometimes Doesn't: A Causal-Inspired Edit-Level Analysis

cs.CL · 2026-05-26 · unverdicted · novelty 6.0

Observational causal-inspired analysis finds prompt optimization failures arise from systematic interactions between edit families and task characteristics rather than random artifacts.

citing papers explorer

Easier to Mislead Than to Correct: Harmful and Beneficial Revision in LLM Conformity

cs.CL · 2026-06-01 · unverdicted · none · ref 46 · internal anchor

Why Prompt Optimization Works, and Why It Sometimes Doesn't: A Causal-Inspired Edit-Level Analysis

cs.CL · 2026-05-26 · unverdicted · none · ref 16 · internal anchor
