Structured Prompt Optimization Meets Reinforcement Learning for Global and Local Interpretability over Complex Text

Leman Akoglu; Pierre Jinghong Liang; Tianyang Zhou; Wenbo Chen

arxiv: 2605.29076 · v1 · pith:GE64MTT3new · submitted 2026-05-27 · 💻 cs.CL · cs.AI · cs.LG

keywords: reasoning, text, learning, optimization, prompt, classification, compact, complex

LLMs have advanced text classification, yet existing paradigms face a trade-off: supervised (label-only) fine-tuning is scalable but offers limited reasoning on complex text and lacks broader model transparency, while discrete prompt optimization offers human-readable instructions but struggles with performance and scalability. We introduce eXTC (eXplainable Text Classifier) with three progressive stages: (1) learning a Standard Operating Procedure (SOP, or rulebook) in natural language via a new Structured Prompt Optimization algorithm; (2) SOP-grounded reasoning distillation from a large teacher LLM into a compact LM; and (3) expanding reasoning capabilities beyond the initial SOP via reinforcement learning. This design enables eXTC to provide (i) fast inference via a compact LM and (ii) inference-time local reasoning traces alongside a global, modular explanation of its learned domain rules, while (iii) significantly outperforming existing paradigms across diverse benchmarks in both classification performance and explanation quality, with gains at each stage.
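
To make the idea of a natural-language SOP serving as a global, modular explanation more concrete, the following is a minimal sketch of how such a rulebook might be represented and rendered into a classification prompt. It is not the authors' implementation: the `Rule`, `SOP`, and `build_prompt` names, the prompt wording, and the support-ticket example are all hypothetical, and the sketch does not cover the optimization, distillation, or reinforcement learning stages.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    """One natural-language rule in the rulebook (hypothetical structure)."""
    name: str
    text: str


@dataclass
class SOP:
    """A modular rulebook: an ordered list of independently editable rules."""
    task: str
    rules: list[Rule] = field(default_factory=list)

    def render(self) -> str:
        # Render the rulebook as numbered, human-readable text.
        lines = [f"Task: {self.task}", "Rules:"]
        lines += [f"{i}. [{r.name}] {r.text}" for i, r in enumerate(self.rules, 1)]
        return "\n".join(lines)


def build_prompt(sop: SOP, document: str) -> str:
    # Assumed prompt layout: rulebook first, then the document, then a
    # request for a reasoning trace that cites the rules it relies on.
    return (
        f"{sop.render()}\n\n"
        f"Document:\n{document}\n\n"
        "Cite the rules you apply, then give a final label."
    )


# Illustrative rulebook; the task and rules are made up for this sketch.
sop = SOP(
    task="Decide whether a support ticket is urgent.",
    rules=[
        Rule("outage", "Label as urgent if the ticket reports a service outage."),
        Rule("billing", "Label as not urgent if it only asks about an invoice."),
    ],
)
prompt = build_prompt(sop, "Our dashboard has been down since 9am.")
```

Keeping each rule as a separate named entry is one way to make the rulebook readable on its own (the global explanation) and citable from a per-document reasoning trace (the local explanation); how eXTC actually structures its SOP is not specified in the abstract.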
