On IFEval, thinking mode in Qwen3 models improves class-level performance on planning constraints but worsens it on precision constraints, with 10-20% of prompts flipping and directionally consistent results in Hunyuan models.
Yongchan Kwon, Shang Zhu, Federico Bianchi, Kaitlyn Zhou, and James Zou
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
- Prompt Governance? On Governing Technologies Governed by Natural Language
Literature on system prompts for AI shows fragmented and contradictory claims that complicate policy efforts to use them as reliable governance mechanisms.