A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT
Pith reviewed 2026-05-15 07:04 UTC · model grok-4.3
The pith
A catalog of prompt patterns provides reusable solutions to common problems in LLM conversations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that prompt patterns are reusable solutions to common problems in LLM output generation and interaction, enabling systematic prompt engineering through a documented catalog that can be adapted across domains.
What carries the argument
The prompt pattern: a structured template that provides a reusable solution to a recurring problem in LLM prompting, analogous to a software design pattern.
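The analogy can be sketched as a minimal template structure. The class name, fields, and example wording below are illustrative assumptions, not the paper's own notation:

```python
from dataclasses import dataclass

@dataclass
class PromptPattern:
    """A reusable prompt pattern: a named intent plus an ordered set of
    contextual statements, loosely mirroring the intent/structure
    sections of a software design pattern."""
    name: str
    intent: str
    statements: list  # ordered fundamental contextual statements

    def render(self, task: str) -> str:
        # Combine the pattern's fixed statements with the
        # task-specific instruction into a single prompt string.
        return "\n".join(self.statements + [task])

# Example: a "Persona"-style pattern (wording is ours, for illustration).
persona = PromptPattern(
    name="Persona",
    intent="Have the LLM answer from a given point of view.",
    statements=["Act as a security reviewer.",
                "Provide outputs that such a reviewer would produce."],
)
print(persona.render("Review this function for injection flaws."))
```

The point of the template is that the fixed statements capture the reusable part of the solution, while the task slot varies per conversation.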
If this is right
- Prompts can be constructed by combining multiple patterns to handle complex tasks.
- The catalog enables knowledge transfer of effective prompting strategies.
- Patterns support automation of software development tasks using LLMs.
- Patterns can enforce specific qualities (and quantities) of generated output.
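The first consequence above, composing prompts from multiple patterns, can be sketched as simple concatenation of pattern bodies. The helper and the pattern wording are hypothetical, standing in for the paper's combination examples:

```python
def combine(patterns, task):
    """Concatenate the contextual statements of several prompt patterns,
    then append the task; later patterns refine earlier ones, mirroring
    the idea that prompts can be built from multiple patterns."""
    statements = []
    for pattern in patterns:
        statements.extend(pattern)
    return "\n".join(statements + [task])

# Illustrative pattern bodies (wording is ours, not the paper's).
persona = ["Act as a database administrator."]
output_format = ["Output only SQL, with no explanation."]

prompt = combine([persona, output_format],
                 "Create a table for customer orders.")
```

Here a persona pattern and an output-constraint pattern stack into one prompt, each contributing its own reusable fragment.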
Where Pith is reading between the lines
- Adopting these patterns could standardize prompt engineering practices across teams.
- Future tools might automatically suggest or generate prompts based on the catalog.
- The framework could extend to other AI systems beyond language models.
- Testing the patterns on emerging LLMs would validate their broad applicability.
Load-bearing premise
The documented patterns will transfer effectively to new domains, tasks, and different large language models.
What would settle it
A controlled experiment showing that prompts built with the catalog produce no better results than ad-hoc prompts on a new set of tasks would falsify the claim.
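Such an experiment could be sketched as a paired comparison over a shared task set. The prompt builders and the scoring function below are placeholders; a real study would score actual LLM outputs against task-specific criteria:

```python
import statistics

def compare(tasks, build_catalog_prompt, build_adhoc_prompt, score):
    """For each task, score a catalog-built prompt against an ad-hoc
    prompt and return the mean score difference. A difference near zero
    across a new task set would undercut the catalog's claimed benefit."""
    diffs = [score(build_catalog_prompt(t)) - score(build_adhoc_prompt(t))
             for t in tasks]
    return statistics.mean(diffs)

# Placeholder scorer: counts prompt lines purely for illustration;
# the structured prompt has one extra persona line per task.
delta = compare(
    tasks=["task-a", "task-b"],
    build_catalog_prompt=lambda t: f"Act as an expert.\n{t}",
    build_adhoc_prompt=lambda t: t,
    score=lambda prompt: len(prompt.splitlines()),
)
```

Swapping the placeholder scorer for output-quality judgments (human ratings, test pass rates) would turn this skeleton into the controlled comparison the claim needs.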
read the original abstract
Prompt engineering is an increasingly important skill set needed to converse effectively with large language models (LLMs), such as ChatGPT. Prompts are instructions given to an LLM to enforce rules, automate processes, and ensure specific qualities (and quantities) of generated output. Prompts are also a form of programming that can customize the outputs and interactions with an LLM. This paper describes a catalog of prompt engineering techniques presented in pattern form that have been applied to solve common problems when conversing with LLMs. Prompt patterns are a knowledge transfer method analogous to software patterns since they provide reusable solutions to common problems faced in a particular context, i.e., output generation and interaction when working with LLMs. This paper provides the following contributions to research on prompt engineering that apply LLMs to automate software development tasks. First, it provides a framework for documenting patterns for structuring prompts to solve a range of problems so that they can be adapted to different domains. Second, it presents a catalog of patterns that have been applied successfully to improve the outputs of LLM conversations. Third, it explains how prompts can be built from multiple patterns and illustrates prompt patterns that benefit from combination with other prompt patterns.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents a catalog of prompt patterns for enhancing interactions with LLMs like ChatGPT. It outlines a framework for documenting these patterns, provides examples of individual patterns and their combinations, and positions them as reusable solutions analogous to software design patterns for common problems in prompt engineering.
Significance. If the patterns prove effective, the work could provide a practical knowledge transfer mechanism for prompt engineering, helping developers and users structure prompts more systematically. The analogy to software patterns is apt and the framework for documentation is a useful contribution, though the absence of rigorous evaluation metrics means the significance is primarily in organization and illustration rather than proven efficacy.
major comments (2)
- [Abstract] The claim that the patterns 'have been applied successfully to improve the outputs of LLM conversations' is not backed by quantitative validation, error measures, or comparative baselines; the support consists solely of illustrative examples constructed by the authors.
- [Contributions] The second listed contribution asserts a catalog of successfully applied patterns, but the manuscript does not specify the criteria or evidence used to determine success, which directly affects the generalizability of the framework to different domains and LLMs.
minor comments (2)
- [Introduction] Consider adding more references to existing prompt engineering literature to better contextualize the novelty of the pattern catalog relative to prior work.
- [Pattern catalog sections] Some individual pattern sections could benefit from explicit discussion of potential limitations or edge cases where the pattern may not improve outputs.
Simulated Author's Rebuttal
We thank the referee for their constructive comments and recommendation for minor revision. We agree that the claims about successful application require clarification to accurately reflect the illustrative nature of the examples and the scope of the contribution as a framework and catalog.
read point-by-point responses
- Referee: [Abstract] The claim that the patterns 'have been applied successfully to improve the outputs of LLM conversations' is not backed by quantitative validation, error measures, or comparative baselines; the support consists solely of illustrative examples constructed by the authors.
  Authors: We agree that the wording in the abstract could imply empirical validation beyond what is provided. The paper positions the patterns as reusable solutions analogous to software design patterns, where contributions typically begin with illustrative examples. In the revised manuscript, we will update the abstract to state that the patterns 'are illustrated through examples demonstrating their use in improving LLM conversation outputs,' thereby removing any implication of quantitative success and aligning the text with the paper's focus on organization and illustration. Revision: yes
- Referee: [Contributions] The second listed contribution asserts a catalog of successfully applied patterns, but the manuscript does not specify the criteria or evidence used to determine success, which directly affects the generalizability of the framework to different domains and LLMs.
  Authors: We accept this point and will revise the contributions section to explicitly state that the catalog is derived from the authors' practical experience applying the patterns to common prompt engineering tasks in software development contexts. Success is demonstrated qualitatively via the provided examples rather than formal criteria or metrics. We will also add text noting the framework's intended adaptability while acknowledging that broader empirical validation across domains and LLMs remains future work. Revision: yes
Circularity Check
No significant circularity: descriptive catalog without derivations or self-referential reductions
full rationale
The paper is a descriptive catalog of prompt patterns for LLM interactions, providing a documentation framework and illustrative examples of patterns and their combinations. It contains no equations, fitted parameters, predictions, or derivation chains. Claims of successful application rest on author-constructed examples rather than quantitative metrics, but this is not circularity per the rules, as no step reduces by construction to its own inputs via self-definition, fitted inputs renamed as predictions, or load-bearing self-citations. The work is self-contained as a knowledge-transfer contribution analogous to software patterns, with no ansatzes smuggled via citation or uniqueness theorems imported from prior author work that would force the result.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Prompt patterns provide reusable solutions to common problems in LLM output generation and interaction.
Forward citations
Cited by 20 Pith papers
- CA-SQL: Complexity-Aware Inference Time Reasoning for Text-to-SQL via Exploration and Compute Budget Allocation
  CA-SQL achieves 51.72% execution accuracy on the challenging tier of the BIRD benchmark using GPT-4o-mini by scaling exploration breadth according to estimated task difficulty, evolutionary prompt seeding, and candida...
- When Prompt Under-Specification Improves Code Correctness: An Exploratory Study of Prompt Wording and Structure Effects on LLM-Based Code Generation
  Structurally rich task descriptions make LLMs robust to prompt under-specification, and under-specification can enhance code correctness by disrupting misleading lexical or structural cues.
- Figures as Interfaces: Toward LLM-Native Artifacts for Scientific Discovery
  LLM-native figures embed provenance and enable direct LLM interaction with scientific visualizations to accelerate discovery and improve reproducibility.
- Architecture Without Architects: How AI Coding Agents Shape Software Architecture
  AI coding agents perform vibe architecting by making prompt-driven architectural choices that produce structurally different systems for identical tasks.
- Automated Design of Agentic Systems
  Meta Agent Search uses a meta-agent to iteratively program novel agentic systems in code, producing agents that outperform state-of-the-art hand-designed ones across coding, science, and math while transferring across...
- Stories in Space: In-Context Learning Trajectories in Conceptual Belief Space
  LLMs perform in-context learning as trajectories through a structured low-dimensional conceptual belief space, with the structure visible in both behavior and internal representations and causally manipulable via inte...
- From Natural Language to Verified Code: Toward AI Assisted Problem-to-Code Generation with Dafny-Based Formal Verification
  Open-weight LLMs reach 81-91% success generating formally verified Dafny code for complex algorithmic problems when given structural signatures and self-healing verifier feedback.
- SoK: Agentic Skills -- Beyond Tool Use in LLM Agents
  The paper systematizes agentic skills beyond tool use, providing design pattern and representation-scope taxonomies plus security analysis of malicious skill infiltration in agent marketplaces.
- Making OpenAPI Documentation Agent-Ready: Detecting Documentation and REST Smells with a Multi-Agent LLM System
  Hermes uses multi-agent LLMs to detect 2450 documentation and REST smells across 600 OpenAPI endpoints, demonstrating that structurally valid microservice APIs are often not semantically ready for agent consumption.
- The Readability Spectrum: Patterns, Issues, and Prompt Effects in LLM-Generated Code
  LLM-generated code matches human-written code in overall readability but exhibits different issue patterns, and prompt engineering has limited impact on improving it.
- User Reviews as a Source for Usability Requirements: A Precursor Study on Using Large Language Models
  LLMs can detect usability content in user reviews with F-scores comparable to humans, though performance depends strongly on prompt design.
- Benchmarking LLM-Based Static Analysis for Secure Smart Contract Development: Reliability, Limitations, and Potential Hybrid Solutions
  LLMs for smart contract security analysis show lexical bias from identifier names causing high false positives, with prompting creating precision-recall trade-offs, positioning them as complements rather than replacem...
- Conventional Commit Classification using Large Language Models and Prompt Engineering
  Few-shot prompting with the 32B DeepSeek-R1 model achieves the highest accuracy on a balanced set of 3,200 conventional commits mined from InfluxDB, while chain-of-thought adds no benefit and larger model scale improv...
- Enhanced Self-Learning with Epistemologically-Informed LLM Dialogue
  CausaDisco integrates Aristotle's Four Causes into LLM prompts to produce more engaging, exploratory, and multifaceted self-learning dialogues, as evidenced by controlled user studies.
- STaR-DRO: Stateful Tsallis Reweighting for Group-Robust Structured Prediction
  STaR-DRO applies momentum-smoothed Tsallis reweighting to focus learning on hard groups in structured prediction, yielding F1 gains on clinical label extraction.
- LLM2Manim: Pedagogy-Aware AI Generation of STEM Animations
  LLM2Manim pipeline generates pedagogy-aware Manim animations for STEM, producing slightly better student post-test scores (83% vs 78%), learning gains (d=0.67), and engagement than PowerPoint in a controlled study.
- The PICCO Framework for Large Language Model Prompting: A Taxonomy and Reference Architecture for Prompt Structure
  PICCO is a five-element reference architecture (Persona, Instructions, Context, Constraints, Output) for structuring LLM prompts, derived from synthesizing prior frameworks along with a taxonomy distinguishing prompt ...
- Transparent and Controllable Recommendation Filtering via Multimodal Multi-Agent Collaboration
  A multi-agent multimodal system with fact-grounded adjudication and a dynamic two-tier preference graph cuts false positives in content filtering by 74.3% and nearly doubles F1-score versus text-only baselines while s...
- Nanomentoring: Investigating How Quickly People Can Help People Learn Feature-Rich Software
  Experts can deliver helpful advice on over half of short 'nanoquestions' about feature-rich software in under one minute.
- From System 1 to System 2: A Survey of Reasoning Large Language Models
  The survey organizes the shift of LLMs toward deliberate System 2 reasoning, covering model construction techniques, performance on math and coding benchmarks, and future research directions.
Reference graph
Works this paper leans on
- [1] R. Bommasani, D. A. Hudson, E. Adeli, R. Altman, S. Arora, S. von Arx, M. S. Bernstein, J. Bohg, A. Bosselut, E. Brunskill et al., "On the opportunities and risks of foundation models," arXiv preprint arXiv:2108.07258, 2021.
- [2] Y. Bang, S. Cahyawijaya, N. Lee, W. Dai, D. Su, B. Wilie, H. Lovenia, Z. Ji, T. Yu, W. Chung et al., "A multitask, multilingual, multimodal evaluation of ChatGPT on reasoning, hallucination, and interactivity," arXiv preprint arXiv:2302.04023, 2023.
- [3] A. Gilson, C. Safranek, T. Huang, V. Socrates, L. Chi, R. A. Taylor, and D. Chartash, "How well does ChatGPT do when taking the medical licensing exams?" medRxiv, pp. 2022-12, 2022.
- [4] A. Carleton, M. H. Klein, J. E. Robert, E. Harper, R. K. Cunningham, D. de Niz, J. T. Foreman, J. B. Goodenough, J. D. Herbsleb, I. Ozkaya, and D. C. Schmidt, "Architecting the future of software engineering," Computer, vol. 55, no. 9, pp. 89-93, 2022.
- [5] "GitHub Copilot · your AI pair programmer." [Online]. Available: https://github.com/features/copilot
- [6] O. Asare, M. Nagappan, and N. Asokan, "Is GitHub's Copilot as bad as humans at introducing vulnerabilities in code?" arXiv preprint arXiv:2204.04741, 2022.
- [7] H. Pearce, B. Ahmad, B. Tan, B. Dolan-Gavitt, and R. Karri, "Asleep at the keyboard? Assessing the security of GitHub Copilot's code contributions," in 2022 IEEE Symposium on Security and Privacy (SP). IEEE, 2022, pp. 754-768.
- [8] J. Krochmalski, IntelliJ IDEA Essentials. Packt Publishing Ltd, 2014.
- [9] P. Liu, W. Yuan, J. Fu, Z. Jiang, H. Hayashi, and G. Neubig, "Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing," ACM Computing Surveys, vol. 55, no. 9, pp. 1-35, 2023.
- [10]
- [11] D. C. Schmidt, M. Stal, H. Rohnert, and F. Buschmann, Pattern-Oriented Software Architecture: Patterns for Concurrent and Networked Objects. John Wiley & Sons, 2013.
- [12] OpenAI, "ChatGPT: Large-scale generative language models for automated content creation," https://openai.com/blog/chatgpt/, 2023 [Online; accessed 19-Feb-2023].
- [13] OpenAI, "DALL·E 2: Creating images from text," https://openai.com/dall-e-2/, 2023 [Online; accessed 19-Feb-2023].
- [14] D. Zhou, N. Schärli, L. Hou, J. Wei, N. Scales, X. Wang, D. Schuurmans, O. Bousquet, Q. Le, and E. Chi, "Least-to-most prompting enables complex reasoning in large language models," arXiv preprint arXiv:2205.10625, 2022.
- [15] J. Ellson, E. R. Gansner, E. Koutsofios, S. C. North, and G. Woodhull, "Graphviz and Dynagraph: static and dynamic graph drawing tools," Graph Drawing Software, pp. 127-148, 2004.
- [16] S. Owen, "Building a virtual machine inside a JavaScript library," https://www.engraved.blog/building-a-virtual-machine-inside/, 2022, accessed: 2023-02-20.
- [17] P. Zhang, J. White, D. C. Schmidt, and G. Lenz, "Applying software patterns to address interoperability in blockchain-based healthcare apps," CoRR, vol. abs/1706.03700, 2017. [Online]. Available: http://arxiv.org/abs/1706.03700
- [18] X. Xu, C. Pautasso, L. Zhu, Q. Lu, and I. Weber, "A pattern collection for blockchain-based applications," in Proceedings of the 23rd European Conference on Pattern Languages of Programs, 2018, pp. 1-20.
- [19] E. A. van Dis, J. Bollen, W. Zuidema, R. van Rooij, and C. L. Bockting, "ChatGPT: five priorities for research," Nature, vol. 614, no. 7947, pp. 224-226, 2023.
- [20] L. Reynolds and K. McDonell, "Prompt programming for large language models: Beyond the few-shot paradigm," CoRR, vol. abs/2102.07350. [Online]. Available: https://arxiv.org/abs/2102.07350
- [22] J. Wei, X. Wang, D. Schuurmans, M. Bosma, E. H. Chi, Q. Le, and D. Zhou, "Chain of thought prompting elicits reasoning in large language models," CoRR, vol. abs/2201.11903, 2022. [Online]. Available: https://arxiv.org/abs/2201.11903
- [23] J. Wei, Y. Tay, R. Bommasani, C. Raffel, B. Zoph, S. Borgeaud, D. Yogatama, M. Bosma, D. Zhou, D. Metzler, E. H. Chi, T. Hashimoto, O. Vinyals, P. Liang, J. Dean, and W. Fedus, "Emergent abilities of large language models," 2022. [Online]. Available: https://arxiv.org/abs/2206.07682
- [24] Y. Zhou, A. I. Muresanu, Z. Han, K. Paster, S. Pitis, H. Chan, and J. Ba, "Large language models are human-level prompt engineers." [Online]. Available: https://arxiv.org/abs/2211.01910
- [26] T. Shin, Y. Razeghi, R. L. L. IV, E. Wallace, and S. Singh, "AutoPrompt: Eliciting knowledge from language models with automatically generated prompts," CoRR, vol. abs/2010.15980, 2020. [Online]. Available: https://arxiv.org/abs/2010.15980
- [27] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever, "Language models are unsupervised multitask learners," 2019.
- [28] D. Zhou, N. Schärli, L. Hou, J. Wei, N. Scales, X. Wang, D. Schuurmans, C. Cui, O. Bousquet, Q. Le, and E. Chi, "Least-to-most prompting enables complex reasoning in large language models." [Online]. Available: https://arxiv.org/abs/2205.10625
- [30] J. Jung, L. Qin, S. Welleck, F. Brahman, C. Bhagavatula, R. L. Bras, and Y. Choi, "Maieutic prompting: Logically consistent reasoning with recursive explanations," 2022. [Online]. Available: https://arxiv.org/abs/2205.11822
- [31] S. Arora, A. Narayan, M. F. Chen, L. Orr, N. Guha, K. Bhatia, I. Chami, and C. Re, "Ask me anything: A simple strategy for prompting language models," in International Conference on Learning Representations, 2023. [Online]. Available: https://openreview.net/forum?id=bhUPJnS2g0X
- [32] V. Liu and L. B. Chilton, "Design guidelines for prompt engineering text-to-image generative models," in Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, 2022, pp. 1-23.
- [33] P. Maddigan and T. Susnjak, "Chat2VIS: Generating data visualisations via natural language using ChatGPT, Codex and GPT-3 large language models," arXiv preprint arXiv:2302.02094, 2023.
- [34] X. Han, W. Zhao, N. Ding, Z. Liu, and M. Sun, "PTR: Prompt tuning with rules for text classification," AI Open, vol. 3, pp. 182-192, 2022.
- [35] S. Wang, H. Scells, B. Koopman, and G. Zuccon, "Can ChatGPT write a good boolean query for systematic review literature search?" arXiv preprint arXiv:2302.03495, 2023.
- [36] C. S. Xia and L. Zhang, "Conversational automated program repair," arXiv preprint arXiv:2301.13246, 2023.
- [37] J. H. Choi, K. E. Hickman, A. Monahan, and D. Schwarcz, "ChatGPT goes to law school," Available at SSRN, 2023.
- [38] S. Frieder, L. Pinchetti, R.-R. Griffiths, T. Salvatori, T. Lukasiewicz, P. C. Petersen, A. Chevalier, and J. Berner, "Mathematical capabilities of ChatGPT," arXiv preprint arXiv:2301.13867, 2023.