RAG-Pull: Turning Retrieval into a Code-Injection Channel via Invisible Unicode Perturbations

Aritra Dhar; Lukas Cavigelli; Vasilije Stambolic

arxiv: 2510.11195 · v2 · pith:36WNU6U6new · submitted 2025-10-13 · 💻 cs.CR · cs.AI

Pith reviewed 2026-05-25 07:58 UTC · model grok-4.3

classification 💻 cs.CR cs.AI

keywords: RAG · adversarial attack · Unicode perturbation · code injection · LLM safety · retrieval manipulation · black-box attack

The pith

RAG systems can be tricked into retrieving malicious code using invisible Unicode perturbations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops RAG-Pull, a black-box attack that inserts hidden Unicode characters into queries or code repositories to redirect RAG retrieval toward malicious code. This redirection breaks LLM safety alignment and can introduce vulnerabilities such as remote code execution and SQL injection. Combined query-and-target perturbations achieve near-perfect success rates. Readers should care because RAG is intended to enhance reliability, yet here it serves as an injection channel. Minimal perturbations are enough to increase the model's preference for unsafe code.

Core claim

RAG-Pull inserts hidden UTF characters into queries or external code repositories, redirecting retrieval toward malicious code and breaking the models' safety alignment. Query and code perturbations alone shift retrieval toward attacker-controlled snippets, while combined query-and-target perturbations achieve near-perfect success. Once retrieved, these snippets introduce exploitable vulnerabilities such as remote code execution and SQL injection, and the perturbations can alter the model's safety alignment to increase preference towards unsafe code.

What carries the argument

Invisible Unicode characters inserted into queries and target documents, which shift retrieval similarity scores when the pipeline does not normalize or strip them.
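To make the mechanism concrete, the sketch below shows why an invisible insertion is a real change from the retriever's point of view: the string usually renders identically, but its code points and UTF-8 bytes differ. The specific characters (U+200B and U+FE0F), the insertion positions, and the helper name `describe` are illustrative assumptions, not the perturbations the paper actually finds.

```python
import unicodedata

ZWSP = "\u200b"  # ZERO WIDTH SPACE (illustrative choice)
VS16 = "\ufe0f"  # VARIATION SELECTOR-16 (illustrative choice)

clean = "read a file"
# Hypothetical perturbation: two invisible characters at arbitrary positions.
perturbed = "re" + ZWSP + "ad a " + VS16 + "file"

def describe(text):
    """List the non-ASCII code points hidden in a string."""
    return [
        (f"U+{ord(c):04X}", unicodedata.name(c, "<unnamed>"))
        for c in text
        if not c.isascii()
    ]

print(clean == perturbed)                      # the strings are not equal
print(len(clean), len(perturbed))              # code-point lengths differ
print(len(clean.encode("utf-8")),
      len(perturbed.encode("utf-8")))          # byte lengths differ too
print(describe(perturbed))                     # what was hidden
```

Any tokenizer or embedder that consumes the raw code points or bytes receives a different input for `perturbed` than for `clean`, which is the opening the attack relies on.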

If this is right

RAG retrieval can be hijacked to favor malicious documents.
LLM safety alignments can be bypassed by retrieved unsafe code.
Minimal Unicode perturbations suffice to change retrieval outcomes.
A new class of attacks on RAG systems is enabled by this method.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Retrieval systems could add Unicode normalization to prevent such manipulations.
Similar attacks might apply to other embedding-based search systems.
Content sanitization after retrieval could mitigate the introduced vulnerabilities.
The attack highlights the need for robust input validation in RAG pipelines.

Load-bearing premise

Retrieval components rank documents using similarity metrics that are sensitive to invisible Unicode character insertions without normalization or sanitization.

What would settle it

A demonstration that the attack no longer works after applying standard Unicode normalization to queries and documents before retrieval.
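A check of this kind can be sketched on a toy retriever. Everything below is an assumption made for illustration: the character-bigram "embedding", the two-document corpus, the fixed `interleave` perturbation (a stand-in for whatever perturbation an attacker would search for), and the `strip_zwsp` preprocessing step. It is not the paper's retriever, data, or optimizer; it only shows the shape of the before/after comparison.

```python
import math
from collections import Counter

ZWSP = "\u200b"  # ZERO WIDTH SPACE

def embed(text):
    """Toy embedding: counts of overlapping character bigrams."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top1(query, corpus, preprocess=lambda s: s):
    """Return the name of the document most similar to the query."""
    q = embed(preprocess(query))
    return max(corpus, key=lambda name: cosine(q, embed(preprocess(corpus[name]))))

def interleave(text):
    """Hypothetical perturbation: a zero-width space between every character."""
    return ZWSP.join(text)

def strip_zwsp(text):
    """Minimal sanitization step for this toy: drop zero-width spaces."""
    return text.replace(ZWSP, "")

query = "open file safely"
corpus = {
    "benign": "open file safely with context manager",
    "attacker": interleave("run shell command from user input"),
}

baseline = top1(query, corpus)                                     # clean query
attacked = top1(interleave(query), corpus)                         # perturbed query
defended = top1(interleave(query), corpus, preprocess=strip_zwsp)  # sanitized first
print(baseline, attacked, defended)  # benign attacker benign
```

In this toy the perturbed query shares bigrams only with the perturbed attacker document, so retrieval flips; stripping the inserted characters from both sides restores the original ranking. Whether the same holds for a learned embedding model is exactly what the proposed demonstration would have to establish.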

Figures

Figures reproduced from arXiv: 2510.11195 by Aritra Dhar, Lukas Cavigelli, Vasilije Stambolic.

**Figure 1.** A high-level overview of RAG-PULL attack that targets a code generation inference serving system (e.g., Copilot (OpenAI, 2021)). The prompt engineering tools augment the user prompt for better efficacy, a retriever model to search code repositories and webpages to search for relevant code, and a code-optimized LLM to provide the final response code. An attacker-controlled prompt engineering website can …

**Figure 2.** t-SNE visualization of embeddings in the …

**Figure 3.** Per-query comparison of cosine similarities in the Python Alpaca dataset, showing how …

**Figure 4.** Post-Retrieval Generation Success for queries …

**Figure 5.** Vulnerability analysis of generated code from the Python Alpaca dataset (Target …

**Figure 6.** Distribution of codes containing vulnerabilities in Python Alpaca outputs (Target …

**Figure 7.** Breaking the Alignment Experiment. At a perturbation level of 20%, we find that in around 80% of the samples, the model prefers the vulnerable (malicious) code over the safe counterpart. It is likely that adding even more perturbations could further increase the similarity to malicious code.

**Figure 8.** Breaking the Alignment Experiment. Vulnerability analysis of generated code using FindSecBugs(fin). Bars show the total number of low, medium, and high severity vulnerabilities across the three attack scenarios, compared against the vanilla LLM and regular RAG baselines.

Section 7, Limitations and Defense (Generality): The attack is optimized against a specific retriever and evaluated a…

**Figure 9.** The prompt template used for generating natural language queries with DeepSeek-R1.

**Figure 10.** Retrieval performance in the Python Alpaca dataset under perturbation budgets of 0%, …

**Figure 11.** Retrieval performance in the Python CyberNative dataset, Target …

**Figure 12.** The prompt template used for Vanilla LLM Code Generation.

**Figure 13.** The prompt template used for RAG and Compromised RAG settings, for JavaVFD and …

**Figure 14.** The prompt template used for RAG and Compromised RAG settings, for Python Cyber …

Original abstract

Retrieval-Augmented Generation (RAG) increases the reliability and trustworthiness of the LLM response and reduces hallucination by eliminating the need for model retraining. It does so by adding external data into the LLM's context. We develop a new class of black-box attack, RAG-Pull, that inserts hidden UTF characters into queries or external code repositories, redirecting retrieval toward malicious code, thereby breaking the models' safety alignment. We observe that query and code perturbations alone can shift retrieval toward attacker-controlled snippets, while combined query-and-target perturbations achieve near-perfect success. Once retrieved, these snippets introduce exploitable vulnerabilities such as remote code execution and SQL injection. RAG-Pull's minimal perturbations can alter the model's safety alignment and increase preference towards unsafe code, therefore opening up a new class of attacks on LLMs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

RAG-Pull flags a real retrieval-layer risk from Unicode perturbations but the evidence is still too thin to judge how widely it applies.

The main point is that this work shows how zero-width and variation-selector characters can shift RAG retrieval toward attacker-controlled code, which then gets pulled into the LLM context and can trigger unsafe outputs such as SQL injection or RCE. That channel is new enough to matter for anyone running RAG over code repositories or docs without heavy sanitization upfront. The authors get credit for testing both query-only and combined perturbations and for reporting that the combined version reaches near-perfect retrieval success in their setup. That is a concrete, testable observation rather than another abstract prompt-injection warning. The paper also correctly notes that once the bad snippet is retrieved, downstream safety alignment can be undermined with very small changes.

The soft spot is exactly the one the stress-test note flags: the attack only works if the embedding pipeline leaves those invisible characters intact. Many production RAG stacks may normalize Unicode, strip control and format characters, or use tokenizers that collapse them before similarity scoring, although NFKC normalization on its own does not remove zero-width spaces or variation selectors. The abstract gives no information on whether the tested systems did any of that, what the corpus sizes were, or how many runs produced the near-perfect numbers. Without those details the result stays provisional.

This is worth a serious referee for the security community working on RAG hardening. A reader who already cares about input sanitization and retrieval robustness will get value from the attack description and can decide whether their own pipelines are exposed. I would send it to review rather than desk-reject so the experimental controls can be checked.

Referee Report

2 major / 0 minor

Summary. The paper introduces RAG-Pull, a black-box attack on RAG systems that inserts invisible Unicode characters (zero-width spaces, variation selectors) into queries or code snippets to redirect retrieval toward attacker-controlled malicious code. Combined query-and-target perturbations are claimed to achieve near-perfect success, after which the retrieved snippets enable exploits such as remote code execution and SQL injection while also shifting the LLM toward unsafe code preferences.

Significance. If the empirical claims are substantiated with reproducible experiments, the work would identify a concrete Unicode-handling vulnerability in retrieval pipelines that can bypass safety alignments without model access. This would be a useful addition to the RAG security literature, particularly if it includes tests against normalization steps and realistic corpora.

major comments (2)

[Abstract] Abstract: the claim of 'near-perfect success' for combined perturbations is stated without any experimental details, dataset sizes, number of trials, baselines, or statistical measures; this absence makes the central empirical claim impossible to evaluate.
[Abstract / Methods] The attack's viability rests on the untested assumption that evaluated RAG pipelines perform no Unicode normalization (NFKC/NFKD) or control-character sanitization before embedding; if any such step is present, the perturbations are neutralized, yet no ablation or pipeline description tests this precondition.
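One caveat on the normalization precondition is easy to check: in Python's `unicodedata`, NFKC leaves U+200B (zero-width space) and variation selectors such as U+FE0F unchanged, because those code points have no compatibility decomposition. A pipeline that only normalizes may therefore still pass them to the embedder. The sketch below is a minimal sanitizer under stated assumptions, not the paper's defense or any library's API: the names `is_invisible` and `sanitize` are hypothetical, and removing every format character (category `Cf`) is a deliberately blunt policy that would also drop joiners that emoji sequences and some scripts legitimately need.

```python
import unicodedata

# Variation selectors: VS1-VS16 and the supplementary VS17-VS256.
_VARIATION_SELECTORS = ((0xFE00, 0xFE0F), (0xE0100, 0xE01EF))

def is_invisible(ch):
    """Assumed policy: treat format characters and variation selectors as removable."""
    cp = ord(ch)
    if any(lo <= cp <= hi for lo, hi in _VARIATION_SELECTORS):
        return True
    # Cf covers zero-width space/joiners, the byte-order mark, bidi controls, etc.
    return unicodedata.category(ch) == "Cf"

def sanitize(text):
    """Normalize, then explicitly strip invisible characters before embedding."""
    # NFKC folds compatibility forms (e.g. full-width letters) but does not
    # remove zero-width or variation-selector characters on its own.
    normalized = unicodedata.normalize("NFKC", text)
    return "".join(ch for ch in normalized if not is_invisible(ch))

sample = "re\u200bad a \ufe0ffile"
nfkc_only = unicodedata.normalize("NFKC", sample)
print(nfkc_only == sample)             # True: normalization alone changed nothing
print(sanitize(sample) == "read a file")  # True: explicit stripping did
```

An ablation of the kind the referee asks for would ideally report normalization-only and normalization-plus-stripping as separate conditions, since the two can behave differently. Note also that NFKC can rewrite visible characters, which matters if the sanitized text is later shown to users or executed as code.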

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major comment point-by-point below and indicate planned revisions.

Referee: [Abstract] Abstract: the claim of 'near-perfect success' for combined perturbations is stated without any experimental details, dataset sizes, number of trials, baselines, or statistical measures; this absence makes the central empirical claim impossible to evaluate.

Authors: The abstract provides a high-level summary of results, while the full experimental details—including dataset sizes, number of trials, baselines, and statistical measures—are reported in the Experiments section. To improve standalone evaluability of the abstract, we will revise it to include brief quantitative indicators such as trial counts and aggregate success rates. revision: yes
Referee: [Abstract / Methods] The attack's viability rests on the untested assumption that evaluated RAG pipelines perform no Unicode normalization (NFKC/NFKD) or control-character sanitization before embedding; if any such step is present, the perturbations are neutralized, yet no ablation or pipeline description tests this precondition.

Authors: We agree that the robustness of the attack under normalization is an important consideration not addressed in the current version. The manuscript evaluates standard RAG pipelines as typically deployed without explicit normalization. We will add an ablation study in the revised manuscript that applies NFKC/NFKD normalization and control-character sanitization before embedding and reports the resulting attack success rates. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical attack demonstration with external success metrics

The paper presents an empirical black-box attack on RAG systems using Unicode perturbations. It reports measured retrieval success rates and downstream exploit outcomes against concrete RAG pipelines. No equations, fitted parameters, predictions, or first-principles derivations appear in the provided text. Success is evaluated against external retrieval behavior rather than being defined by or reduced to any internal construction. The central claim therefore rests on observable experimental outcomes, not on any self-referential reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper is an empirical security demonstration. It rests on the domain assumption that current RAG retrieval pipelines lack Unicode normalization and that retrieved code is executed or interpreted without additional sandboxing. No free parameters or invented entities are introduced.

axioms (2)

domain assumption: RAG retrieval uses embedding similarity that is altered by invisible Unicode characters without detection or normalization.
Required for the perturbation to redirect retrieval; stated implicitly by the attack success.
domain assumption: Retrieved code snippets are incorporated into LLM context and can influence output toward unsafe behavior.
Central to the claim that retrieval redirection produces exploitable vulnerabilities.

pith-pipeline@v0.9.0 · 5676 in / 1321 out tokens · 38200 ms · 2026-05-25T07:58:17.244412+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: RAG-PULL employs a black-box differential evolution algorithm to find the optimal perturbations to the query, the target, or both to increase the similarity score

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · absolute_floor_iff_bare_distinguishability · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: We develop a new class of black-box attack, RAG-Pull, that inserts hidden UTF characters into queries or external code repositories

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Securing Retrieval-Augmented Generation: A Taxonomy of Attacks, Defenses, and Future Directions
cs.CR 2026-04 accept novelty 5.0

This paper establishes a taxonomy of RAG security organized around six workflow stages, three trust boundaries, and four primary security surfaces, while reviewing attacks, defenses, and gaps in current protections.

