pith. sign in

arxiv: 2606.18120 · v1 · pith:5ZSHHM4Enew · submitted 2026-06-16 · 💻 cs.CR · cs.AI· cs.CL· cs.LG

Structural Role Injection in Handlebars-Templated LLM Prompts: Triple-Brace Interpolation, Delimiter Family, and the Limits of HTML Auto-Escaping

Pith reviewed 2026-06-27 00:02 UTC · model grok-4.3

classification 💻 cs.CR cs.AIcs.CLcs.LG
keywords structural role injectionHandlebars templatesHTML escapingprompt injectiondelimiter familiesLLM chat rolestriple-brace interpolationauto-escaping limits
0
0 comments X

The pith

Handlebars' default HTML escaping only neutralizes angle-bracket role delimiters against structural injection in LLM prompts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that Handlebars double-brace interpolation applies HTML escaping which rewrites angle brackets but leaves square brackets, colons, and Markdown hashes untouched. This selective action neutralizes injection for some delimiter families while allowing full survival for others such as Llama-2 style or Markdown-based ones. Empirical runs across thousands of trials show the protection is confined to families whose characters happen to be covered by escaping, leaving the rest exposed. The work concludes that this mechanism cannot replace explicit separation of instructions from data.

Core claim

A model-free analysis establishes the mechanism: Handlebars escaping rewrites angle brackets but not square brackets, colons, or Markdown hashes, so it neutralises ChatML, Llama-3, and XML role delimiters (survival rate 0.00) while leaving Llama-2 [INST], legacy Human:/Assistant:, and Markdown ### delimiters intact (survival rate 1.00 for the last two). The escaped default protects only the delimiter schemes whose characters HTML escaping happens to cover, gives no protection for the rest, and cannot substitute for a structural separation of instruction and data.

What carries the argument

The differential survival of chat role delimiter families under Handlebars double-brace HTML auto-escaping, which determines whether attacker-controlled data can forge higher-privilege turns.

If this is right

  • Angle-bracket families are neutralized while colon and Markdown families retain full survival under escaping.
  • GPT-3.5 Turbo follows the task-hijack instruction in 91 percent of escaped trials for vulnerable families.
  • The secret-exfiltration objective, which does not saturate, isolates the delimiter-family interaction.
  • Claude Haiku 4.5 resists both objectives almost entirely regardless of escaping.
  • Escaping cannot serve as a substitute for structural separation of instruction and data.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Developers should map their chosen model delimiters against the escaping behavior of their template engine rather than relying on the documented safe default.
  • Other templating systems that perform partial character rewriting may exhibit the same selective vulnerability to role injection.
  • Low-cost API trials suggest these attacks remain practical to test against production prompt templates.
  • The findings point toward the need for delimiter-aware sanitization layers on top of generic escaping.

Load-bearing premise

The seven delimiter families and four models tested represent the practical attack surface that applications using Handlebars templates will encounter.

What would settle it

Repeating the trials with an escaped colon-based delimiter and observing zero success rate for task hijack would contradict the reported family-specific survival rates.

Figures

Figures reproduced from arXiv: 2606.18120 by Mohammadreza Rashidi.

Figure 1
Figure 1. Figure 1: The same attacker-controlled field reaches the model intact through [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Experiment pipeline. Each scenario is rendered through every delimiter [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Fraction of each family’s role-control tokens that survive Handlebars [PITH_FULL_IMAGE:figures/full_fig_p004_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: ASR for the raw and escaped slot, per model, for each objective. [PITH_FULL_IMAGE:figures/full_fig_p005_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Escaping protection gap (raw ASR minus escaped ASR) by family. [PITH_FULL_IMAGE:figures/full_fig_p005_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: ASR by delimiter family and slot mode, per model and objective. The [PITH_FULL_IMAGE:figures/full_fig_p006_6.png] view at source ↗
read the original abstract

Large language model applications build prompts from templates, and Handlebars is a widely used templating engine and the default prompt-template format in Microsoft Semantic Kernel. Its double-brace {{x}} expression HTML-escapes the interpolated value and is documented as the safe default; its triple-brace {{{x}}} expression inserts the value raw. We show that this choice silently governs an application's exposure to structural role injection, where attacker-controlled data carries chat role delimiters that forge a higher-privilege turn. A model-free analysis establishes the mechanism: Handlebars escaping rewrites angle brackets but not square brackets, colons, or Markdown hashes, so it neutralises ChatML, Llama-3, and XML role delimiters (survival rate 0.00) while leaving Llama-2 [INST], legacy Human:/Assistant:, and Markdown ### delimiters intact (survival rate 1.00 for the last two). We then run 5760 trials across seven delimiter families, two attack objectives, and four models (GPT-3.5 Turbo, GPT-4o mini, GPT-4.1 mini, Claude Haiku 4.5) at a combined API cost of 1.63 USD. GPT-3.5 Turbo follows the task-hijack instruction in 97% of raw and 91% of escaped trials, with the escaping protection concentrated in the angle-bracket families and absent for the colon- and Markdown-based families; the harder secret-exfiltration objective, which does not saturate, exposes the same family interaction more cleanly. Claude Haiku 4.5 resists both objectives almost entirely. The escaped default protects only the delimiter schemes whose characters HTML escaping happens to cover, gives no protection for the rest, and cannot substitute for a structural separation of instruction and data.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper claims that Handlebars' default double-brace {{ }} interpolation applies HTML escaping that rewrites only angle brackets, thereby neutralizing role-injection attacks using ChatML, Llama-3, and XML delimiters (survival rate 0.00) while leaving Llama-2 [INST], legacy Human:/Assistant:, and Markdown ### delimiters intact (survival rate 1.00). This mechanistic observation is confirmed by 5760 trials across seven delimiter families, two attack objectives, and four models (GPT-3.5 Turbo, GPT-4o mini, GPT-4.1 mini, Claude Haiku 4.5), showing that escaping protection is family-specific and cannot substitute for structural separation of instruction and data.

Significance. If the central empirical pattern holds, the work supplies a concrete, model-free demonstration that reliance on HTML escaping in widely deployed templating engines (including the Semantic Kernel default) leaves substantial attack surface open. The deterministic character-rewriting analysis paired with a large, low-cost empirical sample (5760 trials) directly supports the conclusion that delimiter-family coverage, not escaping per se, determines protection. This strengthens the case for structural isolation techniques and provides reproducible evidence that can be extended to additional families or models.

minor comments (3)
  1. [Results / Experiments] The abstract states survival rates of 0.00 and 1.00 for specific families; the results section should explicitly tabulate per-family, per-model survival rates (or success rates) for both objectives so readers can verify the claimed concentration of protection in angle-bracket families.
  2. [Delimiter families definition] The seven delimiter families are introduced in the abstract and presumably detailed in §3 or §4; adding a compact table that lists the exact delimiter strings (including any variations tested) would improve reproducibility and allow direct comparison with future work.
  3. [GPT-3.5 results paragraph] The manuscript reports GPT-3.5 Turbo task-hijack rates of 97 % raw / 91 % escaped; the text should clarify whether these percentages are aggregated across all families or conditioned on family, and whether the difference reaches statistical significance given the sample size.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their accurate and positive summary of the manuscript, which correctly captures the core mechanistic claim and the scale of the 5760-trial evaluation. We appreciate the recognition that the work provides reproducible evidence on delimiter-family coverage rather than escaping per se. No specific major comments or requested changes were listed in the report.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper performs a model-free enumeration of HTML escaping rules on delimiter characters, followed by direct API experiments (5760 trials) that measure survival rates and attack success. No quantity is defined in terms of another; no parameters are fitted then relabeled as predictions; no self-citations appear as load-bearing premises. The central claim—that escaping protects only angle-bracket families—follows directly from the documented escaping behavior and the observed split across the seven families.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the documented escaping behavior of Handlebars and the assumption that the chosen delimiter families and models capture the relevant attack surface.

axioms (1)
  • domain assumption Handlebars double-brace interpolation performs HTML escaping exactly as documented by the library.
    The model-free analysis begins from this documented behavior.

pith-pipeline@v0.9.1-grok · 5881 in / 1175 out tokens · 15903 ms · 2026-06-27T00:02:53.722378+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

19 extracted references · 9 canonical work pages · 7 internal anchors

  1. [1]

    Semantic kernel: Handlebars prompt template syntax,

    Microsoft, “Semantic kernel: Handlebars prompt template syntax,” https://learn.microsoft.com/en-us/semantic-kernel/concepts/prompts/ handlebars-prompt-templates, 2024, accessed 2026-06-13

  2. [2]

    Handlebars: Minimal templating on steroids – expres- sions and html escaping,

    Handlebars.js, “Handlebars: Minimal templating on steroids – expres- sions and html escaping,” https://handlebarsjs.com/guide/expressions. html, 2024, accessed 2026-06-13

  3. [3]

    Chat markup language (ChatML),

    OpenAI, “Chat markup language (ChatML),” https://github.com/openai/ openai-python/blob/release-v0.28.0/chatml.md, 2023, accessed 2026- 06-13

  4. [4]

    The Llama 3 Herd of Models

    A. Grattafioriet al., “The Llama 3 herd of models,”arXiv preprint arXiv:2407.21783, 2024

  5. [5]

    Llama 2: Open Foundation and Fine-Tuned Chat Models

    H. Touvron, L. Martin, K. Stoneet al., “Llama 2: Open foundation and fine-tuned chat models,”arXiv preprint arXiv:2307.09288, 2023

  6. [6]

    Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection

    K. Greshake, S. Abdelnabi, S. Mishra, C. Endres, T. Holz, and M. Fritz, “Not what you’ve signed up for: Compromising real-world LLM- integrated applications with indirect prompt injection,” inProceedings of the 16th ACM Workshop on Artificial Intelligence and Security (AISec), 2023, pp. 79–90, arXiv:2302.12173

  7. [7]

    Ignore Previous Prompt: Attack Techniques For Language Models

    F. Perez and I. Ribeiro, “Ignore previous prompt: Attack tech- niques for language models,” inNeurIPS ML Safety Workshop, 2022, arXiv:2211.09527

  8. [8]

    Prompt injection: What’s the worst that can happen?

    S. Willison, “Prompt injection: What’s the worst that can happen?” https://simonwillison.net/2023/Apr/14/worst-that-can-happen/, 2023, accessed 2026-06-13

  9. [9]

    Formalizing and benchmarking prompt injection attacks and defenses,

    Y . Liu, Y . Jia, R. Geng, J. Jia, and N. Z. Gong, “Formalizing and benchmarking prompt injection attacks and defenses,” in33rd USENIX Security Symposium, 2024, arXiv:2310.12815

  10. [10]

    InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents

    Q. Zhan, Z. Liang, Z. Ying, and D. Kang, “InjecAgent: Benchmark- ing indirect prompt injections in tool-integrated large language model agents,” inFindings of the Association for Computational Linguistics (ACL Findings), 2024, arXiv:2403.02691

  11. [11]

    OW ASP top 10 for large language model ap- plications: LLM01 prompt injection,

    OW ASP Foundation, “OW ASP top 10 for large language model ap- plications: LLM01 prompt injection,” https://genai.owasp.org/, 2025, accessed 2026-06-13

  12. [12]

    The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions

    E. Wallace, K. Xiao, R. Leike, L. Weng, J. Heidecke, and A. Beutel, “The instruction hierarchy: Training LLMs to prioritize privileged in- structions,”arXiv preprint arXiv:2404.13208, 2024

  13. [13]

    StruQ: Defending against prompt injection with structured queries,

    S. Chen, J. Piet, C. Sitawarin, and D. Wagner, “StruQ: Defending against prompt injection with structured queries,” in34th USENIX Security Symposium, 2025, arXiv:2402.06363

  14. [14]

    Defending Against Indirect Prompt Injection Attacks With Spotlighting

    K. Hines, G. Lopez, M. Hall, F. Zarfati, Y . Zunger, and E. Kiciman, “Defending against indirect prompt injection attacks with spotlighting,” arXiv preprint arXiv:2403.14720, 2024

  15. [15]

    pybars3: Handlebars.js templating for python,

    pybars3 contributors, “pybars3: Handlebars.js templating for python,” https://pypi.org/project/pybars3/, 2024, python package; accessed 2026- 06-13

  16. [16]

    GPT-3.5 Turbo model documentation,

    OpenAI, “GPT-3.5 Turbo model documentation,” https://platform. openai.com/docs/models/gpt-3-5-turbo, 2023, accessed 2026-06-13

  17. [17]

    GPT-4o mini: Advancing cost-efficient intelligence,

    ——, “GPT-4o mini: Advancing cost-efficient intelligence,” https: //openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/, 2024, accessed 2026-06-13

  18. [18]

    Claude haiku 4.5,

    Anthropic, “Claude haiku 4.5,” https://www.anthropic.com/claude/haiku, 2025, model claude-haiku-4-5; accessed 2026-06-13

  19. [19]

    Probable inference, the law of succession, and statistical inference,

    E. B. Wilson, “Probable inference, the law of succession, and statistical inference,”Journal of the American Statistical Association, vol. 22, no. 158, pp. 209–212, 1927