Instructional Segment Embedding: Improving LLM Safety with Instruction Hierarchy

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

years

2026 4

verdicts

UNVERDICTED 4

representative citing papers

Many-Tier Instruction Hierarchy in LLM Agents

cs.CL · 2026-04-10 · unverdicted · novelty 7.0

ManyIH and ManyIH-Bench address instruction conflicts in LLM agents with up to 12 privilege levels across 853 tasks, revealing frontier models achieve only ~40% accuracy.

Agent Security is a Systems Problem

cs.CR · 2026-05-18 · unverdicted · novelty 4.0 · 2 refs

The paper argues that agent security is best addressed as a systems problem by applying principles from operating systems, networks, and formal methods rather than relying solely on model robustness improvements.

Security Considerations for Artificial Intelligence Agents

cs.LG · 2026-03-12 · unverdicted · novelty 3.0

Frontier AI agents introduce new confidentiality, integrity, and availability risks through changed assumptions on code-data separation and authority boundaries, requiring layered defenses like sandboxing and policy enforcement.

citing papers explorer

Showing 4 of 4 citing papers.

  • The Granularity Mismatch in Agent Security: Argument-Level Provenance Solves Enforcement and Isolates the LLM Reasoning Bottleneck cs.CR · 2026-05-11 · unverdicted · none · ref 28

    PACT achieves perfect security and utility under oracle provenance by enforcing argument-level trust contracts based on semantic roles and cross-step provenance tracking, outperforming invocation-level monitors in AgentDojo evaluations.

  • Many-Tier Instruction Hierarchy in LLM Agents cs.CL · 2026-04-10 · unverdicted · none · ref 27

    ManyIH and ManyIH-Bench address instruction conflicts in LLM agents with up to 12 privilege levels across 853 tasks, revealing frontier models achieve only ~40% accuracy.

  • Agent Security is a Systems Problem cs.CR · 2026-05-18 · unverdicted · none · ref 76 · 2 links

    The paper argues that agent security is best addressed as a systems problem by applying principles from operating systems, networks, and formal methods rather than relying solely on model robustness improvements.

  • Security Considerations for Artificial Intelligence Agents cs.LG · 2026-03-12 · unverdicted · none · ref 51

    Frontier AI agents introduce new confidentiality, integrity, and availability risks through changed assumptions on code-data separation and authority boundaries, requiring layered defenses like sandboxing and policy enforcement.