pith. sign in

Instructional segment embedding: Improving llm safety with instruction hierarchy

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

citation-role summary

background 1

citation-polarity summary

years

2026 4

verdicts

UNVERDICTED 4

roles

background 1

polarities

background 1

clear filters

representative citing papers

Many-Tier Instruction Hierarchy in LLM Agents

cs.CL · 2026-04-10 · unverdicted · novelty 7.0

ManyIH and ManyIH-Bench address instruction conflicts in LLM agents with up to 12 privilege levels across 853 tasks, revealing frontier models achieve only ~40% accuracy.

Agent Security is a Systems Problem

cs.CR · 2026-05-18 · unverdicted · novelty 4.0 · 2 refs

The paper argues that agent security is best addressed as a systems problem by applying principles from operating systems, networks, and formal methods rather than relying solely on model robustness improvements.

Security Considerations for Artificial Intelligence Agents

cs.LG · 2026-03-12 · unverdicted · novelty 3.0

Frontier AI agents introduce new confidentiality, integrity, and availability risks through changed assumptions on code-data separation and authority boundaries, requiring layered defenses like sandboxing and policy enforcement.

citing papers explorer

Showing 1 of 1 citing paper after filters.

  • Many-Tier Instruction Hierarchy in LLM Agents cs.CL · 2026-04-10 · unverdicted · none · ref 27

    ManyIH and ManyIH-Bench address instruction conflicts in LLM agents with up to 12 privilege levels across 853 tasks, revealing frontier models achieve only ~40% accuracy.