Plentiful jailbreaks with string compositions

2024 · arXiv 2411.01084

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

FinRED: An Expert-Guided Benchmark Generation and Evaluation Framework for Financial LLM Red-Teaming

cs.CR · 2026-06-18 · unverdicted · novelty 7.0

FinRED creates an expert-validated benchmark and rubric for financial LLM safety that maps regulatory standards to specific threats and reduces critical false negatives in evaluation from 28 to 12.

Exploiting Web Search Tools of AI Agents for Data Exfiltration

cs.CR · 2025-10-10 · unverdicted · novelty 4.0

Indirect prompt injection attacks remain effective on LLMs using web search tools, allowing data exfiltration and exposing ongoing weaknesses in current model defenses.

citing papers explorer

FinRED: An Expert-Guided Benchmark Generation and Evaluation Framework for Financial LLM Red-Teaming

cs.CR · 2026-06-18 · unverdicted · none · ref 46

FinRED creates an expert-validated benchmark and rubric for financial LLM safety that maps regulatory standards to specific threats and reduces critical false negatives in evaluation from 28 to 12.

Exploiting Web Search Tools of AI Agents for Data Exfiltration

cs.CR · 2025-10-10 · unverdicted · none · ref 25

Indirect prompt injection attacks remain effective on LLMs using web search tools, allowing data exfiltration and exposing ongoing weaknesses in current model defenses.