pith. machine review for the scientific record.

JailbreakBench: An open robustness benchmark for jailbreaking large language models

17 Pith papers cite this work. Polarity classification is still indexing.

hub tools

citation-role summary

background 1

citation-polarity summary

roles

background 1

polarities

background 1

representative citing papers

Towards an AI co-scientist

cs.AI · 2025-02-26 · unverdicted · novelty 6.0

A multi-agent AI system generates novel biomedical hypotheses that show promising experimental validation in drug repurposing for leukemia, new targets for liver fibrosis, and a bacterial gene transfer mechanism.

Cross-Lingual Jailbreak Detection via Semantic Codebooks

cs.CL · 2026-04-28 · unverdicted · novelty 5.0

Semantic similarity to an English jailbreak codebook detects cross-lingual attacks with high accuracy on curated benchmarks but shows poor separability on diverse unsafe prompts.
