Available at: https://datatracker.ietf.org/doc/html/rfc6749
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
CHAP is a new protocol that turns human edits, approvals, and handoffs with AI agents into structured, signed, replayable events, using a minimal core and optional profiles.
Citing papers explorer
- CrypFormBench: Benchmarking Formal Analysis Capability of Large Language Models for Cryptographic Schemes
  CrypFormBench is a new benchmark that jointly covers symbolic and computational security to evaluate LLMs on five formal analysis capabilities. Results show that the top model, Claude-3.5, scores 48.7/100 and that most models struggle on generation, transformation, and correction.
- Collaborative Human-Agent Protocol (CHAP)
  CHAP is a new protocol that turns human edits, approvals, and handoffs with AI agents into structured, signed, replayable events, using a minimal core and optional profiles.