

5 Pith papers cite this work. Polarity classification is still indexing.

Citing papers by year

2026: 2
2025: 3

representative citing papers

Deceive, Detect, and Disclose: Large Language Models Play Mini-Mafia

cs.AI · 2025-09-27 · unverdicted · novelty 7.0

Mini-Mafia supplies an analytical model, logit(p) = v*(m-d), for the mafia win probability in LLM role interactions, and uses Bayesian inference to estimate per-model parameters that predict tournament results with a 76.6% Brier-score improvement over random.
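
As a minimal sketch of what the formula in this summary implies, the logit can be inverted with the logistic function to recover the win probability p. The summary does not define v, m, or d, so the code treats them as opaque per-model parameters. The function name `win_probability` and the sample values are hypothetical and are not taken from the paper.

```python
import math


def win_probability(v: float, m: float, d: float) -> float:
    """Invert logit(p) = v * (m - d) to obtain p.

    v, m, and d are the symbols from the summary; their meaning is not
    defined there, so they are treated as plain numbers (an assumption).
    """
    return 1.0 / (1.0 + math.exp(-v * (m - d)))


# Hypothetical parameter values, chosen only to illustrate the shape.
balanced = win_probability(v=1.0, m=0.3, d=0.3)   # m == d, so logit(p) = 0
favoured = win_probability(v=1.0, m=1.0, d=0.0)   # m > d with v > 0
print(balanced, favoured)
```

Under this form, m = d gives p = 0.5 for any v, and for positive v the probability rises as m - d grows.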

Training a General Purpose Automated Red Teaming Model

cs.CR · 2026-04-24 · unverdicted · novelty 6.0

A pipeline trains general-purpose red teaming models by fine-tuning small LLMs such as Qwen3-8B to generate attacks for both seen and unseen adversarial objectives, without relying on existing evaluators.
