Title resolution pending

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

Open-Weight LLM Fine-Tuning Defenses are Susceptible to Simple Attacks

cs.LG · 2026-05-26 · conditional · novelty 5.0

Abliteration and prefilling attacks raise harm success rates on safeguarded open-weight LLMs from below 10% to 16-96% across three benchmarks, and a new ART tuning method reduces those rates by 10-20%.

citing papers explorer

Showing 1 of 1 citing paper.

Open-Weight LLM Fine-Tuning Defenses are Susceptible to Simple Attacks · cs.LG · 2026-05-26 · conditional · none · ref 19
