Title resolution pending

Chowdhury, A · 2024 · arXiv 2403.04786

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Reasoning-targeted Jailbreak Attacks on Large Reasoning Models via Semantic Triggers and Psychological Framing

cs.LG · 2026-04-17 · unverdicted · novelty 6.0

PRJA achieves 83.6% average success injecting harmful content into LRM reasoning chains on five QA datasets without altering final answers.

Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey

cs.CR · 2024-09-26 · unverdicted · novelty 2.0

Survey of harmful fine-tuning attacks on LLMs, their variants, defense strategies, mechanical analysis, and evaluation methodologies.

citing papers explorer

Showing 2 of 2 citing papers.

Reasoning-targeted Jailbreak Attacks on Large Reasoning Models via Semantic Triggers and Psychological Framing cs.LG · 2026-04-17 · unverdicted · none · ref 37
PRJA achieves 83.6% average success injecting harmful content into LRM reasoning chains on five QA datasets without altering final answers.
Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey cs.CR · 2024-09-26 · unverdicted · none · ref 26
Survey of harmful fine-tuning attacks on LLMs, their variants, defense strategies, mechanical analysis, and evaluation methodologies.

Title resolution pending

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer