Is safety standard same for everyone? user-specific safety evaluation of large language models.arXiv preprint arXiv:2502.15086,

Yeonjun In, Wonjoong Kim, Kanghoon Yoon, Sungchul Kim, Mehrab Tanjim, Kibum Kim, Chanyoung Park · 2025 · arXiv 2502.15086

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

read on arXiv browse 3 citing papers

citation-role summary

background 2

citation-polarity summary

support 2

representative citing papers

Beyond the Final Answer: Evaluating the Reasoning Trajectories of Tool-Augmented Agents

cs.AI · 2025-10-03 · unverdicted · novelty 6.0

TRACE is a reference-free multi-dimensional evaluation framework for tool-augmented LLM reasoning trajectories that uses an evidence bank and is validated on a new meta-evaluation dataset of flawed trajectories.

Beyond Context: Large Language Models' Failure to Grasp Users' Intent

cs.AI · 2025-12-24 · unverdicted · novelty 3.0

LLMs fail to detect hidden harmful intent, allowing systematic bypass of safety mechanisms through framing techniques, with reasoning modes often worsening the issue.

LLM Harms: A Taxonomy and Discussion

cs.CY · 2025-12-05 · unverdicted · novelty 3.0

This paper proposes a taxonomy of LLM harms in five categories and suggests mitigation strategies plus a dynamic auditing system for responsible development.

citing papers explorer

Showing 3 of 3 citing papers.

Beyond the Final Answer: Evaluating the Reasoning Trajectories of Tool-Augmented Agents cs.AI · 2025-10-03 · unverdicted · none · ref 7
TRACE is a reference-free multi-dimensional evaluation framework for tool-augmented LLM reasoning trajectories that uses an evidence bank and is validated on a new meta-evaluation dataset of flawed trajectories.
Beyond Context: Large Language Models' Failure to Grasp Users' Intent cs.AI · 2025-12-24 · unverdicted · none · ref 69
LLMs fail to detect hidden harmful intent, allowing systematic bypass of safety mechanisms through framing techniques, with reasoning modes often worsening the issue.
LLM Harms: A Taxonomy and Discussion cs.CY · 2025-12-05 · unverdicted · none · ref 244
This paper proposes a taxonomy of LLM harms in five categories and suggests mitigation strategies plus a dynamic auditing system for responsible development.

Is safety standard same for everyone? user-specific safety evaluation of large language models.arXiv preprint arXiv:2502.15086,

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer