Guardtrace-vl: Detecting unsafe multimodel reasoning via iterative safety supervision

Yuxiao Xiang, Junchi Chen, Zhenchao Jin, Changtao Miao, Haojie Yuan, Qi Chu, Tao Gong, Nenghai Yu · 2025 · arXiv 2511.20994

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

BraveGuard: From Open-World Threats to Safer Computer-Use Agents

cs.CR · 2026-05-31 · unverdicted · novelty 5.0

BraveGuard trains guard models on realistic agent trajectories derived from open-world threats, raising detection accuracy on AgentHazard from 38.79% to 82.38%.

citing papers explorer

Showing 1 of 1 citing paper.

BraveGuard: From Open-World Threats to Safer Computer-Use Agents · cs.CR · 2026-05-31 · unverdicted · none · ref 40
