Phone-use agents avoid harm more often through inability to act than through deliberate safe choices, so benchmarks must separate unsafe judgment from capability failure.
Ghostei-bench: Do mobile agents resilience to environmental injection in dynamic on-device environments?
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 3verdicts
UNVERDICTED 3roles
background 1polarities
support 1representative citing papers
The paper develops a unified framework that organizes computer-use agent reliability around perception-decision-execution layers and creation-deployment-operation-maintenance stages to map security and alignment interventions.
Safactory integrates three platforms for simulation, data management, and agent evolution to create a unified pipeline for training trustworthy autonomous AI.
citing papers explorer
-
Safe, or Simply Incapable? Rethinking Safety Evaluation for Phone-Use Agents
Phone-use agents avoid harm more often through inability to act than through deliberate safe choices, so benchmarks must separate unsafe judgment from capability failure.
-
Securing Computer-Use Agents: A Unified Architecture-Lifecycle Framework for Deployment-Grounded Reliability
The paper develops a unified framework that organizes computer-use agent reliability around perception-decision-execution layers and creation-deployment-operation-maintenance stages to map security and alignment interventions.
-
Safactory: A Scalable Agentic Infrastructure for Training Trustworthy Autonomous Intelligence
Safactory integrates three platforms for simulation, data management, and agent evolution to create a unified pipeline for training trustworthy autonomous AI.