{"total":11,"items":[{"citing_arxiv_id":"2606.10749","ref_index":242,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Toward Secure LLM Agents: Threat Surfaces, Attacks, Defenses, and Evaluation","primary_cat":"cs.CR","submitted_at":"2026-06-09T12:01:07+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":3.0,"formal_verification":"none","one_line_summary":"A synthesis of 247 papers on LLM agent security identifies prompt injection and tool hijacking as dominant threats, notes weakly compositional defenses, and argues for trust boundaries and realistic evaluations.","context_count":1,"top_context_role":"background","top_context_polarity":"unclear","context_text":"ClawGuard: A Runtime Security Framework for Tool-Augmented LLM Agents Against Indirect Prompt Injection. arXiv:2604.11790 [cs.CR] doi:10.48550/arXiv.2604.11790 [241] Weibo Zhao, Jiahao Liu, Bonan Ruan, Shaofei Li, and Zhenkai Liang. 2025. When MCP Servers Attack: Taxonomy, Feasibility, and Mitigation. arXiv:2509.24272 [cs.CR] doi:10.48550/arXiv.2509.24272 [242] Can Zheng, Yuhan Cao, Xiaoning Dong, and Tianxing He. 2025. Demonstrations of Integrity Attacks in Multi-Agent Systems. arXiv:2506.04572 [cs.CL] doi:10.48550/arXiv.2506.04572 [243] Peter Yong Zhong, Siyuan Chen, Ruiqi Wang, McKenna McCall, Ben L. Titzer, Heather Miller, and Phillip B. Gibbons. 2025. 
RTBAS: Defending LLM Agents Against Prompt Injection and Privacy Leakage."},{"citing_arxiv_id":"2606.09084","ref_index":8,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Context-Fractured Decomposition Attacks on Tool-Using LLM Agents: Exploiting Artifact Provenance Gaps","primary_cat":"cs.CR","submitted_at":"2026-06-08T06:29:42+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Introduces Context-Fractured Decomposition (CFD) attacks exploiting provenance gaps in tool-using LLM agents to raise jailbreak success rates by up to 28.3 percentage points over baselines.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.19328","ref_index":29,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"RoboJailBench: Benchmarking Adversarial Attacks and Defenses in Embodied Robotic Agents","primary_cat":"cs.CR","submitted_at":"2026-05-19T04:07:24+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"RoboJailBench creates a taxonomy-based benchmark, intent-contrast datasets, and evaluation framework for jailbreak attacks and defenses in embodied robotic AI systems.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.05340","ref_index":46,"ref_count":2,"confidence":0.9,"is_internal_anchor":true,"paper_title":"How Far Are VLMs from Privacy Awareness in the Physical World? 
An Empirical Study","primary_cat":"cs.CR","submitted_at":"2026-05-06T18:10:51+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Vision-language models exhibit perceptual fragility and fail to consistently respect privacy constraints when operating in simulated physical environments, with performance declining in cluttered scenes and under conflicting commands.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"what constitutes a privacy violation. This physically grounded autonomy introduces profound and underexplored privacy risks [41, 8, 28, 44]. While a digital model's alignment dictates what it should or should not generate in natural language [25, 34], an embodied VLM's alignment must govern what it is permitted to observe, infer, and manipulate in the physical world [46, 45, 19, 9]. Previous efforts to measure physical-world privacy awareness have highlighted this critical vulnerability but remain fundamentally constrained by their unimodal, static simulation environments. ∗Equal contribution. †Corresponding author. Project page: https://immersed-privacy.github.io Preprint. 
arXiv:2605.05340v2 [cs.CR] 8 May 2026"},{"citing_arxiv_id":"2604.27267","ref_index":21,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"From Prompt to Physical Actuation: Holistic Threat Modeling of LLM-Enabled Robotic Systems","primary_cat":"cs.CR","submitted_at":"2026-04-29T23:44:07+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":8.0,"formal_verification":"none","one_line_summary":"A unified threat model for LLM-enabled robots reveals three cross-boundary attack chains from user input to unsafe physical actuation due to missing validations and unmediated crossings.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.23775","ref_index":93,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Vision-Language-Action Safety: Threats, Challenges, Evaluations, and Mechanisms","primary_cat":"cs.RO","submitted_at":"2026-04-26T15:58:19+00:00","verdict":"ACCEPT","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"A literature survey that unifies fragmented work on attacks, defenses, evaluations, and deployment challenges for Vision-Language-Action models in robotics.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"physical control space. In black-box scenarios, adversaries construct natural language contexts to bypass safety alignment. For instance, RoboPAIR confirms high success rates for semantic deception across varying permission settings [56]. BadRobot further identifies the underlying architectural flaw of these attacks as the \"Output-Action Mismatch\" [93]. 
Given the autoregressive nature of VLAs, where language and action tokens share a unified vocabulary yet possess decoupled probability masses, a model processing an adversarial prompt p_adv and observation o may exhibit a high sequence probability for a safe linguistic refusal (y_safe) while the marginal probability mass for unsafe physical actions (A_unsafe) simultaneously exceeds the execution threshold γ:"},{"citing_arxiv_id":"2604.18463","ref_index":54,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Using large language models for embodied planning introduces systematic safety risks","primary_cat":"cs.AI","submitted_at":"2026-04-20T16:18:08+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"LLM planners for robots often produce dangerous plans even when planning succeeds, with safety awareness staying flat as model scale improves planning ability.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Adversarial robustness has revealed critical vulnerabilities in LLM-controlled robots. RoboPAIR achieves 100% attack success across white-box, gray-box, and black-box settings, demonstrating that chatbot safety alignment does not transfer to embodied safety [53]. BadRobot presents attacks through voice-based interactions exploiting misalignment between linguistic outputs and physical actions [54]. Studies have also shown LLM-driven robots risk enacting discrimination and unlawful actions [55]. These findings underscore the need for embodied-specific safety evaluation. 
Context-aware safety guardrails. A complementary line of work argues that aligning robotic foundation models is insufficient on its own, and that layered, context-aware runtime guardrails are needed"},{"citing_arxiv_id":"2604.11174","ref_index":46,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"EmbodiedGovBench: A Benchmark for Governance, Recovery, and Upgrade Safety in Embodied Agent Systems","primary_cat":"cs.RO","submitted_at":"2026-04-13T08:34:04+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"EmbodiedGovBench is a new benchmark framework that measures embodied agent systems on seven governance dimensions including policy adherence, recovery success, and upgrade safety.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"tradition by adding a governance evaluation layer. Embodied safety benchmarks. Recent work has begun to evaluate safety-related properties of embodied agents. SafeAgentBench [43] benchmarks safe task planning of embodied LLM agents, AGENTSAFE [44] evaluates safety under hazardous instructions, IS-Bench [45] measures interactive safety of VLM-driven agents, BadRobot [46] studies jailbreaking of embodied LLMs, Agent-SafetyBench [47] provides a broad safety evaluation suite for LLM agents, Huang et al. [48] propose a framework for benchmarking task-planning safety alignment, and Wu et al. [49] introduce EARBench for evaluating physical risk awareness in embodied AI. Afzal et al. 
[50] survey challenges of testing robotic systems, highlight-"},{"citing_arxiv_id":"2604.19790","ref_index":51,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Hidden Reliability Risks in Large Language Models: Systematic Identification of Precision-Induced Output Disagreements","primary_cat":"cs.AI","submitted_at":"2026-04-02T03:38:47+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"PrecisionDiff is a differential testing framework that uncovers widespread precision-induced behavioral disagreements in aligned LLMs, including safety-critical jailbreak divergences across precision formats.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. 34952-34964. [50] Hangtao Zhang, Chenyu Zhu, Xianlong Wang, Ziqi Zhou, Changgan Yin, Minghui Li, Lulu Xue, Yichen Wang, Shengshan Hu, Aishan Liu, et al. 2024. Badrobot: Jailbreaking embodied llms in the physical world. arXiv preprint arXiv:2407.20242 (2024). [51] Jiawen Zhang, Kejia Chen, Lipeng He, Jian Lou, Dan Li, Zunlei Feng, Mingli Song, Jian Liu, Kui Ren, and Xiaohu Yang. 2025. Activation Approximations Can Incur Safety Vulnerabilities in Aligned LLMs: Comprehensive Analysis and Defense. In 34th USENIX Security Symposium (USENIX Security 25). 339-358. 
[52] Kunpeng Zhang, Shuai Wang, Jitao Han, Xiaogang Zhu, Xian Li, Shaohua Wang,"},{"citing_arxiv_id":"2604.09651","ref_index":42,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"FlowHijack: A Dynamics-Aware Backdoor Attack on Flow-Matching Vision-Language-Action Models","primary_cat":"cs.CV","submitted_at":"2026-03-30T03:54:04+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":8.0,"formal_verification":"none","one_line_summary":"FlowHijack is the first dynamics-aware backdoor attack on flow-matching VLAs that achieves high success rates with stealthy triggers while preserving benign performance and making malicious actions kinematically indistinguishable from normal ones.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2512.07765","ref_index":121,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Toward Seamless Physical Human-Humanoid Interaction: Insights from Control, Intent, and Modeling with a Vision for What Comes Next","primary_cat":"cs.RO","submitted_at":"2025-12-08T17:47:19+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"A literature review of pHHI that proposes a taxonomy of interaction types by modality and engagement level while outlining pathways to integrate control, intent, and modeling for more seamless humanoid-human collaboration.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":", Luo, Z., He, X., Xiao, W., Zhang, C., Zhang, W., Kitani, K., Liu, C., Shi, G.: Omnih2o: Universal and dexterous human-to-humanoid whole-body teleoperation and learning. arXiv preprint arXiv:2406.08858 (2024) [120] Robey, A., Ravichandran, Z., Kumar, V., Hassani, H., Pappas, G.J.: Jailbreaking LLM-controlled robots. 
arXiv preprint arXiv:2410.13691 (2024) [121] Zhang, H., Zhu, C., Wang, X., Zhou, Z., Hu, S., Zhang, L.Y.: Badrobot: Jailbreaking LLM-based embodied AI in the physical world. arXiv preprint arXiv:2407.20242 (2024) [122] Westervelt, E.R., Grizzle, J.W., Koditschek, D.E.: Hybrid zero dynamics of planar biped walkers. IEEE transactions on automatic control 48(1), 42-56 (2003) [123] Westervelt, E."}],"limit":50,"offset":0}