Skillprobe: Security auditing for emerging agent skill marketplaces via multi-agent collaboration

Zihan Guo, Zhiyu Chen, Xiaohang Nie, Jianghao Lin, Yuanjian Zhou, Weinan Zhang · 2026 · arXiv 2603.21019

Canonical reference. 80% of classified citations from Pith papers (4 of 5) cite this work as background.

19 Pith papers citing it

citation-role summary: background 4 · baseline 1

citation-polarity summary: background 4 · baseline 1

representative citing papers

MalSkillBench: A Runtime-Verified Benchmark of Malicious Agent Skills

cs.CR · 2026-06-05 · unverdicted · novelty 8.0

MalSkillBench supplies the first sandbox-verified dataset of malicious agent skills and shows that existing detectors achieve high recall on code injection but collapse on prompt injection and agent-control attacks.

HarmfulSkillBench: How Do Harmful Skills Weaponize Your Agents?

cs.CR · 2026-04-16 · unverdicted · novelty 8.0

Harmful skills in open agent ecosystems raise average harm scores from 0.27 to 0.76 across six LLMs by lowering refusal rates when tasks are presented via pre-installed skills.

Behind EvoMap: Characterizing a Self-Evolving Agent-to-Agent Collaboration Network

cs.AI · 2026-05-25 · unverdicted · novelty 7.0

An empirical study of EvoMap shows that 98% of assets are never reused, scores are driven by self-reported metadata, and 84% of assets use vacuous validation tests.

Skills on the Fly: Test-Time Adaptive Skill Synthesis for LLM Agents

cs.CL · 2026-05-16 · unverdicted · novelty 7.0

SkillTTA synthesizes temporary task-specific skills from retrieved training trajectories to boost LLM agent Pass@1 scores on SpreadsheetBench and BigCodeBench without parameter updates.

SMMBench: A Benchmark for Source-Distributed Multimodal Agent Memory

cs.CL · 2026-05-15 · unverdicted · novelty 7.0

SMMBench is a benchmark evaluating multimodal agents on cross-source reasoning, conflict resolution, preference reasoning, and action prediction, showing current systems struggle with evidence distributed across heterogeneous sources.

No Attack Required: Semantic Fuzzing for Specification Violations in Agent Skills

cs.CR · 2026-05-13 · unverdicted · novelty 7.0

Sefz discovers specification violations in 29.9% of 402 real-world agent skills by translating guardrails into reachability goals and guiding LLM mutations with a multi-armed bandit.

Proteus: A Self-Evolving Red Team for Agent Skill Ecosystems

cs.CR · 2026-05-12 · unverdicted · novelty 7.0

Proteus demonstrates that adaptive red-teaming achieves 40-90% attack success after five rounds and bypasses even strong auditors at up to 41% joint success, revealing that static skill vetting underestimates residual risk.

Trust Me, Import This: Dependency Steering Attacks via Malicious Agent Skills

cs.CR · 2026-05-10 · unverdicted · novelty 7.0

Malicious skills induce coding agents to hallucinate and import attacker-controlled packages at high rates while evading detection.

Sealing the Audit-Runtime Gap for LLM Skills

cs.CR · 2026-05-06 · unverdicted · novelty 7.0

SIGIL cryptographically seals the audit-runtime gap for LLM skills via an on-chain registry with four publication types, DAO vetting, and a runtime verification loader that enforces integrity and permissions.

Runtime Skill Audit: Targeted Runtime Probing for Agent Skill Security

cs.CR · 2026-06-10 · unverdicted · novelty 6.0

Runtime Skill Audit introduces targeted runtime probing to detect malicious LLM agent skills, reporting 90% accuracy and resilience to self-evolving attacks on 100 skills versus static baselines.

Red-Teaming Agent Execution Contexts: Open-World Security Evaluation on OpenClaw

cs.CR · 2026-05-11 · unverdicted · novelty 6.0

DeepTrap automates discovery of contextual vulnerabilities in OpenClaw agents via trajectory optimization, showing that unsafe behavior can be induced while preserving task completion and that final-response checks are insufficient.

Position: Academic Conferences are Potentially Facing Denominator Gaming Caused by Fully Automated Scientific Agents

cs.CL · 2026-05-11 · unverdicted · novelty 6.0

Malicious actors could use AI agents to submit large numbers of fake papers, inflating the submission count and thereby raising the acceptance odds for a small set of chosen legitimate papers under stable conference acceptance rates.

SkillScope: Toward Fine-Grained Least-Privilege Enforcement for Agent Skills

cs.CR · 2026-05-07 · unverdicted · novelty 6.0

SkillScope detects over-privileged LLM agent skills with 94.53% F1 score via graph analysis and replay validation, finding 7,039 problematic skills in the wild and reducing violations by 88.56% while preserving task completion.

ClawHub Security Signals: When VirusTotal, Static Analysis, and SkillSpector Disagree

cs.CR · 2026-05-31 · accept · novelty 5.0

Analysis of 67,453 OpenClaw skills shows three scanners overlap on at most 10.4% of combined positives, with 81.9% flagged by only one scanner and distinct profiles for malicious versus suspicious skills.

Benchmarking Security Risk Detection and Verification in Open Agentic Skill Ecosystems

cs.CR · 2026-05-30 · unverdicted · novelty 5.0

SkillVetBench is a two-stage benchmark combining natural-language semantic vetting and instrumented sandbox execution to detect and provide runtime evidence for malicious skills in open agent platforms, with experiments showing static methods miss up to 89% of threats.

Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering

cs.SE · 2026-04-09 · accept · novelty 5.0

LLM agent progress depends on externalizing cognitive functions into memory, skills, protocols, and harness engineering that coordinates them reliably.

Responsible Agentic AI Requires Explicit Provenance

cs.AI · 2026-05-16 · unverdicted · novelty 4.0

Explicit provenance across the full agentic AI lifecycle is the necessary condition for making responsibility computable and actionable.

Skills as Verifiable Artifacts: A Trust Schema and a Biconditional Correctness Criterion for Human-in-the-Loop Agent Runtimes

cs.CR · 2026-05-01 · unverdicted · novelty 4.0

Proposes a trust schema including verification levels and a biconditional correctness criterion to verify skills in human-in-the-loop agent runtimes, reducing the need for constant oversight.

SkillSieve: A Hierarchical Triage Framework for Detecting Malicious AI Agent Skills

cs.CR · 2026-04-08

citing papers explorer

Showing 3 of 3 citing papers after filters.

Skills on the Fly: Test-Time Adaptive Skill Synthesis for LLM Agents

cs.CL · 2026-05-16 · unverdicted · none · ref 7

SkillTTA synthesizes temporary task-specific skills from retrieved training trajectories to boost LLM agent Pass@1 scores on SpreadsheetBench and BigCodeBench without parameter updates.

SMMBench: A Benchmark for Source-Distributed Multimodal Agent Memory

cs.CL · 2026-05-15 · unverdicted · none · ref 4

SMMBench is a benchmark evaluating multimodal agents on cross-source reasoning, conflict resolution, preference reasoning, and action prediction, showing current systems struggle with evidence distributed across heterogeneous sources.

Position: Academic Conferences are Potentially Facing Denominator Gaming Caused by Fully Automated Scientific Agents

cs.CL · 2026-05-11 · unverdicted · none · ref 6

Malicious actors could use AI agents to submit large numbers of fake papers, inflating the submission count and thereby raising the acceptance odds for a small set of chosen legitimate papers under stable conference acceptance rates.
