When Safe Skills Collide: Measuring Compositional Risk in Agent Skill Ecosystems

Haoran Yu; Jingzhou Xu; Junxian You; Lifei Liu; Pin Qian; Su Wang; Xiaochong Jiang; Xiaoyuan Wang; Yihang Chen

arxiv: 2606.00448 · v1 · pith:5R5BMNXVnew · submitted 2026-05-30 · 💻 cs.SE · cs.AI· cs.CR

When Safe Skills Collide: Measuring Compositional Risk in Agent Skill Ecosystems

Su Wang , Pin Qian , Yihang Chen , Junxian You , Xiaoyuan Wang , Xiaochong Jiang , Lifei Liu , Haoran Yu

show 1 more author

Jingzhou Xu

This is my paper

Pith reviewed 2026-06-28 18:44 UTC · model grok-4.3

classification 💻 cs.SE cs.AIcs.CR

keywords compositional riskagent skillsLLM agentssafety measurementskill compositionexploitabilitystatic analysisinstall-time checks

0 comments

The pith

Individually safe skills form genuine compositional risks in agent systems, with 18.2% of flagged pairs validated as real.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines whether skills that pass individual safety checks can still combine to enable unsafe behaviors in LLM agents. It introduces SkillReact, a framework that first applies a deterministic benchmark to flag structural pair patterns across over 200,000 combinations from a large skill registry. Human adjudication then calibrates the raw flag rate, showing that roughly one in five candidates represents an actual risk. Experiments with an action harness demonstrate that models vary in whether they actually issue the risky tool calls, with some completing full chains and others refusing. The results indicate that per-skill inspection alone leaves thousands of risks undetected by design.

Core claim

On 1,520 ClawHub skills, 651 pass individual inspection and form 211,575 pairs; the benchmark flags 22.25% as structural candidates. A pattern-stratified audit finds a population-weighted validity of 18.2%, implying about 14K genuine risk memberships missed by per-skill scanning. The action-based harness shows realization is gated by host-model disposition: Haiku-4-5 issues the full download-then-execute chain on 36 of 39 trials, Opus-4-7 stops at download, and Sonnet-4-6 refuses. A control holding the request fixed finds highest compliance with no skills installed.

What carries the argument

SkillReact, a compositional security measurement framework with a deterministic static-composition benchmark to flag pair-pattern hits, a two-rater LLM-assisted human-adjudication pipeline to validate them, and an action-based exploitability harness to test model-issued tool calls.

If this is right

Per-skill scanning misses compositional risks by construction because every pair is individually safe.
Whether a compositional risk becomes an executed tool call depends on the host model's disposition.
Compliance with risky requests is highest when no skills are installed.
Install-time compositional checks are needed as complements to per-skill scanning.
Capability isolation can limit which combined capabilities remain reachable.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Skill registries could add automated pair-wise checks at install time to catch risks the current per-skill process misses.
The 18.2% validity rate offers a starting baseline for estimating hidden risks in other agent skill collections.
Safety outcomes depend on the interaction between installed skills and the specific underlying model, suggesting model-specific testing of skill sets.
Extending the harness to additional models could map which model families are more or less prone to executing compositional exploits.

Load-bearing premise

The two-rater LLM-assisted human-adjudication pipeline accurately identifies which structural candidates are genuine compositional risks rather than false positives.

What would settle it

Independent re-adjudication of the same pattern-stratified sample of flagged pairs by new human raters yielding a validity rate substantially different from 18.2%.

Figures

Figures reproduced from arXiv: 2606.00448 by Haoran Yu, Jingzhou Xu, Junxian You, Lifei Liu, Pin Qian, Su Wang, Xiaochong Jiang, Xiaoyuan Wang, Yihang Chen.

**Figure 1.** Figure 1: SkillReact in one view. (a) Per-skill safe ̸= set-safe: the same capability policy—flag an object O iff {FR, NO} ⊆ Cap(O) (file read ∧ network out, one of the 10 forbidden patterns)—clears each skill individually (Cap(A) = {FR}, Cap(B) = {NO}: both PASS), yet flags their co-installed union (Cap(S) = {FR, NO}, so {FR, NO} ⊆ Cap(S): FLAG); only the security object changes (skill → installed set), not the rul… view at source ↗

**Figure 2.** Figure 2: Distribution of capabilities across 1,520 analyzed [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗

**Figure 3.** Figure 3: Capability co-occurrence heatmap across 1,520 [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗

read the original abstract

LLM agents increasingly rely on community-contributed skills that expand an agent's operational capability set. We study a core safety problem in agentic AI systems: whether individually safe skills can compose into unsafe installed skill sets. We present SkillReact, a compositional security measurement framework with three components: a deterministic static-composition benchmark, a two-rater LLM-assisted human-adjudication pipeline, and an action-based exploitability harness. On 1,520 ClawHub skills, 651 pass individual inspection and form 211,575 pairs; the benchmark flags 22.25% of these as structural candidates. We treat this raw rate as a recall-oriented scanner ceiling and calibrate it against human judgment: in a pattern-stratified audit, roughly one in five flagged pair-pattern hits survives as a real compositional risk (population-weighted validity 18.2%, our headline result), implying about 14K genuine risk memberships in a single registry that per-skill scanning misses by construction, since every pair is individually safe. An action-based harness then probes when these candidates become model-issued tool calls, and finds realization gated by host-model disposition: on an anchor-conditioned dropper subset, Haiku-4-5 issues the dropper-stage tool call on all 39 direct-prompt trials (36 of them the full download-then-execute chain, 3 download-only), Opus-4-7 stops at the download, and Sonnet-4-6 refuses outright. A control that holds the request fixed and varies only the installed skills finds compliance highest with no skills installed: a composition fixes which capabilities are reachable, while the host model decides whether to use them. Together these motivate install-time compositional checks and capability isolation as complements to per-skill scanning.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper measures a real gap in per-skill safety checks for agent skill registries, with 18% of flagged pairs holding up as compositional risks, but the number rests on an adjudication step that lacks reported agreement stats.

read the letter

Hi,

The main thing to know is that this work puts a number on compositional risk that per-skill scanning misses by construction: after scanning 211k pairs from 1520 ClawHub skills, their audit finds 18.2% validity in the flagged structural candidates, which they scale to roughly 14k genuine risk memberships in one registry.

What the paper actually does is build SkillReact with three pieces—a deterministic static benchmark that flags 22.25% of safe-skill pairs, a two-rater LLM-assisted adjudication to calibrate that rate, and an action harness that tests when the risks turn into model-issued calls. The model results are concrete: on the dropper subset, Haiku completes the full chain on most trials, Opus stops at download, and Sonnet refuses. The control that fixes the request and varies only the installed skills shows compliance drops once skills are present, which separates the composition effect from the model disposition.

The framework is straightforward and the scale of the pair analysis is useful for anyone thinking about skill registries. The static benchmark and harness give reproducible starting points.

The soft spot is the adjudication pipeline that produces the 18.2% headline. The abstract mentions calibration against human judgment but gives no inter-rater agreement figures, no rubric, and no breakdown of LLM assistance versus overrides. That step carries the population estimate, so missing those details leaves the central claim harder to assess. The model experiments are also narrow—three models and one dropper subset—so the gating observation is suggestive rather than general.

This is for readers working on agent platforms, skill marketplaces, or safety tooling who need data on install-time checks. It has enough empirical content and a clear measurement goal to deserve a serious referee, provided the audit process gets more documentation.

I'd send it to review with a request for the missing agreement stats and rubric examples.

Best,

Referee Report

1 major / 2 minor

Summary. The paper introduces the SkillReact framework to measure compositional safety risks in LLM agent skill ecosystems, where individually safe skills may form unsafe installed sets. On 1,520 ClawHub skills (651 passing individual checks), it generates 211,575 pairs, flags 22.25% as structural candidates via a deterministic static benchmark, and calibrates via a pattern-stratified two-rater LLM-assisted human audit to a population-weighted validity of 18.2%, implying ~14K genuine risk memberships missed by per-skill scanning. An action-based harness on an anchor-conditioned dropper subset shows model-dependent realization (e.g., Haiku-4-5 issues the full chain on all 39 trials; Opus-4-7 stops at download; Sonnet-4-6 refuses), with a control showing highest compliance with no skills installed.

Significance. If the 18.2% validity holds, the result provides a quantitative, falsifiable estimate of compositional risks that per-skill inspection misses by construction and demonstrates that host-model disposition gates exploitability even when capabilities are installed. The deterministic benchmark and concrete, reproducible harness trials on three specific models are strengths that support follow-up verification.

major comments (1)

[Abstract] Abstract, paragraph on calibration against human judgment: the headline population-weighted validity of 18.2% (and the derived claim of ~14K genuine risk memberships) is produced by the two-rater LLM-assisted human-adjudication pipeline, but the manuscript supplies no inter-rater agreement statistics, explicit adjudication rubric, calibration examples, or breakdown of LLM assistance versus human override. This is load-bearing for the central measurement claim.

minor comments (2)

[Abstract] Abstract: the 'dropper subset' and 'anchor-conditioned' phrasing are introduced without a preceding definition or cross-reference to the methods section.
[Abstract] Abstract: the control experiment ('a control that holds the request fixed and varies only the installed skills') would benefit from an explicit statement of the fixed prompt text used.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their careful reading and for identifying a transparency gap in our presentation of the central calibration result. We address the single major comment below.

read point-by-point responses

Referee: [Abstract] Abstract, paragraph on calibration against human judgment: the headline population-weighted validity of 18.2% (and the derived claim of ~14K genuine risk memberships) is produced by the two-rater LLM-assisted human-adjudication pipeline, but the manuscript supplies no inter-rater agreement statistics, explicit adjudication rubric, calibration examples, or breakdown of LLM assistance versus human override. This is load-bearing for the central measurement claim.

Authors: We agree the omission weakens the manuscript. The two-rater pipeline is described at a high level in Section 4.2, but the abstract and calibration paragraph indeed contain none of the requested details. In the revised version we will (1) insert a concise methods clause into the abstract, (2) add a dedicated subsection (or expanded paragraph) in Section 4.2 that reports inter-rater agreement (Cohen’s κ), (3) reproduce the full adjudication rubric, (4) provide three representative calibration examples that distinguish LLM-assisted suggestions from human overrides, and (5) tabulate the fraction of cases in which the human raters overruled the LLM. These additions will make the 18.2 % population-weighted validity directly verifiable without altering the numerical result itself. revision: yes

Circularity Check

0 steps flagged

No significant circularity; central claim rests on external human adjudication

full rationale

The derivation chain begins with a deterministic static-composition benchmark that flags 22.25% of pairs as structural candidates, then applies an external two-rater LLM-assisted human-adjudication pipeline to a pattern-stratified sample to obtain the population-weighted validity of 18.2%. This validity rate is produced by human judgment rather than any equation, fit, or self-citation internal to the paper. No self-definitional steps, fitted-input predictions, load-bearing self-citations, uniqueness theorems, ansatz smuggling, or renaming of known results appear in the abstract or described methodology. The measurement is therefore self-contained against an external benchmark.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the accuracy of the human adjudication pipeline and the assumption that the ClawHub sample and pattern-stratified audit generalize to the full set of flagged pairs.

axioms (1)

domain assumption The deterministic static-composition benchmark functions as a high-recall scanner whose false-positive rate can be calibrated by human review.
Used to generate the initial 22.25% candidate rate before human calibration.

pith-pipeline@v0.9.1-grok · 5879 in / 1323 out tokens · 25069 ms · 2026-06-28T18:44:04.075434+00:00 · methodology

discussion (0)

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

When AUC 0.998 Is Not Enough: A Candidate Evaluation Protocol for Hidden-State Probes of Indirect Prompt Injection in Multimodal Computer-Use Agents
cs.LG 2026-06 unverdicted novelty 7.0

High AUC from linear probes on model activations for indirect prompt injection does not license an unqualified claim of malicious-content detection, per a Qwen2.5-VL-7B case study with text and visual controls.
Beyond Accuracy: Measuring Bias Acknowledgment in Chain-of-Thought Reasoning for Responsible AI Evaluation
cs.LG 2026-06 unverdicted novelty 6.0

GPT-4o and Claude Sonnet 4 show similar susceptibility to bias on GSM8K (1.3% vs 1.2%) but differ sharply in acknowledgment rates (13% vs 75%) under a rubric-defined metric.
Energy-Efficient On-Device RAG on a Mobile NPU: System Design and Benchmark on Snapdragon X Elite
cs.CL 2026-06 unverdicted novelty 6.0

First end-to-end RAG on mobile NPU delivers 18.1x faster prefilling, 4x lower latency and energy than CPU on Snapdragon X Elite with equivalent quality.

Reference graph

Works this paper leans on

55 extracted references · 29 canonical work pages · cited by 3 Pith papers · 19 internal anchors

[1]

Extend Claude with skills

Anthropic, “Extend Claude with skills.” https://code.claude.com/do cs/en/skills, 2026. Claude Code documentation

2026
[2]

Equipping agents for the real world with Agent Skills

Anthropic, “Equipping agents for the real world with Agent Skills.” https://www.anthropic.com/engineering/equipping-agents-for-the-rea l-world-with-agent-skills, 2025. Anthropic Engineering

2025
[3]

"Do Not Mention This to the User": Detecting and Understanding Malicious Agent Skills in the Wild

Y . Liu, Z. Chen, Y . Zhang, G. Deng, Y . Li, J. Ning, Y . Zhang, and L. Y . Zhang, “Malicious agent skills in the wild: A large-scale security empirical study,”arXiv preprint arXiv:2602.06547, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[4]

Agent Skills in the Wild: An Empirical Study of Security Vulnerabilities at Scale

Y . Liu, W. Wang, R. Feng, Y . Zhang, G. Xu, G. Deng, Y . Li, and L. Zhang, “Agent skills in the wild: An empirical study of security vulnerabilities at scale,”arXiv preprint arXiv:2601.10338, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[5]

SkillSieve: A Hierarchical Triage Framework for Detecting Malicious AI Agent Skills

Y . Hou, Z. Yang, Z. Pang, and X. Ma, “SkillSieve: A hierarchical triage framework for detecting malicious AI agent skills,”arXiv preprint arXiv:2604.06550, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[6]

Formal analysis and supply chain security for agentic AI skills.CoRR, abs/2603.00195, 2026

V . P. Bhardwaj, “Formal analysis and supply chain security for agentic AI skills,”arXiv preprint arXiv:2603.00195, 2026

work page arXiv 2026
[7]

Snyk finds prompt injection in 36%, 1467 malicious payloads in a ToxicSkills study of agent skills supply chain compromise,

L. Beurer-Kellner, A. Kudrinskii, M. Milanta, K. B. Nielsen, H. Sarkar, and L. Tal, “Snyk finds prompt injection in 36%, 1467 malicious payloads in a ToxicSkills study of agent skills supply chain compromise,”Snyk Blog, Feb. 5, 2026

2026
[8]

From Component Manipulation to System Compromise: Understanding and Detecting Malicious MCP Servers

Y . Huanget al., “From component manipulation to system compro- mise: Understanding and detecting malicious MCP servers,”arXiv preprint arXiv:2604.01905, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[9]

Safety is non-compositional: A formal framework for capability-based AI systems,

C. Spera, “Safety is non-compositional: A formal framework for capability-based AI systems,”arXiv preprint arXiv:2603.15973, 2026

work page arXiv 2026
[10]

ClawHub: Skill and plugin registry for OpenClaw

OpenClaw, “ClawHub: Skill and plugin registry for OpenClaw.” http s://github.com/openclaw/clawhub, 2026. Public agent-skill registry

2026
[11]

MindGuard: Intrinsic decision inspection for securing LLM agents against metadata poisoning,

Z. Wang, H. Du, G. Shi, J. Zhang, H. Cheng, Y . Yao, K. Guo, and X.- Y . Li, “MindGuard: Intrinsic decision inspection for securing LLM agents against metadata poisoning,”arXiv preprint arXiv:2508.20412, 2025

work page arXiv 2025
[12]

How Your Credentials Are Leaked by LLM Agent Skills: An Empirical Study

Z. Chen, Y . Zhang, Y . Liu, G. Deng, Y . Li, Y . Zhang, J. Ning, L. Y . Zhang, L. Ma, and Z. Li, “Credential leakage in LLM agent skills: A large-scale empirical study,”arXiv preprint arXiv:2604.03070, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[13]

Prompt injection attacks on agentic coding assistants: A systematic analysis of vulnerabilities in skills, tools, and protocol ecosystems,

N. Maloyan and D. Namiot, “Prompt injection attacks on agentic coding assistants: A systematic analysis of vulnerabilities in skills, tools, and protocol ecosystems,”arXiv preprint arXiv:2601.17548, 2026

work page arXiv 2026
[14]

Prompt injection attack to tool selection in LLM agents,

J. Shi, Z. Yuan, G. Tie, P. Zhou, N. Z. Gong, and L. Sun, “Prompt injection attack to tool selection in LLM agents,”Proceedings of NDSS, 2026

2026
[15]

BadSkill: Backdoor Attacks on Agent Skills via Model-in-Skill Poisoning

G. Tie, J. Shi, P. Zhou, and L. Sun, “BadSkill: Backdoor at- tacks on agent skills via model-in-skill poisoning,”arXiv preprint arXiv:2604.09378, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[16]

Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems

Y . Qu, Y . Liu, T. Geng, G. Deng, Y . Li, L. Y . Zhang, Y . Zhang, and L. Ma, “Supply-chain poisoning attacks against LLM coding agent skill ecosystems,”arXiv preprint arXiv:2604.03081, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[17]

SkillJect: Effectively Automating Skill-Based Prompt Injection for Skill-Enabled Agents

X. Jia, J. Liao, S. Qin, J. Gu, W. Ren, X. Cao, Y . Liu, and P. Torr, “SkillJect: Effectively automating skill-based prompt injection for skill-enabled agents,”arXiv preprint arXiv:2602.14211, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[18]

Noninterference and the composability of security properties,

D. McCullough, “Noninterference and the composability of security properties,” inProceedings of the IEEE Symposium on Security and Privacy, 1988

1988
[19]

A lattice model of secure information flow,

D. E. Denning, “A lattice model of secure information flow,”Com- munications of the ACM, vol. 19, no. 5, pp. 236–243, 1976

1976
[20]

The protection of information in computer systems,

J. H. Saltzer and M. D. Schroeder, “The protection of information in computer systems,”Proceedings of the IEEE, vol. 63, no. 9, pp. 1278– 1308, 1975

1975
[21]

COVERT: Com- positional analysis of Android inter-app permission leakage,

H. Bagheri, A. Sadeghi, J. Garcia, and S. Malek, “COVERT: Com- positional analysis of Android inter-app permission leakage,”IEEE Transactions on Software Engineering, vol. 41, no. 9, 2015

2015
[22]

MR- Droid: A scalable and prioritized analysis of inter-app communication risks,

F. Liu, H. Cai, G. Wang, D. Yao, K. O. Elish, and B. G. Ryder, “MR- Droid: A scalable and prioritized analysis of inter-app communication risks,” inProceedings of the IEEE Mobile Security Technologies Workshop (MoST), 2017

2017
[23]

Agent Tools Orchestration Leaks More: Dataset, Benchmark, and Mitigation

Y . Qiao, D. Liu, H. Yang, W. Zhou, and S. Hu, “Agent tools orchestration leaks more: Dataset, benchmark, and mitigation,”arXiv preprint arXiv:2512.16310, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025
[24]

AgentDojo: A dynamic environment to evaluate prompt injection attacks and defenses for LLM agents,

E. Debenedetti, J. Zhang, M. Balunovic, L. Beurer-Kellner, M. Fis- cher, and F. Tram`er, “AgentDojo: A dynamic environment to evaluate prompt injection attacks and defenses for LLM agents,”Advances in Neural Information Processing Systems, 2024

2024
[25]

Agent security bench (ASB): Formalizing and bench- marking attacks and defenses in LLM-based agents,

H. Zhang, J. Huang, K. Mei, Y . Yao, Z. Wang, C. Zhan, H. Wang, and Y . Zhang, “Agent security bench (ASB): Formalizing and bench- marking attacks and defenses in LLM-based agents,” inInternational Conference on Learning Representations (ICLR), 2025

2025
[26]

InjecAgent: Benchmarking indirect prompt injections in tool-integrated large language model agents,

Q. Zhan, Z. Liang, Z. Ying, and D. Kang, “InjecAgent: Benchmarking indirect prompt injections in tool-integrated large language model agents,”Findings of ACL, 2024

2024
[27]

Defeating Prompt Injections by Design

E. Debenedettiet al., “Defeating prompt injections by design (CaMeL),”arXiv preprint arXiv:2503.18813, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025
[28]

In2025 IEEE Conference on Se- cure and Trustworthy Machine Learning (SaTML), pages 23–42

S. Chennabasappa, C. Nikolaidis, D. Song, D. Molnar, S. Ding, S. Wan, S. Whitman, L. Deason, N. Doucette, A. Montilla, A. Gampa, B. de Paola, D. Gabi, J. Crnkovich, J.-C. Testud, K. He, R. Chaturvedi, W. Zhou, and J. Saxe, “LlamaFirewall: An open source guardrail system for building secure AI agents,”arXiv preprint arXiv:2505.03574, 2025

work page arXiv 2025
[29]

AGrail: A lifelong agent guardrail with effective and adaptive safety detection,

W. Luo, S. Dai, X. Liu, S. Banerjee, H. Sun, M. Chen, and C. Xiao, “AGrail: A lifelong agent guardrail with effective and adaptive safety detection,”Proceedings of ACL, 2025

2025
[30]

AgentSpec: Customizable runtime enforcement for safe and reliable LLM agents,

H. Wang, C. M. Poskitt, and J. Sun, “AgentSpec: Customizable runtime enforcement for safe and reliable LLM agents,”Proceedings of ICSE, 2026

2026
[31]

Progent: Securing AI Agents with Privilege Control

T. Shi, J. He, Z. Wang, H. Li, L. Wu, W. Guo, and D. Song, “Progent: Securing AI agents with privilege control,”arXiv preprint arXiv:2504.11703, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[32]

Agent governance toolkit: Open-source runtime security for AI agents

Microsoft, “Agent governance toolkit: Open-source runtime security for AI agents.” https://opensource.microsoft.com/blog/2026/04/02/int roducing-the-agent-governance-toolkit-open-source-runtime-securit y-for-ai-agents/, 2026

2026
[33]

Reflect-Guard: Enhancing LLM Safeguards against Adversarial Prompts via Logical Self-Reflection

L. Lin, J. You, Y . Li, L. Lin, Y . Wang, Z. Zhang, and M. Zheng, “Reflect-guard: Enhancing LLM safeguards against adversarial prompts via logical self-reflection,”arXiv preprint arXiv:2605.24834, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[34]

OW ASP agentic skills top 10

OW ASP Foundation, “OW ASP agentic skills top 10.” https://owasp. org/www-project-agentic-skills-top-10/, 2026

2026
[35]

A Security Analysis of the OpenClaw AI Agent Framework

S. Suwansathit, Y . Zhang, and G. Gu, “A security analysis of the OpenClaw AI agent framework,”arXiv preprint arXiv:2603.27517, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[36]

SOK: A Taxonomy of Attack Vectors and Defense Strategies for Agentic Supply Chain Runtime

X. Jiang, S. Yang, W. Yang, Y . Liu, and C. Ji, “SOK: A taxonomy of attack vectors and defense strategies for agentic supply chain runtime,”arXiv preprint arXiv:2602.19555, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[37]

Learning how to use tools, not just when: Pattern-aware tool-integrated reasoning,

N. Xu, Y . Jiang, S. R. Dipta, and H. Zhang, “Learning how to use tools, not just when: Pattern-aware tool-integrated reasoning,”arXiv preprint arXiv:2509.23292, 2026

work page arXiv 2026
[38]

SCRIBE: Structured Mid-Level Supervision for Tool-Using Language Models

Y . Jiang and F. Ferraro, “SCRIBE: Structured mid-level supervision for tool-using language models,”arXiv preprint arXiv:2601.03555, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[39]

From generation to judgment: Opportunities and challenges of LLM-as-a-judge,

D. Li, B. Jiang, L. Huang, A. Beigi, C. Zhao, Z. Tan, A. Bhattacharjee, Y . Jiang, C. Chen, T. Wu,et al., “From generation to judgment: Opportunities and challenges of LLM-as-a-judge,” inProceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pp. 2757–2791, 2025

2025
[40]

Does tone change the answer? evaluating prompt politeness effects on modern LLMs: GPT, Gemini, Llama,

H. Cai, B. Shen, L. Jin, L. Hu, and X. Fan, “Does tone change the answer? evaluating prompt politeness effects on modern LLMs: GPT, Gemini, Llama,”arXiv preprint arXiv:2512.12812, 2025

work page arXiv 2025
[41]

Performance-efficiency trade-offs in human preference prediction: A comparative study of traditional machine learning and large language models,

Y . Zhang, Z. Xiang, and H. Xu, “Performance-efficiency trade-offs in human preference prediction: A comparative study of traditional machine learning and large language models,” inProceedings of the 31st IEEE Symposium on Computers and Communications (ISCC), IEEE, 2026

2026
[42]

The lethal trifecta for AI agents: Private data, untrusted content, and external communication

S. Willison, “The lethal trifecta for AI agents: Private data, untrusted content, and external communication.” https://simonwillison.net/20 25/Jun/16/the-lethal-trifecta/, 2025. Blog post, 16 June 2025

2025
[43]

The measurement of observer agree- ment for categorical data,

J. R. Landis and G. G. Koch, “The measurement of observer agree- ment for categorical data,”Biometrics, vol. 33, no. 1, pp. 159–174, 1977

1977
[44]

Task-specific efficiency analysis: When small language models outperform large language models,

J. Cao, Y . Ma, X. Li, Q. Ren, and X. Chen, “Task-specific efficiency analysis: When small language models outperform large language models,”arXiv preprint arXiv:2603.21389, 2026

work page arXiv 2026
[45]

CoDES: A context-efficient framework for enhancing small language models via domain-specific adaptation and model ensembling,

L. Hu, Y . Xin, B. Shen, H. Cai, and L. Jin, “CoDES: A context-efficient framework for enhancing small language models via domain-specific adaptation and model ensembling,”Preprints, 10.20944/preprints202603.1152.v1, 2026

work page doi:10.20944/preprints202603.1152.v1 2026
[46]

Toward sustainable on-device intelligence: A survey on energy-efficient RAG systems with small language models,

Z. Cheng, L. Lai, Y . Liu, and Y . Sun, “Toward sustainable on-device intelligence: A survey on energy-efficient RAG systems with small language models,”Available at SSRN 6698538, 2026

2026
[47]

DRP: Distilled reasoning pruning with skill-aware step decomposition for efficient large reasoning models,

Y . Jiang, D. Li, and F. Ferraro, “DRP: Distilled reasoning pruning with skill-aware step decomposition for efficient large reasoning models,” inFindings of the Association for Computational Linguistics (ACL), 2026

2026
[48]

Syner- gized data efficiency and compression (SEC) optimization for large language models,

X. Li, Y . Ma, Y . Huang, X. Wang, Y . Lin, and C. Zhang, “Syner- gized data efficiency and compression (SEC) optimization for large language models,” in2024 4th International Conference on Electronic Information Engineering and Computer Science (EIECS), pp. 586– 591, 2024

2024
[49]

Cornerstones or Stumbling Blocks? Deciphering the Rock Tokens in On-Policy Distillation

Y . Jiang, R. Li, S. R. Dipta, D. Li, and Z. Yang, “Cornerstones or stumbling blocks? deciphering the rock tokens in on-policy distilla- tion,”arXiv preprint arXiv:2605.09253, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[50]

ChainCaps: Composition-safe tool-using agents via monotonic capability attenua- tion,

X. Jiang, S. Yang, Z. Li, L. Liu, H. Yu, and Y . Liu, “ChainCaps: Composition-safe tool-using agents via monotonic capability attenua- tion,” inICML 2026 Workshop on Agents in the Wild: Safety, Security, and Beyond (AIWILD), 2026

2026
[51]

Reward Auditor: Inference on Reward Modeling Suitability in Real-World Perturbed Scenarios

J. Zang, Y . Wei, R. Bai, S. Jiang, N. Mo, B. Li, Q. Sun, and H. Liu, “Reward auditor: Inference on reward modeling suitability in real- world perturbed scenarios,”arXiv preprint arXiv:2512.00920, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025
[52]

Alleviating attention hacking in discriminative re- ward modeling through interaction distillation,

J. Zang, “Alleviating attention hacking in discriminative re- ward modeling through interaction distillation,”arXiv preprint arXiv:2508.02618, 2025

work page arXiv 2025
[53]

AI for Auto-Research: Roadmap & User Guide

L. Kong, X. Sun, W. Chow, L. Li, K. Q. Lin, X. B. Zhang, S. Wang, R. Li, Q. Wu, W. Gao, Y . Wang, S. Xie, J. Liu, L. Qu, S. Li, L. X. Ng, B. R. Cottereau, Z. Liu, T.-S. Chua, and W. T. Ooi, “AI for auto- research: Roadmap & user guide,”arXiv preprint arXiv:2605.18661, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[54]

Resolving the Robustness-Precision Trade-off in Financial RAG through Hybrid Document-Routed Retrieval

Z. Cheng, L. Lai, and Y . Liu, “Resolving the robustness-precision trade-off in financial RAG through hybrid document-routed retrieval,” arXiv preprint arXiv:2603.26815, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[55]

Enhancing financial report question-answering: A retrieval-augmented generation system with reranking analysis,

Z. Cheng, L. Lai, Y . Liu, K. Cheng, and X. Qi, “Enhancing financial report question-answering: A retrieval-augmented generation system with reranking analysis,” in6th International Conference on Electri- cal, Computer and Energy Technologies (ICECET), 2026

2026

[1] [1]

Extend Claude with skills

Anthropic, “Extend Claude with skills.” https://code.claude.com/do cs/en/skills, 2026. Claude Code documentation

2026

[2] [2]

Equipping agents for the real world with Agent Skills

Anthropic, “Equipping agents for the real world with Agent Skills.” https://www.anthropic.com/engineering/equipping-agents-for-the-rea l-world-with-agent-skills, 2025. Anthropic Engineering

2025

[3] [3]

"Do Not Mention This to the User": Detecting and Understanding Malicious Agent Skills in the Wild

Y . Liu, Z. Chen, Y . Zhang, G. Deng, Y . Li, J. Ning, Y . Zhang, and L. Y . Zhang, “Malicious agent skills in the wild: A large-scale security empirical study,”arXiv preprint arXiv:2602.06547, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[4] [4]

Agent Skills in the Wild: An Empirical Study of Security Vulnerabilities at Scale

Y . Liu, W. Wang, R. Feng, Y . Zhang, G. Xu, G. Deng, Y . Li, and L. Zhang, “Agent skills in the wild: An empirical study of security vulnerabilities at scale,”arXiv preprint arXiv:2601.10338, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[5] [5]

SkillSieve: A Hierarchical Triage Framework for Detecting Malicious AI Agent Skills

Y . Hou, Z. Yang, Z. Pang, and X. Ma, “SkillSieve: A hierarchical triage framework for detecting malicious AI agent skills,”arXiv preprint arXiv:2604.06550, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[6] [6]

Formal analysis and supply chain security for agentic AI skills.CoRR, abs/2603.00195, 2026

V . P. Bhardwaj, “Formal analysis and supply chain security for agentic AI skills,”arXiv preprint arXiv:2603.00195, 2026

work page arXiv 2026

[7] [7]

Snyk finds prompt injection in 36%, 1467 malicious payloads in a ToxicSkills study of agent skills supply chain compromise,

L. Beurer-Kellner, A. Kudrinskii, M. Milanta, K. B. Nielsen, H. Sarkar, and L. Tal, “Snyk finds prompt injection in 36%, 1467 malicious payloads in a ToxicSkills study of agent skills supply chain compromise,”Snyk Blog, Feb. 5, 2026

2026

[8] [8]

From Component Manipulation to System Compromise: Understanding and Detecting Malicious MCP Servers

Y . Huanget al., “From component manipulation to system compro- mise: Understanding and detecting malicious MCP servers,”arXiv preprint arXiv:2604.01905, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[9] [9]

Safety is non-compositional: A formal framework for capability-based AI systems,

C. Spera, “Safety is non-compositional: A formal framework for capability-based AI systems,”arXiv preprint arXiv:2603.15973, 2026

work page arXiv 2026

[10] [10]

ClawHub: Skill and plugin registry for OpenClaw

OpenClaw, “ClawHub: Skill and plugin registry for OpenClaw.” http s://github.com/openclaw/clawhub, 2026. Public agent-skill registry

2026

[11] [11]

MindGuard: Intrinsic decision inspection for securing LLM agents against metadata poisoning,

Z. Wang, H. Du, G. Shi, J. Zhang, H. Cheng, Y . Yao, K. Guo, and X.- Y . Li, “MindGuard: Intrinsic decision inspection for securing LLM agents against metadata poisoning,”arXiv preprint arXiv:2508.20412, 2025

work page arXiv 2025

[12] [12]

How Your Credentials Are Leaked by LLM Agent Skills: An Empirical Study

Z. Chen, Y . Zhang, Y . Liu, G. Deng, Y . Li, Y . Zhang, J. Ning, L. Y . Zhang, L. Ma, and Z. Li, “Credential leakage in LLM agent skills: A large-scale empirical study,”arXiv preprint arXiv:2604.03070, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[13] [13]

Prompt injection attacks on agentic coding assistants: A systematic analysis of vulnerabilities in skills, tools, and protocol ecosystems,

N. Maloyan and D. Namiot, “Prompt injection attacks on agentic coding assistants: A systematic analysis of vulnerabilities in skills, tools, and protocol ecosystems,”arXiv preprint arXiv:2601.17548, 2026

work page arXiv 2026

[14] [14]

Prompt injection attack to tool selection in LLM agents,

J. Shi, Z. Yuan, G. Tie, P. Zhou, N. Z. Gong, and L. Sun, “Prompt injection attack to tool selection in LLM agents,”Proceedings of NDSS, 2026

2026

[15] [15]

BadSkill: Backdoor Attacks on Agent Skills via Model-in-Skill Poisoning

G. Tie, J. Shi, P. Zhou, and L. Sun, “BadSkill: Backdoor at- tacks on agent skills via model-in-skill poisoning,”arXiv preprint arXiv:2604.09378, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[16] [16]

Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems

Y . Qu, Y . Liu, T. Geng, G. Deng, Y . Li, L. Y . Zhang, Y . Zhang, and L. Ma, “Supply-chain poisoning attacks against LLM coding agent skill ecosystems,”arXiv preprint arXiv:2604.03081, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[17] [17]

SkillJect: Effectively Automating Skill-Based Prompt Injection for Skill-Enabled Agents

X. Jia, J. Liao, S. Qin, J. Gu, W. Ren, X. Cao, Y . Liu, and P. Torr, “SkillJect: Effectively automating skill-based prompt injection for skill-enabled agents,”arXiv preprint arXiv:2602.14211, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[18] [18]

Noninterference and the composability of security properties,

D. McCullough, “Noninterference and the composability of security properties,” inProceedings of the IEEE Symposium on Security and Privacy, 1988

1988

[19] [19]

A lattice model of secure information flow,

D. E. Denning, “A lattice model of secure information flow,”Com- munications of the ACM, vol. 19, no. 5, pp. 236–243, 1976

1976

[20] [20]

The protection of information in computer systems,

J. H. Saltzer and M. D. Schroeder, “The protection of information in computer systems,”Proceedings of the IEEE, vol. 63, no. 9, pp. 1278– 1308, 1975

1975

[21] [21]

COVERT: Com- positional analysis of Android inter-app permission leakage,

H. Bagheri, A. Sadeghi, J. Garcia, and S. Malek, “COVERT: Com- positional analysis of Android inter-app permission leakage,”IEEE Transactions on Software Engineering, vol. 41, no. 9, 2015

2015

[22] [22]

MR- Droid: A scalable and prioritized analysis of inter-app communication risks,

F. Liu, H. Cai, G. Wang, D. Yao, K. O. Elish, and B. G. Ryder, “MR- Droid: A scalable and prioritized analysis of inter-app communication risks,” inProceedings of the IEEE Mobile Security Technologies Workshop (MoST), 2017

2017

[23] [23]

Agent Tools Orchestration Leaks More: Dataset, Benchmark, and Mitigation

Y . Qiao, D. Liu, H. Yang, W. Zhou, and S. Hu, “Agent tools orchestration leaks more: Dataset, benchmark, and mitigation,”arXiv preprint arXiv:2512.16310, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025

[24] [24]

AgentDojo: A dynamic environment to evaluate prompt injection attacks and defenses for LLM agents,

E. Debenedetti, J. Zhang, M. Balunovic, L. Beurer-Kellner, M. Fis- cher, and F. Tram`er, “AgentDojo: A dynamic environment to evaluate prompt injection attacks and defenses for LLM agents,”Advances in Neural Information Processing Systems, 2024

2024

[25] [25]

Agent security bench (ASB): Formalizing and bench- marking attacks and defenses in LLM-based agents,

H. Zhang, J. Huang, K. Mei, Y . Yao, Z. Wang, C. Zhan, H. Wang, and Y . Zhang, “Agent security bench (ASB): Formalizing and bench- marking attacks and defenses in LLM-based agents,” inInternational Conference on Learning Representations (ICLR), 2025

2025

[26] [26]

InjecAgent: Benchmarking indirect prompt injections in tool-integrated large language model agents,

Q. Zhan, Z. Liang, Z. Ying, and D. Kang, “InjecAgent: Benchmarking indirect prompt injections in tool-integrated large language model agents,”Findings of ACL, 2024

2024

[27] [27]

Defeating Prompt Injections by Design

E. Debenedettiet al., “Defeating prompt injections by design (CaMeL),”arXiv preprint arXiv:2503.18813, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025

[28] [28]

In2025 IEEE Conference on Se- cure and Trustworthy Machine Learning (SaTML), pages 23–42

S. Chennabasappa, C. Nikolaidis, D. Song, D. Molnar, S. Ding, S. Wan, S. Whitman, L. Deason, N. Doucette, A. Montilla, A. Gampa, B. de Paola, D. Gabi, J. Crnkovich, J.-C. Testud, K. He, R. Chaturvedi, W. Zhou, and J. Saxe, “LlamaFirewall: An open source guardrail system for building secure AI agents,”arXiv preprint arXiv:2505.03574, 2025

work page arXiv 2025

[29] [29]

AGrail: A lifelong agent guardrail with effective and adaptive safety detection,

W. Luo, S. Dai, X. Liu, S. Banerjee, H. Sun, M. Chen, and C. Xiao, “AGrail: A lifelong agent guardrail with effective and adaptive safety detection,”Proceedings of ACL, 2025

2025

[30] [30]

AgentSpec: Customizable runtime enforcement for safe and reliable LLM agents,

H. Wang, C. M. Poskitt, and J. Sun, “AgentSpec: Customizable runtime enforcement for safe and reliable LLM agents,”Proceedings of ICSE, 2026

2026

[31] [31]

Progent: Securing AI Agents with Privilege Control

T. Shi, J. He, Z. Wang, H. Li, L. Wu, W. Guo, and D. Song, “Progent: Securing AI agents with privilege control,”arXiv preprint arXiv:2504.11703, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[32] [32]

Agent governance toolkit: Open-source runtime security for AI agents

Microsoft, “Agent governance toolkit: Open-source runtime security for AI agents.” https://opensource.microsoft.com/blog/2026/04/02/int roducing-the-agent-governance-toolkit-open-source-runtime-securit y-for-ai-agents/, 2026

2026

[33] [33]

Reflect-Guard: Enhancing LLM Safeguards against Adversarial Prompts via Logical Self-Reflection

L. Lin, J. You, Y . Li, L. Lin, Y . Wang, Z. Zhang, and M. Zheng, “Reflect-guard: Enhancing LLM safeguards against adversarial prompts via logical self-reflection,”arXiv preprint arXiv:2605.24834, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[34] [34]

OW ASP agentic skills top 10

OW ASP Foundation, “OW ASP agentic skills top 10.” https://owasp. org/www-project-agentic-skills-top-10/, 2026

2026

[35] [35]

A Security Analysis of the OpenClaw AI Agent Framework

S. Suwansathit, Y . Zhang, and G. Gu, “A security analysis of the OpenClaw AI agent framework,”arXiv preprint arXiv:2603.27517, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[36] [36]

SOK: A Taxonomy of Attack Vectors and Defense Strategies for Agentic Supply Chain Runtime

X. Jiang, S. Yang, W. Yang, Y . Liu, and C. Ji, “SOK: A taxonomy of attack vectors and defense strategies for agentic supply chain runtime,”arXiv preprint arXiv:2602.19555, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[37] [37]

Learning how to use tools, not just when: Pattern-aware tool-integrated reasoning,

N. Xu, Y . Jiang, S. R. Dipta, and H. Zhang, “Learning how to use tools, not just when: Pattern-aware tool-integrated reasoning,”arXiv preprint arXiv:2509.23292, 2026

work page arXiv 2026

[38] [38]

SCRIBE: Structured Mid-Level Supervision for Tool-Using Language Models

Y . Jiang and F. Ferraro, “SCRIBE: Structured mid-level supervision for tool-using language models,”arXiv preprint arXiv:2601.03555, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[39] [39]

From generation to judgment: Opportunities and challenges of LLM-as-a-judge,

D. Li, B. Jiang, L. Huang, A. Beigi, C. Zhao, Z. Tan, A. Bhattacharjee, Y . Jiang, C. Chen, T. Wu,et al., “From generation to judgment: Opportunities and challenges of LLM-as-a-judge,” inProceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pp. 2757–2791, 2025

2025

[40] [40]

Does tone change the answer? evaluating prompt politeness effects on modern LLMs: GPT, Gemini, Llama,

H. Cai, B. Shen, L. Jin, L. Hu, and X. Fan, “Does tone change the answer? evaluating prompt politeness effects on modern LLMs: GPT, Gemini, Llama,”arXiv preprint arXiv:2512.12812, 2025

work page arXiv 2025

[41] [41]

Performance-efficiency trade-offs in human preference prediction: A comparative study of traditional machine learning and large language models,

Y . Zhang, Z. Xiang, and H. Xu, “Performance-efficiency trade-offs in human preference prediction: A comparative study of traditional machine learning and large language models,” inProceedings of the 31st IEEE Symposium on Computers and Communications (ISCC), IEEE, 2026

2026

[42] [42]

The lethal trifecta for AI agents: Private data, untrusted content, and external communication

S. Willison, “The lethal trifecta for AI agents: Private data, untrusted content, and external communication.” https://simonwillison.net/20 25/Jun/16/the-lethal-trifecta/, 2025. Blog post, 16 June 2025

2025

[43] [43]

The measurement of observer agree- ment for categorical data,

J. R. Landis and G. G. Koch, “The measurement of observer agree- ment for categorical data,”Biometrics, vol. 33, no. 1, pp. 159–174, 1977

1977

[44] [44]

Task-specific efficiency analysis: When small language models outperform large language models,

J. Cao, Y . Ma, X. Li, Q. Ren, and X. Chen, “Task-specific efficiency analysis: When small language models outperform large language models,”arXiv preprint arXiv:2603.21389, 2026

work page arXiv 2026

[45] [45]

CoDES: A context-efficient framework for enhancing small language models via domain-specific adaptation and model ensembling,

L. Hu, Y . Xin, B. Shen, H. Cai, and L. Jin, “CoDES: A context-efficient framework for enhancing small language models via domain-specific adaptation and model ensembling,”Preprints, 10.20944/preprints202603.1152.v1, 2026

work page doi:10.20944/preprints202603.1152.v1 2026

[46] [46]

Toward sustainable on-device intelligence: A survey on energy-efficient RAG systems with small language models,

Z. Cheng, L. Lai, Y . Liu, and Y . Sun, “Toward sustainable on-device intelligence: A survey on energy-efficient RAG systems with small language models,”Available at SSRN 6698538, 2026

2026

[47] [47]

DRP: Distilled reasoning pruning with skill-aware step decomposition for efficient large reasoning models,

Y . Jiang, D. Li, and F. Ferraro, “DRP: Distilled reasoning pruning with skill-aware step decomposition for efficient large reasoning models,” inFindings of the Association for Computational Linguistics (ACL), 2026

2026

[48] [48]

Syner- gized data efficiency and compression (SEC) optimization for large language models,

X. Li, Y . Ma, Y . Huang, X. Wang, Y . Lin, and C. Zhang, “Syner- gized data efficiency and compression (SEC) optimization for large language models,” in2024 4th International Conference on Electronic Information Engineering and Computer Science (EIECS), pp. 586– 591, 2024

2024

[49] [49]

Cornerstones or Stumbling Blocks? Deciphering the Rock Tokens in On-Policy Distillation

Y . Jiang, R. Li, S. R. Dipta, D. Li, and Z. Yang, “Cornerstones or stumbling blocks? deciphering the rock tokens in on-policy distilla- tion,”arXiv preprint arXiv:2605.09253, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[50] [50]

ChainCaps: Composition-safe tool-using agents via monotonic capability attenua- tion,

X. Jiang, S. Yang, Z. Li, L. Liu, H. Yu, and Y . Liu, “ChainCaps: Composition-safe tool-using agents via monotonic capability attenua- tion,” inICML 2026 Workshop on Agents in the Wild: Safety, Security, and Beyond (AIWILD), 2026

2026

[51] [51]

Reward Auditor: Inference on Reward Modeling Suitability in Real-World Perturbed Scenarios

J. Zang, Y . Wei, R. Bai, S. Jiang, N. Mo, B. Li, Q. Sun, and H. Liu, “Reward auditor: Inference on reward modeling suitability in real- world perturbed scenarios,”arXiv preprint arXiv:2512.00920, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025

[52] [52]

Alleviating attention hacking in discriminative re- ward modeling through interaction distillation,

J. Zang, “Alleviating attention hacking in discriminative re- ward modeling through interaction distillation,”arXiv preprint arXiv:2508.02618, 2025

work page arXiv 2025

[53] [53]

AI for Auto-Research: Roadmap & User Guide

L. Kong, X. Sun, W. Chow, L. Li, K. Q. Lin, X. B. Zhang, S. Wang, R. Li, Q. Wu, W. Gao, Y . Wang, S. Xie, J. Liu, L. Qu, S. Li, L. X. Ng, B. R. Cottereau, Z. Liu, T.-S. Chua, and W. T. Ooi, “AI for auto- research: Roadmap & user guide,”arXiv preprint arXiv:2605.18661, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[54] [54]

Resolving the Robustness-Precision Trade-off in Financial RAG through Hybrid Document-Routed Retrieval

Z. Cheng, L. Lai, and Y . Liu, “Resolving the robustness-precision trade-off in financial RAG through hybrid document-routed retrieval,” arXiv preprint arXiv:2603.26815, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[55] [55]

Enhancing financial report question-answering: A retrieval-augmented generation system with reranking analysis,

Z. Cheng, L. Lai, Y . Liu, K. Cheng, and X. Qi, “Enhancing financial report question-answering: A retrieval-augmented generation system with reranking analysis,” in6th International Conference on Electri- cal, Computer and Energy Technologies (ICECET), 2026

2026