arxiv: 2606.28125 · v1 · pith:W6WHITPBnew · submitted 2026-06-26 · 💻 cs.SE · cs.CR

How Humans, Bots, and Agents Communicate About Vulnerabilities in Pull Requests

Pith reviewed 2026-06-29 03:29 UTC · model grok-4.3

classification 💻 cs.SE cs.CR
keywords vulnerability communication · pull requests · bots · coding agents · explicit references · implicit security language · software security · empirical study

The pith

This registered report outlines a planned study comparing how humans, bots, and agents reference vulnerabilities in pull requests using explicit and implicit signals.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper describes a study that will examine vulnerability references in pull request discussions by analyzing both formal identifiers like CVEs and informal security language. It will compare patterns across human accounts, bots, and coding agents in the AIDev-pop dataset, covering titles, descriptions, comments, commits, and timelines. The investigation will check associations with actual code vulnerabilities and with review activity and outcomes. Readers would care because automated accounts are increasingly common in development, and their security communication may follow different rules than human patterns. The work aims to provide empirical data on these practices in contemporary software projects.

Core claim

The authors plan to analyze explicit vulnerability references such as CVEs or GHSAs and implicit security-related signals across pull request titles, descriptions, review comments, commit messages, and timeline discussions, then relate these to introduced or fixed vulnerabilities and to review activity and outcomes, comparing across human, bot, and agent accounts.

What carries the argument

Comparison of explicit and implicit vulnerability signals in the AIDev-pop dataset across multiple pull request components to identify differences by account type.

If this is right

  • Vulnerability references may be associated with whether vulnerabilities are introduced or fixed in the modified code.
  • References may relate to pull request review activity and outcomes.
  • The study will generate data on communication practices involving automated accounts in modern software development.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Prioritizing implicit signals could surface vulnerability discussions that prior studies limited to explicit identifiers would have missed.
  • Findings on account-type differences could inform design choices for future bots and agents regarding security topics.
  • The planned analysis could be extended to additional datasets to test whether patterns hold beyond the current sample.

Load-bearing premise

The AIDev-pop dataset provides adequate coverage of pull requests involving bots and coding agents to enable valid comparisons of communication patterns across account types.

What would settle it

Discovery that the AIDev-pop dataset contains too few pull requests with bot or agent activity would prevent meaningful comparisons across account types.

Figures

Figures reproduced from arXiv: 2606.28125 by Christoph Treude, Mairieli Wessel, Pien Rooijendijk.

Figure 1: Examples of explicit and implicit references & signals in pull requests.
… CWEs, GHSAs, or RUSTSEC entries, and implicit security-related signals, which describe security concerns without referencing a formal identifier. Explicit references provide structured and verifiable links to vulnerability databases and advisory systems and have been widely used to study vulnerability fixes and security-related devel…
Figure 2: Our study's execution plan.
1) Explicit References: Explicit vulnerability references are comparatively easier to identify because they follow standardized textual patterns. We therefore detect explicit references using regular expression matching. We use a keyword-based approach to identify candidate security-related signals, which are subsequently validated through LLM-assisted annotation and manual revi…
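The regular-expression step mentioned in the Figure 2 excerpt could look roughly like the sketch below. The patterns are assumptions based on the publicly known identifier shapes (CVE, GHSA, CWE, RUSTSEC); the paper's actual expressions are not shown on this page, and the name find_explicit_references is hypothetical.

```python
import re

# Illustrative patterns only (assumptions, not the authors' expressions).
EXPLICIT_PATTERNS = {
    "CVE": re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE),
    "GHSA": re.compile(r"\bGHSA(?:-[0-9a-z]{4}){3}\b", re.IGNORECASE),
    "CWE": re.compile(r"\bCWE-\d+\b", re.IGNORECASE),
    "RUSTSEC": re.compile(r"\bRUSTSEC-\d{4}-\d{4}\b", re.IGNORECASE),
}


def find_explicit_references(text):
    """Return sorted, deduplicated (kind, identifier) pairs found in text.

    Identifiers are upper-cased so that 'cve-2021-23337' and
    'CVE-2021-23337' count as the same reference.
    """
    found = set()
    for kind, pattern in EXPLICIT_PATTERNS.items():
        for match in pattern.finditer(text):
            found.add((kind, match.group(0).upper()))
    return sorted(found)


if __name__ == "__main__":
    title = "Bump lodash to fix CVE-2021-23337 (GHSA-35jh-r3h4-6jhm)"
    for kind, identifier in find_explicit_references(title):
        print(kind, identifier)
```

The same function can be applied to each pull request component separately (title, description, review comment, commit message), so that a hit can be attributed to where it appeared.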
Original abstract

Developers may reference vulnerabilities in pull request discussions through both explicit identifiers, such as CVEs or GHSAs, and implicit security-related language (e.g., "unauthorized access" or "SQL injection"). Prior work has primarily focused on explicit identifiers, potentially overlooking vulnerability discussions that lack formal references. Bots and coding agents are becoming more common in pull requests, raising new questions about how different accounts communicate about vulnerabilities. In this registered report, we describe our planned study of vulnerability communication in pull requests by humans, bots, and coding agents. Building on the AIDev-pop dataset, we analyze explicit vulnerability references and implicit security-related signals across pull request titles, descriptions, review comments, commit messages, and timeline discussions. We further investigate whether these references are associated with vulnerabilities introduced or fixed in the modified code and how they relate to pull request review activity and outcomes. This study contributes a large-scale empirical investigation of vulnerability communication practices in modern software development.
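To make the notion of implicit security-related language concrete, here is a minimal keyword-matching sketch. Only "unauthorized access" and "SQL injection" come from the abstract; the third phrase, the list name SECURITY_KEYWORDS, and the function name are hypothetical. Hits from a matcher like this are only candidates and would still need validation.

```python
import re

# Seed phrases: the first two appear in the abstract, the third is a
# hypothetical addition for illustration.
SECURITY_KEYWORDS = ["unauthorized access", "sql injection", "privilege escalation"]


def _compile_keywords(keywords):
    """Build one case-insensitive pattern that tolerates extra whitespace."""
    parts = [r"\s+".join(re.escape(word) for word in phrase.split())
             for phrase in keywords]
    return re.compile(r"\b(?:" + "|".join(parts) + r")\b", re.IGNORECASE)


KEYWORD_RE = _compile_keywords(SECURITY_KEYWORDS)


def find_candidate_signals(text):
    """Return normalized keyword hits in order of appearance."""
    return [" ".join(m.group(0).lower().split())
            for m in KEYWORD_RE.finditer(text)]


if __name__ == "__main__":
    comment = "This fixes an SQL injection that allowed unauthorized access."
    print(find_candidate_signals(comment))
```

A plain keyword list trades precision for recall: it will flag text that merely mentions a phrase, so a matcher of this kind can only nominate text for a later validation step.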

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript is a registered report outlining a planned large-scale empirical study of vulnerability communication in pull requests. It examines explicit references (CVEs, GHSAs) and implicit security-related language across PR titles, descriptions, review comments, commit messages, and timelines, comparing patterns among humans, bots, and coding agents using the AIDev-pop dataset. The study further plans to link these references to introduced or fixed vulnerabilities and to review activity/outcomes.

Significance. If the planned analyses can be executed, the work would address a gap in prior research by incorporating implicit signals and automated account types, contributing empirical evidence on vulnerability discussions in modern development workflows. The registered-report format is a clear strength, as it commits to the analysis plan in advance and supports reproducibility. The contribution hinges on whether AIDev-pop supplies adequate coverage for the cross-account comparisons.

major comments (1)
  1. [Abstract / Planned Methods] Abstract and planned-methods description: The central claim of performing valid comparisons of vulnerability communication across humans, bots, and coding agents rests on the untested premise that AIDev-pop contains sufficient bot- and agent-authored PRs with explicit or implicit vulnerability signals in titles, descriptions, comments, commits, and timelines. No preliminary counts, sampling strategy, or power analysis for these subgroups are supplied, so the cross-account analysis cannot be guaranteed to be feasible as described.
minor comments (2)
  1. [Abstract] The distinction between 'bots' and 'coding agents' is introduced but not operationally defined; a clear classification rule or reference to how AIDev-pop labels these account types would improve reproducibility.
  2. [Abstract] The abstract states the study will 'investigate whether these references are associated with vulnerabilities introduced or fixed,' but does not specify the code-analysis method (e.g., static analysis tool or diff-based detection) that will be used to identify such vulnerabilities.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive feedback on our registered report. We address the single major comment below and commit to revisions that strengthen the description of dataset feasibility.

  1. Referee: [Abstract / Planned Methods] Abstract and planned-methods description: The central claim of performing valid comparisons of vulnerability communication across humans, bots, and coding agents rests on the untested premise that AIDev-pop contains sufficient bot- and agent-authored PRs with explicit or implicit vulnerability signals in titles, descriptions, comments, commits, and timelines. No preliminary counts, sampling strategy, or power analysis for these subgroups are supplied, so the cross-account analysis cannot be guaranteed to be feasible as described.

    Authors: We agree that the registered report would benefit from explicit discussion of dataset coverage to support the planned comparisons. In the revised manuscript we will add a dedicated subsection on the AIDev-pop dataset that reports all publicly documented statistics on the distribution of human-, bot-, and agent-authored PRs. We will also describe our sampling strategy (first filtering the full dataset for PRs containing explicit CVE/GHSA references or implicit security keywords, then stratifying by account type) and commit to performing and transparently reporting a post-hoc power analysis once the filtered sample sizes are known. These additions address the concern while preserving the pre-registered analysis plan. revision: yes
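The filter-then-stratify sampling described in the response can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the record fields account_type and text, the function name stratified_sample, and the per-stratum cap are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(prs, has_signal, per_stratum, seed=0):
    """Filter PRs with has_signal, then sample within each account type.

    prs: iterable of dicts with a hypothetical 'account_type' key.
    has_signal: predicate deciding whether a PR carries a vulnerability signal.
    Returns (sample, counts): the sampled PRs per account type and the
    filtered stratum sizes before sampling.
    """
    strata = defaultdict(list)
    for pr in prs:
        if has_signal(pr):
            strata[pr["account_type"]].append(pr)

    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    sample = {}
    for account_type, items in sorted(strata.items()):
        k = min(per_stratum, len(items))
        sample[account_type] = rng.sample(items, k)

    counts = {account_type: len(items) for account_type, items in strata.items()}
    return sample, counts


if __name__ == "__main__":
    prs = [
        {"id": 1, "account_type": "human", "text": "fix CVE-2024-0001"},
        {"id": 2, "account_type": "bot", "text": "bump dep, CVE-2023-1234"},
        {"id": 3, "account_type": "agent", "text": "refactor tests"},
    ]
    sample, counts = stratified_sample(prs, lambda pr: "CVE-" in pr["text"], 1)
    print(counts)
```

The returned counts are the filtered stratum sizes, which is the figure the referee asks for: a very small or missing stratum signals that a planned comparison may not be feasible.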

Circularity Check

0 steps flagged

No circularity in registered report plan


This is a registered report outlining a planned empirical study on vulnerability communication in PRs using the external AIDev-pop dataset. No equations, derivations, fitted parameters, predictions, or self-citations appear in the provided text. The document describes future analysis steps without any load-bearing claims that reduce to inputs by construction, self-definition, or author-overlapping citations. The central contribution is a descriptive plan, which is self-contained against external benchmarks and contains no derivation chain to inspect.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The paper introduces no free parameters, mathematical axioms, or invented entities as it is a study design document without modeling or theoretical derivations.

pith-pipeline@v0.9.1-grok · 5698 in / 1123 out tokens · 55084 ms · 2026-06-29T03:29:31.906533+00:00 · methodology

