Demystifying the Mythos or Disrupting Bugonomics? From Zero-Day Asymmetry to Defender Remediation Throughput

Alfredo Pesoli; Herman Errico; Lorenzo Cavallaro

arxiv: 2605.24632 · v1 · pith:FOFGTXJHnew · submitted 2026-05-23 · 💻 cs.CR · cs.AI · cs.LG

Pith reviewed 2026-06-30 12:54 UTC · model grok-4.3

keywords: LLM vulnerability discovery · bugonomics · zero-day economics · remediation throughput · open source security · vulnerability triage · exploit market prices · maintainer capacity

The pith

LLM-assisted discovery makes low-signal vulnerability candidates cheaper while shifting the bottleneck to defender remediation throughput.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines how large language models change the economics of vulnerability discovery and fixing by lowering the cost of candidate generation, code analysis, and report preparation at scale. It draws on public data from LLM previews and real browser collaborations plus exploit-market prices to argue that the outcome is not simply more high-value zero-days but greater pressure on validation, triage, and patching capacity. A sympathetic reader would care because the argument reframes AI security effects around operational defender scaling rather than offensive capability alone. The claim is most acute for open-source projects whose maintainer resources are fixed.

Core claim

Using public data from Anthropic's Mythos Preview and Mozilla Firefox collaborations along with exploit-market anchors and reward programs, the paper claims the near-term shift from LLM-driven discovery is not an increase in zero-days but a move toward broader defender remediation throughput where low-signal candidates become cheaper, evidence-rich remediation becomes more important, and scarce capacity moves toward maintainer review and release work.

What carries the argument

The bugonomics lens that tracks the operational economics of producing, proving, prioritizing, and fixing defects, applied to the transition from zero-day asymmetry to defender remediation throughput.

If this is right

Low-signal candidates become cheaper to produce at codebase scale.
Evidence-rich remediation work gains relative importance over raw discovery.
Scarce capacity shifts from bug hunting toward maintainer review and release processes.
The pressure is most visible in open source where funding and staffing do not automatically expand with report volume.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Projects may need new automated pre-filters that score reports on evidence quality before human triage begins.
Reward programs could move from flat bounties to tiered payouts that reward proof-of-impact quality over candidate novelty.
A two-tier reporting system could emerge in which automated low-evidence submissions receive minimal response while high-evidence ones compete for limited maintainer time.
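
The first extension above, an automated pre-filter on evidence quality, could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `Report` fields, the point weights, and the routing threshold are hypothetical and do not come from the paper or from any existing triage tool.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """A hypothetical vulnerability report; all field names are illustrative."""
    has_reproducer: bool        # runnable test case or triggering input attached
    has_crash_trace: bool       # sanitizer output or stack trace attached
    has_impact_proof: bool      # proof-of-impact beyond a bare crash
    has_patch_suggestion: bool  # candidate fix attached


# Assumed point weights (they sum to 10); a real project would tune these
# against its own triage history.
WEIGHTS = {
    "has_reproducer": 4,
    "has_impact_proof": 3,
    "has_crash_trace": 2,
    "has_patch_suggestion": 1,
}


def evidence_score(report: Report) -> int:
    """Return 0..10: the sum of weights for the evidence items present."""
    return sum(w for name, w in WEIGHTS.items() if getattr(report, name))


def route(report: Report, threshold: int = 5) -> str:
    """Send high-evidence reports to humans; hold the rest for more evidence."""
    if evidence_score(report) >= threshold:
        return "human_triage"
    return "needs_evidence"
```

A report with a reproducer and a crash trace scores 6 and reaches a human; a report with only a crash trace and a patch suggestion scores 3 and is held. Integer weights keep the threshold comparison exact.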

Load-bearing premise

LLM-assisted discovery will substantially increase report volume while maintainer-side validation, triage, funding, and release capacity will not scale accordingly, especially in open source settings.
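
The premise can be made concrete with a toy backlog model. This is a minimal sketch under stated assumptions, not a model from the paper: report arrivals per period are given, maintainer capacity per period is fixed, and the function name and all numbers are invented for illustration.

```python
def simulate_backlog(arrivals, capacity, initial_backlog=0):
    """Track open reports per period.

    Each period the backlog grows by the arriving reports and shrinks by the
    fixed number of reports maintainers can close; it never drops below zero.
    """
    backlog = initial_backlog
    history = []
    for arriving in arrivals:
        backlog = max(0, backlog + arriving - capacity)
        history.append(backlog)
    return history


# Hypothetical numbers: report volume climbs while capacity stays at 15.
print(simulate_backlog([10, 12, 20, 30, 40], capacity=15))
# prints [0, 0, 5, 20, 45]
```

While arrivals stay below capacity the backlog is zero; once they exceed it, every extra report accumulates. If capacity grew along with arrivals, the backlog would stay flat, which is exactly the case the premise rules out.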

What would settle it

A sustained rise in LLM-generated reports accompanied by stable or declining average time-to-patch and no growth in backlogs across major open-source projects would falsify the claim that remediation throughput is the binding constraint.
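
One way to operationalize this test is to compare trends across three per-period series for a project. The sketch below is an assumption-laden illustration: the function names and the use of a plain least-squares slope are choices made here, not the paper's method, and real data would need noise handling rather than a bare sign check.

```python
from statistics import mean


def slope(values):
    """Least-squares slope of values against their index 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(values))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


def premise_contradicted(reports, days_to_patch, backlog):
    """True when reports trend up while patch time and backlog do not."""
    return (
        slope(reports) > 0
        and slope(days_to_patch) <= 0
        and slope(backlog) <= 0
    )
```

With hypothetical series such as `reports=[10, 20, 30, 40]`, `days_to_patch=[14, 14, 13, 12]`, and `backlog=[5, 5, 5, 5]`, the function returns `True`: volume rose and remediation kept pace. Swap in a growing backlog and it returns `False`, which is consistent with the paper's claim.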

Original abstract

Recent demonstrations of large language models producing candidate and confirmed vulnerabilities in production software have renewed the narrative that AI will reshape offensive and defensive security. Headlines emphasize capability; they rarely interrogate costs and incentives. This paper examines LLM-driven vulnerability discovery through a bugonomics lens: the operational economics of producing, proving, prioritizing, and fixing security-relevant defects. Historically, the most visible high-end bugonomics was offense-priced because production-grade zero-days and exploit chains were expensive specialist outputs for governments, brokers, and offensive vendors. Defender-side bugonomics already existed in vulnerability research, reward programs, and vendor remediation work; LLM-assisted systems change its scale and distribution. They make candidate generation, code comprehension, harness construction, proof-of-impact drafting, and report preparation cheaper at codebase scale. Exploits and proofs of concept remain important, but in defender workflows they primarily prove impact, guide prioritization, and justify remediation. The resulting bottleneck is not only finding more bugs; it is absorbing, validating, triaging, patching, and shipping a larger stream of reports. Using public data from Anthropic's Mythos Preview and Mozilla Firefox collaborations, along with public exploit-market price anchors and vulnerability reward programs, we argue that the near-term shift is not simply more zero-days. It is a move toward broader defender remediation throughput: low-signal candidates become cheaper, evidence-rich remediation becomes more important, and scarce capacity shifts toward maintainer review and release work. The effect is acute in open source, where LLM-assisted discovery can increase report volume while maintainer-side validation, triage, funding, and release capacity may not scale.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This is an interpretive framing of LLM-driven bug reports shifting the bottleneck to defender remediation in open source, but the key claim about non-scaling capacity lacks supporting analysis.

The paper's main point is that LLMs lower the cost of generating vulnerability candidates and reports, so the practical limit moves to how fast maintainers can validate, triage, and ship fixes—especially in open source where resources are thin. It uses public examples from Anthropic's Mythos Preview and Mozilla Firefox work, plus exploit price anchors and bounty programs, to sketch this change in incentives.

What stands out is the attention to the defender workflow side. The discussion of how proof-of-impact and report preparation get cheaper, while actual patching capacity does not automatically grow, is a useful reminder for anyone managing real codebases. The open-source angle is grounded enough to be worth noting.

The soft spot is the load-bearing assumption that maintainer validation and release throughput will not adapt. The text states this as a structural feature without historical comparisons, estimates of triage tool adoption, or any modeling of capacity response. That leaves the predicted shift from zero-day economics to remediation volume as an assertion rather than a demonstrated outcome. The 'bugonomics' framing also adds a label more than a new measurable structure.

This is for readers already working on security economics, open-source maintenance, or AI tooling in defensive workflows. It could prompt practical thinking about report volume but does not deliver new measurements or falsifiable predictions.

I would send it to peer review if the full manuscript adds even modest quantitative checks on scaling rates; otherwise it functions better as a position piece than a research article.

Referee Report

1 major / 1 minor

Summary. The paper claims that LLM-assisted vulnerability discovery shifts the economics of bug finding ('bugonomics') away from offense-dominated zero-day markets toward defender remediation throughput, with the new bottleneck being the absorption, validation, triage, patching, and release of higher report volumes. It invokes public data from Anthropic's Mythos Preview, Mozilla Firefox collaborations, exploit-market prices, and vulnerability reward programs to argue that this effect is especially acute in open source, where discovery costs fall but maintainer capacity does not scale accordingly.

Significance. If the argument holds, the paper supplies a conceptual framework for analyzing how AI tools redistribute costs and incentives between discovery and remediation in security. It draws attention to open-source maintainer constraints as a potential limiting factor and could inform the design of vulnerability programs and triage processes.

major comments (1)

[Abstract] The central claim that 'maintainer-side validation, triage, funding, and release capacity may not scale' with LLM-driven report volume is load-bearing for the predicted shift from zero-day asymmetry to remediation throughput, yet the manuscript presents this non-scaling as a structural feature of open-source settings without quantitative comparison of historical scaling rates, adoption of LLM triage tools, or modeling of capacity elasticity.

minor comments (1)

The 'bugonomics' framework is introduced to organize costs and incentives, but the abstract does not supply explicit definitions, independent external benchmarks, or falsifiable predictions that would allow readers to test the framework separately from the conclusions.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for highlighting the load-bearing nature of the non-scaling claim. Our response clarifies the paper's scope as a conceptual framework supported by cited public data rather than a quantitative model, while acknowledging where additional context could strengthen the presentation.

Referee: [Abstract] The central claim that 'maintainer-side validation, triage, funding, and release capacity may not scale' with LLM-driven report volume is load-bearing for the predicted shift from zero-day asymmetry to remediation throughput, yet the manuscript presents this non-scaling as a structural feature of open-source settings without quantitative comparison of historical scaling rates, adoption of LLM triage tools, or modeling of capacity elasticity.

Authors: The manuscript frames the argument as an economic and incentive analysis rather than an empirical econometric study. The non-scaling premise draws from established characteristics of open-source maintenance (limited maintainer time, volunteer structures, and fixed release cadences) documented in prior OSS literature, combined with the observed drop in candidate-generation costs from the cited Anthropic Mythos Preview and Mozilla data. These sources illustrate increased report volume without corresponding expansion in triage and patching throughput. We do not claim to have modeled elasticity or performed new historical scaling comparisons; the contribution is the identification of the resulting bottleneck shift. We can expand the related-work section to reference existing studies on OSS maintainer capacity constraints, but we maintain that the current evidence base suffices for the conceptual claim.

Revision: partial

Circularity Check

0 steps flagged

No circularity: argument uses external public data without self-referential reduction

The paper introduces a 'bugonomics' framework as an analytical lens but does not define its terms or conclusions in terms of each other by construction. Claims rest on cited public datasets (Anthropic Mythos Preview, Mozilla Firefox collaborations, exploit-market prices, vulnerability reward programs) rather than fitted parameters renamed as predictions or self-citations. No equations, uniqueness theorems, or ansatzes are invoked that reduce the throughput-shift argument to the framework's own inputs. The non-scaling of defender capacity is stated as a structural observation about open-source settings, not derived from the framework itself. This is a self-contained argumentative analysis with independent external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The analysis rests on the domain assumption that LLM systems reduce candidate generation costs at scale and introduces the new conceptual term 'bugonomics' without independent prior evidence or validation.

axioms (1)

Domain assumption: LLM-assisted systems make candidate generation, code comprehension, harness construction, proof-of-impact drafting, and report preparation cheaper at codebase scale.
This premise is stated directly in the abstract as the basis for the claimed change in bugonomics scale and distribution.

invented entities (1)

bugonomics (no independent evidence)
purpose: A lens for the operational economics of producing, proving, prioritizing, and fixing security-relevant defects.
New term coined in the paper to structure the discussion of costs and incentives.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Antaeus: Hunting Repository-Level Logic Vulnerabilities via Context-Grounded LLM Reasoning
cs.CR · 2026-07 · unverdicted · novelty 6.0

Antaeus detects 15 logic vulnerabilities across 28 repositories via a pipeline of function prioritization, repository-level LLM reasoning, and comparative validation, outperforming baselines at similar cost.

Reference graph

Works this paper leans on

26 extracted references · 1 canonical work page · cited by 1 Pith paper

[1] G. Evron, R. T. Lee, R. Mogull, et al., “The AI Vulnerability Storm: Building a Mythos-ready Security Program,” Cloud Security Alliance CISO Community, SANS Institute, [un]prompted, OWASP Gen AI Security Project, Apr. 18, 2026.

[2] DARPA, “AI Cyber Challenge marks pivotal inflection point for cyber defense,” Aug. 8, 2025. [Online]. Available: https://www.darpa.mil/news/2025/aixcc-results

[3] Bynar.io, “The idea behind BynarIO,” 2025. [Online]. Available: https://bynar.io/blog/the-idea-behind-bynario

[4] OpenAI, “Introducing Trusted Access for Cyber,” Feb. 5, 2026. [Online]. Available: https://openai.com/index/trusted-access-for-cyber/

[5] OpenAI, “Trusted access for the next era of cyber defense,” Apr. 14, 2026. [Online]. Available: https://openai.com/index/scaling-trusted-access-for-cyber-defense/

[6] Anthropic Frontier Red Team, “Assessing Claude Mythos Preview’s cybersecurity capabilities,” Apr. 2026. [Online]. Available: https://red.anthropic.com/2026/mythos-preview/

[7] Anthropic, “Partnering with Mozilla to improve Firefox’s security,” Mar. 2026. [Online]. Available: https://www.anthropic.com/news/mozilla-firefox-security

[8] B. Grinstead, C. Holler, and F. Braun, “Behind the Scenes Hardening Firefox with Claude Mythos Preview,” Mozilla Hacks, May 7, 2026. [Online]. Available: https://hacks.mozilla.org/2026/05/behind-the-scenes-hardening-firefox/

[9] Anthropic, “Claude API Pricing,” 2026. [Online]. Available: https://platform.claude.com/docs/en/about-claude/pricing

[10] L. Ablon and A. Bogart, “Zero Days, Thousands of Nights: The Life and Times of Zero-Day Vulnerabilities and Their Exploits,” RAND Corporation, 2017. [Online]. Available: https://www.rand.org/pubs/research_reports/RR1751.html

[11] L. Franceschi-Bicchierai, “Price of zero-day exploits rises as companies harden products against hackers,” TechCrunch, Apr. 6, 2024. [Online]. Available: https://techcrunch.com/2024/04/06/price-of-zero-day-exploits-rises-as-companies-harden-products-against-hackers/

[12] Google Project Zero, “About 0-days In-the-Wild.” [Online]. Available: https://googleprojectzero.github.io/0days-in-the-wild/about.html

[13] Google Project Zero, “Root Cause Analyses,” 0-days In-the-Wild. [Online]. Available: https://googleprojectzero.github.io/0days-in-the-wild/rca.html

[14] Google Vulnerability Rewards Program Team, “VRP 2025 Year in Review,” Google Security Blog, Mar. 31, 2026. [Online]. Available: https://blog.google/security/vrp-2025-year-in-review/

[15] Google Bug Hunters, “Evolving the Android & Chrome VRPs for the AI Era,” Apr. 30, 2026. [Online]. Available: https://bughunters.google.com/blog/evolving-the-android-chrome-vrps-for-the-ai-era

[16–17] Verizon, “2025 Data Breach Investigations Report: Executive Summary,” 2025. [Online]. Available: https://www.verizon.com/business/resources/reports/2025-dbir-executive-summary.pdf

[18] Mandiant, “M-Trends 2025,” 2025. [Online]. Available: https://services.google.com/fh/files/misc/m-trends-2025-en.pdf

[19] Google Threat Intelligence Group, “Look What You Made Us Patch: 2025 Zero-Days in Review,” Mar. 2026. [Online]. Available: https://cloud.google.com/blog/topics/threat-intelligence/2025-zero-day-review

[20] P. Garrity, “VulnCheck State of Exploitation 2026,” VulnCheck, Jan. 21, 2026. [Online]. Available: https://www.vulncheck.com/blog/state-of-exploitation-2026

[21] C. Condon, “Introducing the 2026 VulnCheck Exploit Intelligence Report,” VulnCheck, Feb. 25, 2026. [Online]. Available: https://www.vulncheck.com/blog/2026-vulncheck-exploit-intelligence-report

[22] M. Zalewski, “American Fuzzy Lop,” 2013. [Online]. Available: https://lcamtuf.coredump.cx/afl/

[23] LLVM Project, “libFuzzer: a library for coverage-guided fuzz testing.” [Online]. Available: https://llvm.org/docs/LibFuzzer.html

[24] K. Serebryany, D. Bruening, A. Potapenko, and D. Vyukov, “AddressSanitizer: A Fast Address Sanity Checker,” USENIX ATC, 2012.

[25] C. Cadar, D. Dunbar, and D. Engler, “KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs,” OSDI, 2008.

[26] T. Dullien, “Weird Machines, Exploitability, and Provable Unexploitability,” IEEE Transactions on Emerging Topics in Computing, vol. 8, no. 2, pp. 391–403, 2020, doi: 10.1109/TETC.2017.2785299.
