Free-Riding the Agentic Web: A Systematic Security Analysis of x402 Payments

Cong Wang; Lei Wu; Shengchen Ling; Yajin Zhou; Yihang Huang; Yuan Chen; Yuefeng Du

arxiv: 2605.30998 · v2 · pith:KYU43U4Fnew · submitted 2026-05-29 · 💻 cs.CR · cs.CE

Free-Riding the Agentic Web: A Systematic Security Analysis of x402 Payments

Shengchen Ling , Yihang Huang , Yuefeng Du , Yuan Chen , Yajin Zhou , Lei Wu , Cong Wang This is my paper

Pith reviewed 2026-06-28 22:12 UTC · model grok-4.3

classification 💻 cs.CR cs.CE

keywords x402 protocolsecurity analysisagentic webresource leakagepayment flawspay-per-token pricingblockchain settlementHTTP blockchain bridge

0 comments

The pith

x402 payments harbor four flaw classes enabling up to 100% resource leakage plus a structural pricing limit of √(1+Θ) manipulation gap.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper performs a systematic security review of the x402 protocol, which links synchronous HTTP calls to asynchronous blockchain settlements for payments in agent-driven web services. Organized around five invariants drawn from specifications and vendor expectations, the analysis maps every violation to its responsible layer in the stack. It identifies four concrete flaw classes that permit attackers to substitute resources, race duplicate settlements, overdraft allowances, and deny settlements, with leakage reaching 100% against official SDKs and live deployments. It also proves that any output-only pay-per-token pricing cannot remain fair to honest users while bounding inflation from hidden thinking tokens without introducing a square-root manipulation gap.

Core claim

Through invariant-based analysis the x402 stack contains four flaw classes—cross-resource substitution, duplicate-settlement race, allowance overdraft, and denial of settlement—that produce resource leakage up to 100% in official SDKs and production deployments. For pay-per-token schemes the paper proves a structural limit: no output-only pricing can be both fair to honest users and bounded against inflation of hidden thinking tokens, with the price of fairness being a √(1+Θ) manipulation gap. Proposed per-flaw mitigations together with a defense triple deliver provable guarantees that cut per-call reasoning cost by 47% and invert attacker leverage from 8.7× to 0.9× at 2.8% overhead.

What carries the argument

Five invariants grounded in protocol specifications, literature, and vendor expectations that organize the analysis and map every violation to its responsible layer.

If this is right

Official SDKs and production deployments reach resource-leakage ratios up to 100% under the four identified flaw classes.
A defense triple with provable guarantees reduces per-call reasoning cost by 47% and reverses attacker leverage from 8.7× to 0.9× at 2.8% overhead.
Per-flaw mitigations address cross-resource substitution, duplicate-settlement race, allowance overdraft, and denial of settlement individually.
Pay-per-token pricing carries an unavoidable √(1+Θ) manipulation gap when restricted to output-only schemes.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

State-synchronization mismatches between synchronous web requests and asynchronous blockchain settlement may appear in other payment protocols that combine the two.
The quantitative √(1+Θ) bound offers a concrete metric that could be applied when evaluating pricing designs in additional token-based services.
The disclosed mitigations could be tested for transferability to related systems that bridge HTTP semantics with on-chain finality.

Load-bearing premise

The five invariants used to organize the analysis are correctly and completely grounded in the protocol specifications, literature, and vendor expectations, allowing every violation to be resolved to the responsible layer without unexamined interactions.

What would settle it

Observing zero leakage across all four flaw classes in a production x402 deployment that follows official SDKs, or exhibiting an output-only pricing scheme that achieves both user fairness and bounded inflation without a √(1+Θ) gap.

Figures

Figures reproduced from arXiv: 2605.30998 by Cong Wang, Lei Wu, Shengchen Ling, Yajin Zhou, Yihang Huang, Yuan Chen, Yuefeng Du.

**Figure 2.** Figure 2: The x402 protocol workflow. The process involves [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗

**Figure 3.** Figure 3: Settlement logic in x402 SDK implementation. [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗

**Figure 4.** Figure 4: Logic failures across official x402 SDKs. [PITH_FULL_IMAGE:figures/full_fig_p007_4.png] view at source ↗

**Figure 5.** Figure 5: Facilitator logic: the absence of nonce locking. [PITH_FULL_IMAGE:figures/full_fig_p008_5.png] view at source ↗

read the original abstract

The x402 protocol has crossed from prototype to infrastructure for the agentic web, driving 130 million all-time transactions and embedded in Google Cloud, Cloudflare, and Stripe. Yet bridging synchronous HTTP requests with asynchronous blockchain finality creates state-synchronization challenges, and x402's security has so far been examined only in piecemeal vendor disclosures. It is moreover not one artefact but a stack of an HTTP semantic, per-chain schemes, and a long tail of SDK and deployment choices whose required guarantees prior work has not established. We perform a systematic security analysis organized around five invariants grounded in specifications, literature, and vendor expectations, resolving every violation to the responsible layer. We identify four flaw classes: cross-resource substitution, duplicate-settlement race (independently corroborated by subsequent third-party reports), allowance overdraft, and denial of settlement. Against official SDKs and a production deployment, these reach resource-leakage ratios up to 100%. For pay-per-token scheme we prove a structural limit: no output-only pricing can be both fair to honest users and bounded against inflation of the hidden "thinking" tokens, the price of fairness being a $\sqrt{1+\Theta}$ manipulation gap. We propose per-flaw mitigations and a defense triple with provable guarantees, cutting per-call reasoning cost by 47% and inverting attacker leverage from 8.7$\times$ to 0.9$\times$ at only 2.8% overhead. All findings have been disclosed.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This paper gives the first systematic invariant-based analysis of x402, cataloging four flaw classes with up to 100% leakage in tested setups plus a structural pricing proof, but the invariants' completeness is the main open question.

read the letter

The main takeaway is that x402 already runs at scale with 130 million transactions and big infrastructure players, and this is the first paper to move past vendor notes to a full-stack analysis organized around five invariants. It flags four flaw classes—cross-resource substitution, duplicate-settlement race, allowance overdraft, and denial of settlement—that hit resource-leakage ratios up to 100% against official SDKs and a production deployment. The pay-per-token section adds a structural proof that no output-only pricing can stay both fair to honest users and bounded against hidden thinking-token inflation, with the fairness cost being a sqrt(1+Θ) manipulation gap. They also give per-flaw mitigations and a defense triple that cuts reasoning cost by 47% and reverses attacker leverage from 8.7× to 0.9× at 2.8% overhead.

What the paper does well is the layered resolution of violations and the concrete empirical checks. Grounding the invariants in specifications, literature, and vendor expectations lets them map each issue to a responsible layer, and the numbers on SDKs plus the independent later report on the duplicate race give something testable. The mitigation results are reported with specific deltas, which is useful.

The soft spot is the completeness of those five invariants. If they miss an interaction between HTTP semantics, per-chain settlement races, and SDK state handling, the flaw classes would not be exhaustive and the leakage ratios plus the pricing gap would rest on an incomplete partition of the state space. The abstract does not show the full derivation, so that assumption needs checking in the body. The pricing proof looks more self-contained and less sensitive to the partition, so it may hold up better.

This is for researchers and engineers working on agentic micropayments or HTTP-blockchain bridges. A reader who needs a catalog of concrete x402 issues and defense options would get value. It deserves a serious referee because the protocol is deployed and the claims are specific enough to verify.

Referee Report

1 major / 1 minor

Summary. The manuscript performs a systematic security analysis of the x402 protocol, which bridges HTTP requests with blockchain settlements and has seen 130 million transactions with adoption in Google Cloud, Cloudflare, and Stripe. The analysis is organized around five invariants drawn from protocol specifications, literature, and vendor expectations. It identifies four flaw classes (cross-resource substitution, duplicate-settlement race, allowance overdraft, denial of settlement) that produce resource-leakage ratios up to 100% when tested against official SDKs and a production deployment. For pay-per-token pricing it supplies a structural proof that no output-only scheme can simultaneously be fair to honest users and bounded against hidden-token inflation, with the fairness price being a √(1+Θ) manipulation gap. Mitigations and a defense triple are proposed that reduce per-call reasoning cost by 47% and invert attacker leverage from 8.7× to 0.9× at 2.8% overhead.

Significance. If the invariants prove exhaustive and the empirical and proof results hold, the work is significant for securing high-volume agentic payment infrastructure. Strengths include the empirical leakage measurements on real SDKs and deployments, the structural proof for the pricing limit, and the quantified defense triple with provable guarantees. These elements supply both diagnostic coverage and concrete, low-overhead countermeasures for a protocol already embedded in production systems.

major comments (1)

[§4 (Invariants)] §4 (Invariants): The five invariants are presented as complete and sufficient to resolve every violation to its responsible layer, thereby establishing that the four flaw classes are exhaustive and that the reported 100% leakage ratios plus the √(1+Θ) gap fully characterize the attack surface. No explicit enumeration or formal argument is supplied showing that all interactions among HTTP semantics, per-chain settlement races, and SDK-specific state are covered; an omitted cross-layer interaction would render the partition incomplete and undermine the central claims.

minor comments (1)

[Abstract] Abstract: the parenthetical note that the duplicate-settlement race was 'independently corroborated by subsequent third-party reports' should include the specific citations so readers can locate the corroboration.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the constructive feedback on the completeness argument for our invariants. We address the major comment below and will revise the manuscript to strengthen the presentation of coverage.

read point-by-point responses

Referee: [§4 (Invariants)] §4 (Invariants): The five invariants are presented as complete and sufficient to resolve every violation to its responsible layer, thereby establishing that the four flaw classes are exhaustive and that the reported 100% leakage ratios plus the √(1+Θ) gap fully characterize the attack surface. No explicit enumeration or formal argument is supplied showing that all interactions among HTTP semantics, per-chain settlement races, and SDK-specific state are covered; an omitted cross-layer interaction would render the partition incomplete and undermine the central claims.

Authors: We acknowledge that §4 grounds the invariants in the x402 specification, HTTP and blockchain security literature, and vendor expectations but does not supply an explicit enumeration or formal completeness argument for every possible cross-layer interaction. The analysis instead demonstrates coverage by deriving each invariant from the protocol's core state partitions (HTTP request semantics, asynchronous settlement finality, and SDK-managed allowances/nonces) and validating the resulting flaw classes through concrete attacks on official SDKs and a production deployment. To address the concern directly, the revised manuscript will expand §4 with a table that enumerates the principal interaction classes (HTTP header vs. on-chain nonce races, allowance state vs. duplicate settlement, cross-resource substitution across SDK state machines) and provides a short argument that any unlisted interaction reduces to one of the four identified flaw classes. This addition clarifies the partition without altering the empirical leakage measurements or the structural pricing proof. revision: yes

Circularity Check

0 steps flagged

No circularity; analysis grounded externally

full rationale

The paper's derivation organizes the security analysis around five invariants that are stated to be grounded in external protocol specifications, literature, and vendor expectations rather than derived from the paper's own findings. Flaw classes are validated against official SDKs and a production deployment, and the pay-per-token structural limit is presented as an independent mathematical proof with no reduction to fitted inputs or self-citations. No self-definitional steps, fitted predictions renamed as results, or load-bearing self-citation chains appear in the text. The chain is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no information on free parameters, axioms, or invented entities can be extracted.

pith-pipeline@v0.9.1-grok · 5821 in / 1212 out tokens · 29565 ms · 2026-06-28T22:12:24.422870+00:00 · methodology

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Can Trustless Agents Be Trusted? An Empirical Study of the ERC-8004 Decentralized AI Agent Ecosystem
cs.CR 2026-06 unverdicted novelty 7.0

First empirical study of ERC-8004 finds identity registries mostly inactive and reputation system manipulable with 59-90% of reviewers showing coordinated Sybil behavior, leaving most agents without valid feedback aft...

Reference graph

Works this paper leans on

21 extracted references · 2 canonical work pages · cited by 1 Pith paper · 1 internal anchor

[1]

Launching the x402 foundation with coinbase, and support for x402 transactions, 2025

Will Allen, Cam Whiteside, Rohin Lohe, and Steve James. Launching the x402 foundation with coinbase, and support for x402 transactions, 2025. Online at: https://blog.cloudflare.com/x402/

2025
[2]

Agentharm: A benchmark for measuring harmfulness of LLM agents

Maksym Andriushchenko, Alexandra Souly, Mateusz Dziemian, Derek Duenas, Maxwell Lin, Justin Wang, Dan Hendrycks, Andy Zou, J Zico Kolter, Matt Fredrik- son, Yarin Gal, and Xander Davies. Agentharm: A benchmark for measuring harmfulness of LLM agents. InThe Thirteenth International Conference on Learning Representations, 2025

2025
[3]

What is the model context protocol (mcp)?,

Anthropic. What is the model context protocol (mcp)?,
[4]

io/docs/getting-started/intro

Online at: https://modelcontextprotocol. io/docs/getting-started/intro
[5]

A formal security analysis of the w3c web payment apis: Attacks and verification

Quoc Huy Do, Pedram Hosseyni, Ralf Küsters, Guido Schmitz, Nils Wenzler, and Tim Würtele. A formal security analysis of the w3c web payment apis: Attacks and verification. In2022 IEEE Symposium on Security and Privacy (SP), 2022

2022
[6]

x402_tx_by_month, 2026

Dune. x402_tx_by_month, 2026. Online at: https: //dune.com/queries/6212622

work page arXiv 2026
[11]

Introducing x402: a new standard for internet-native payments,

Dan Kim Erik Reppel, Nemil Dalal. Introducing x402: a new standard for internet-native payments,
[12]

Online at: https://www.coinbase.com/ developer-platform/discover/launches/x402
[13]

L402: Lightning http 402 proto- col, 2025

Lightning Labs. L402: Lightning http 402 proto- col, 2025. Online at: https://docs.lightning. engineering/the-lightning-network/l402

2025
[14]

Toward understanding se- curity issues in the model context protocol ecosystem,

Xiaofan Li and Xing Gao. Toward understanding se- curity issues in the model context protocol ecosystem,
[15]

URL: https://arxiv.org/abs/2510.16558, arXiv:2510.16558

work page internal anchor Pith review Pith/arXiv arXiv
[16]

Messy states of wiring: Vulnerabilities in emerging personal payment systems

Jiadong Lou, Xu Yuan, and Ning Zhang. Messy states of wiring: Vulnerabilities in emerging personal payment systems. In30th USENIX Security Symposium (USENIX Security 21). USENIX Association, 2021

2021
[17]

All your shops are belong to us: Security weaknesses in e-commerce platforms

Rohan Pagey, Mohammad Mannan, and Amr Youssef. All your shops are belong to us: Security weaknesses in e-commerce platforms. InProceedings of the ACM Web Conference 2023, WWW ’23. Association for Comput- ing Machinery, 2023

2023
[18]

x402, 2025

Coinbase Developer Platform. x402, 2025. Online at: https://www.x402.org/

2025
[19]

Powering ai commerce with the new agent payments protocol (ap2), 2025

Rao Surapaneni Stavan Parikh. Powering ai commerce with the new agent payments protocol (ap2), 2025. Online at: https://cloud.google. com/blog/products/ai-machine-learning/ announcing-agents-to-payments-ap2-protocol

2025
[20]

Detecting logic vulnerabilities in e-commerce applications

Fangqi Sun, Liang Xu, and Zhendong Su. Detecting logic vulnerabilities in e-commerce applications. In NDSS, 2014

2014
[21]

Native internet payments, 2025

thirdweb. Native internet payments, 2025. Online at: https://thirdweb.com/x402

2025
[22]

Visa introduces trusted agent protocol: An ecosystem-led framework for ai commerce

Visa Inc. Visa introduces trusted agent protocol: An ecosystem-led framework for ai commerce. https: //investor.visa.com/news/news-details/2025/ Visa-Introduces-Trusted-Agent-Protocol-An-Ecosystem-Led-Framework-for-AI-Commerce/ , 2025

2025
[23]

How to shop for free online – security anal- ysis of cashier-as-a-service based web stores

Rui Wang, Shuo Chen, XiaoFeng Wang, and Shaz Qadeer. How to shop for free online – security anal- ysis of cashier-as-a-service based web stores. In2011 IEEE Symposium on Security and Privacy, 2011

2011
[24]

Integuard: Toward automatic protection of third- party web service integrations

Luyi Xing, Yangyi Chen, XiaoFeng Wang, and Shuo Chen. Integuard: Toward automatic protection of third- party web service integrations. InNetwork & Dis- tributed System Security Symposium (NDSS), 2013

2013
[25]

Show me the money! finding flawed implementations of third-party in-app payment in android apps

Wenbo Yang, Yuanyuan Zhang, Juanru Li, Hui Liu, Qing Wang, Yueheng Zhang, and Dawu Gu. Show me the money! finding flawed implementations of third-party in-app payment in android apps. InNDSS, 2017. 14 Ethical Considerations This research investigates security vulnerabilities in financial protocols and AI infrastructure. To uphold ethical standards and pre...

2017

[1] [1]

Launching the x402 foundation with coinbase, and support for x402 transactions, 2025

Will Allen, Cam Whiteside, Rohin Lohe, and Steve James. Launching the x402 foundation with coinbase, and support for x402 transactions, 2025. Online at: https://blog.cloudflare.com/x402/

2025

[2] [2]

Agentharm: A benchmark for measuring harmfulness of LLM agents

Maksym Andriushchenko, Alexandra Souly, Mateusz Dziemian, Derek Duenas, Maxwell Lin, Justin Wang, Dan Hendrycks, Andy Zou, J Zico Kolter, Matt Fredrik- son, Yarin Gal, and Xander Davies. Agentharm: A benchmark for measuring harmfulness of LLM agents. InThe Thirteenth International Conference on Learning Representations, 2025

2025

[3] [3]

What is the model context protocol (mcp)?,

Anthropic. What is the model context protocol (mcp)?,

[4] [4]

io/docs/getting-started/intro

Online at: https://modelcontextprotocol. io/docs/getting-started/intro

[5] [5]

A formal security analysis of the w3c web payment apis: Attacks and verification

Quoc Huy Do, Pedram Hosseyni, Ralf Küsters, Guido Schmitz, Nils Wenzler, and Tim Würtele. A formal security analysis of the w3c web payment apis: Attacks and verification. In2022 IEEE Symposium on Security and Privacy (SP), 2022

2022

[6] [6]

x402_tx_by_month, 2026

Dune. x402_tx_by_month, 2026. Online at: https: //dune.com/queries/6212622

work page arXiv 2026

[7] [11]

Introducing x402: a new standard for internet-native payments,

Dan Kim Erik Reppel, Nemil Dalal. Introducing x402: a new standard for internet-native payments,

[8] [12]

Online at: https://www.coinbase.com/ developer-platform/discover/launches/x402

[9] [13]

L402: Lightning http 402 proto- col, 2025

Lightning Labs. L402: Lightning http 402 proto- col, 2025. Online at: https://docs.lightning. engineering/the-lightning-network/l402

2025

[10] [14]

Toward understanding se- curity issues in the model context protocol ecosystem,

Xiaofan Li and Xing Gao. Toward understanding se- curity issues in the model context protocol ecosystem,

[11] [15]

URL: https://arxiv.org/abs/2510.16558, arXiv:2510.16558

work page internal anchor Pith review Pith/arXiv arXiv

[12] [16]

Messy states of wiring: Vulnerabilities in emerging personal payment systems

Jiadong Lou, Xu Yuan, and Ning Zhang. Messy states of wiring: Vulnerabilities in emerging personal payment systems. In30th USENIX Security Symposium (USENIX Security 21). USENIX Association, 2021

2021

[13] [17]

All your shops are belong to us: Security weaknesses in e-commerce platforms

Rohan Pagey, Mohammad Mannan, and Amr Youssef. All your shops are belong to us: Security weaknesses in e-commerce platforms. InProceedings of the ACM Web Conference 2023, WWW ’23. Association for Comput- ing Machinery, 2023

2023

[14] [18]

x402, 2025

Coinbase Developer Platform. x402, 2025. Online at: https://www.x402.org/

2025

[15] [19]

Powering ai commerce with the new agent payments protocol (ap2), 2025

Rao Surapaneni Stavan Parikh. Powering ai commerce with the new agent payments protocol (ap2), 2025. Online at: https://cloud.google. com/blog/products/ai-machine-learning/ announcing-agents-to-payments-ap2-protocol

2025

[16] [20]

Detecting logic vulnerabilities in e-commerce applications

Fangqi Sun, Liang Xu, and Zhendong Su. Detecting logic vulnerabilities in e-commerce applications. In NDSS, 2014

2014

[17] [21]

Native internet payments, 2025

thirdweb. Native internet payments, 2025. Online at: https://thirdweb.com/x402

2025

[18] [22]

Visa introduces trusted agent protocol: An ecosystem-led framework for ai commerce

Visa Inc. Visa introduces trusted agent protocol: An ecosystem-led framework for ai commerce. https: //investor.visa.com/news/news-details/2025/ Visa-Introduces-Trusted-Agent-Protocol-An-Ecosystem-Led-Framework-for-AI-Commerce/ , 2025

2025

[19] [23]

How to shop for free online – security anal- ysis of cashier-as-a-service based web stores

Rui Wang, Shuo Chen, XiaoFeng Wang, and Shaz Qadeer. How to shop for free online – security anal- ysis of cashier-as-a-service based web stores. In2011 IEEE Symposium on Security and Privacy, 2011

2011

[20] [24]

Integuard: Toward automatic protection of third- party web service integrations

Luyi Xing, Yangyi Chen, XiaoFeng Wang, and Shuo Chen. Integuard: Toward automatic protection of third- party web service integrations. InNetwork & Dis- tributed System Security Symposium (NDSS), 2013

2013

[21] [25]

Show me the money! finding flawed implementations of third-party in-app payment in android apps

Wenbo Yang, Yuanyuan Zhang, Juanru Li, Hui Liu, Qing Wang, Yueheng Zhang, and Dawu Gu. Show me the money! finding flawed implementations of third-party in-app payment in android apps. InNDSS, 2017. 14 Ethical Considerations This research investigates security vulnerabilities in financial protocols and AI infrastructure. To uphold ethical standards and pre...

2017