arxiv: 2605.30998 · v2 · pith:KYU43U4Fnew · submitted 2026-05-29 · 💻 cs.CR · cs.CE

Free-Riding the Agentic Web: A Systematic Security Analysis of x402 Payments

Pith reviewed 2026-06-28 22:12 UTC · model grok-4.3

classification 💻 cs.CR cs.CE
keywords x402 protocol · security analysis · agentic web · resource leakage · payment flaws · pay-per-token pricing · blockchain settlement · HTTP blockchain bridge

The pith

x402 payments harbor four flaw classes that enable up to 100% resource leakage, and output-only pay-per-token pricing carries a structural √(1+Θ) manipulation gap.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper performs a systematic security review of the x402 protocol, which links synchronous HTTP calls to asynchronous blockchain settlements for payments in agent-driven web services. Organized around five invariants drawn from specifications, literature, and vendor expectations, the analysis maps every violation to its responsible layer in the stack. It identifies four concrete flaw classes that let attackers substitute resources, race duplicate settlements, overdraft allowances, and deny settlements, with leakage reaching 100% against official SDKs and a production deployment. It also proves that no output-only pay-per-token pricing can be both fair to honest users and bounded against inflation of hidden thinking tokens; the price of fairness is a √(1+Θ) manipulation gap.
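To make the first flaw class concrete, here is a minimal sketch of one plausible reading of cross-resource substitution: a payment authorized for a cheap resource is presented when requesting an expensive one. This is not the x402 wire format and not the paper's mitigation. The HMAC key, the `Payment` fields, the `PRICES` table, and the function names are all hypothetical stand-ins chosen for illustration, with HMAC substituting for a real on-chain signature.

```python
import hashlib
import hmac
from dataclasses import dataclass

# Hypothetical shared key; a real deployment would verify a wallet signature instead.
SECRET = b"demo-key"

# Hypothetical price table, in arbitrary units.
PRICES = {"/cheap": 1, "/premium": 100}


@dataclass(frozen=True)
class Payment:
    resource: str  # the resource the payer authorized payment for
    amount: int
    nonce: str
    tag: str       # authentication tag over (resource, amount, nonce)


def sign(resource: str, amount: int, nonce: str) -> str:
    """Authenticate the resource together with the amount and nonce."""
    msg = f"{resource}|{amount}|{nonce}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def make_payment(resource: str, amount: int, nonce: str) -> Payment:
    return Payment(resource, amount, nonce, sign(resource, amount, nonce))


def accepts(requested_resource: str, payment: Payment) -> bool:
    """Accept only if the payment is authentic, names this resource, and covers its price."""
    expected = sign(payment.resource, payment.amount, payment.nonce)
    if not hmac.compare_digest(expected, payment.tag):
        return False
    if payment.resource != requested_resource:
        return False  # substitution attempt: paid for a different resource
    return payment.amount >= PRICES[requested_resource]
```

The design point is that the resource identifier sits inside the authenticated message and is compared against the request, so a valid payment for `/cheap` cannot be replayed against `/premium`.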

Core claim

Through invariant-based analysis, the paper finds that the x402 stack contains four flaw classes (cross-resource substitution, duplicate-settlement race, allowance overdraft, and denial of settlement) that produce resource leakage up to 100% in official SDKs and a production deployment. For pay-per-token schemes the paper proves a structural limit: no output-only pricing can be both fair to honest users and bounded against inflation of hidden thinking tokens, with the price of fairness being a √(1+Θ) manipulation gap. Proposed per-flaw mitigations together with a defense triple deliver provable guarantees that cut per-call reasoning cost by 47% and invert attacker leverage from 8.7× to 0.9× at 2.8% overhead.
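The duplicate-settlement race can be illustrated with a small sketch, assuming the race arises when several concurrent requests carry the same payment nonce and each is served before settlement completes. The `NonceRegistry` class and `serve_concurrently` helper are hypothetical and in-memory only; they are not taken from any x402 SDK and are not the paper's proposed fix. The sketch shows a generic atomic check-and-claim, so that exactly one request wins.

```python
import threading


class NonceRegistry:
    """In-memory registry that lets each payment nonce be claimed once."""

    def __init__(self) -> None:
        self._claimed: set[str] = set()
        self._lock = threading.Lock()

    def claim(self, nonce: str) -> bool:
        # The membership check and the insert happen under one lock,
        # so two concurrent callers cannot both observe "unclaimed".
        with self._lock:
            if nonce in self._claimed:
                return False
            self._claimed.add(nonce)
            return True


def serve_concurrently(registry: NonceRegistry, nonce: str, workers: int = 8) -> list[bool]:
    """Simulate `workers` simultaneous requests that reuse the same nonce."""
    results: list[bool] = []
    results_lock = threading.Lock()

    def handler() -> None:
        served = registry.claim(nonce)
        with results_lock:
            results.append(served)

    threads = [threading.Thread(target=handler) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With the lock in place only one of the simulated requests is served. A multi-process or multi-host deployment would need the same atomic claim in shared storage rather than in process memory.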

What carries the argument

Five invariants grounded in protocol specifications, literature, and vendor expectations that organize the analysis and map every violation to its responsible layer.

If this is right

  • Official SDKs and a production deployment reach resource-leakage ratios up to 100% under the four identified flaw classes.
  • A defense triple with provable guarantees reduces per-call reasoning cost by 47% and reverses attacker leverage from 8.7× to 0.9× at 2.8% overhead.
  • Per-flaw mitigations address cross-resource substitution, duplicate-settlement race, allowance overdraft, and denial of settlement individually.
  • Pay-per-token pricing carries an unavoidable √(1+Θ) manipulation gap when restricted to output-only schemes.
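To give a feel for the size of the √(1+Θ) gap, the sketch below evaluates it for a few values. This page does not define Θ, so the code treats it only as a nonnegative parameter; the function name `manipulation_gap` and the sample values are assumptions made for illustration.

```python
import math


def manipulation_gap(theta: float) -> float:
    """Evaluate sqrt(1 + theta); theta is assumed nonnegative (not defined on this page)."""
    if theta < 0:
        raise ValueError("theta is assumed to be nonnegative")
    return math.sqrt(1.0 + theta)


# Sample values chosen so the results are easy to check by hand.
for theta in (0, 1, 3, 8):
    print(f"theta={theta}: gap={manipulation_gap(theta):.3f}")
```

The gap equals 1 when Θ is 0 and grows with the square root of Θ, so under this reading a ninefold increase in 1+Θ only triples the gap.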

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • State-synchronization mismatches between synchronous web requests and asynchronous blockchain settlement may appear in other payment protocols that combine the two.
  • The quantitative √(1+Θ) bound offers a concrete metric that could be applied when evaluating pricing designs in additional token-based services.
  • The disclosed mitigations could be tested for transferability to related systems that bridge HTTP semantics with on-chain finality.

Load-bearing premise

The five invariants used to organize the analysis are correctly and completely grounded in the protocol specifications, literature, and vendor expectations, allowing every violation to be resolved to the responsible layer without unexamined interactions.

What would settle it

Observing zero leakage across all four flaw classes in a production x402 deployment that follows official SDKs, or exhibiting an output-only pricing scheme that achieves both user fairness and bounded inflation without a √(1+Θ) gap.
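A check of this kind needs a leakage metric. The page does not define the paper's resource-leakage ratio, so the sketch below assumes one simple definition: the value served but not settled, divided by the total value served. The `Call` record and `leakage_ratio` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    price: int    # value of the resource served, in arbitrary units
    settled: int  # value actually settled for this call


def leakage_ratio(calls: list[Call]) -> float:
    """Assumed definition: unpaid served value divided by total served value."""
    served = sum(c.price for c in calls)
    if served == 0:
        return 0.0
    unpaid = sum(max(c.price - c.settled, 0) for c in calls)
    return unpaid / served
```

Under this assumed definition, a log in which every call settles in full gives 0.0, and a log in which nothing settles gives 1.0, matching the 100% upper end reported in the summary.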

Figures

Figures reproduced from arXiv: 2605.30998 by Cong Wang, Lei Wu, Shengchen Ling, Yajin Zhou, Yihang Huang, Yuan Chen, Yuefeng Du.

Figure 1: The exponential adoption of the x402 protocol (May ...
Figure 2: The x402 protocol workflow. The process involves ...
Figure 3: Settlement logic in x402 SDK implementation.
Figure 4: Logic failures across official x402 SDKs.
Figure 5: Facilitator logic: the absence of nonce locking.

Original abstract

The x402 protocol has crossed from prototype to infrastructure for the agentic web, driving 130 million all-time transactions and embedded in Google Cloud, Cloudflare, and Stripe. Yet bridging synchronous HTTP requests with asynchronous blockchain finality creates state-synchronization challenges, and x402's security has so far been examined only in piecemeal vendor disclosures. It is moreover not one artefact but a stack of an HTTP semantic, per-chain schemes, and a long tail of SDK and deployment choices whose required guarantees prior work has not established. We perform a systematic security analysis organized around five invariants grounded in specifications, literature, and vendor expectations, resolving every violation to the responsible layer. We identify four flaw classes: cross-resource substitution, duplicate-settlement race (independently corroborated by subsequent third-party reports), allowance overdraft, and denial of settlement. Against official SDKs and a production deployment, these reach resource-leakage ratios up to 100%. For pay-per-token scheme we prove a structural limit: no output-only pricing can be both fair to honest users and bounded against inflation of the hidden "thinking" tokens, the price of fairness being a $\sqrt{1+\Theta}$ manipulation gap. We propose per-flaw mitigations and a defense triple with provable guarantees, cutting per-call reasoning cost by 47% and inverting attacker leverage from 8.7$\times$ to 0.9$\times$ at only 2.8% overhead. All findings have been disclosed.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript performs a systematic security analysis of the x402 protocol, which bridges HTTP requests with blockchain settlements and has seen 130 million transactions with adoption in Google Cloud, Cloudflare, and Stripe. The analysis is organized around five invariants drawn from protocol specifications, literature, and vendor expectations. It identifies four flaw classes (cross-resource substitution, duplicate-settlement race, allowance overdraft, denial of settlement) that produce resource-leakage ratios up to 100% when tested against official SDKs and a production deployment. For pay-per-token pricing it supplies a structural proof that no output-only scheme can simultaneously be fair to honest users and bounded against hidden-token inflation, with the fairness price being a √(1+Θ) manipulation gap. Mitigations and a defense triple are proposed that reduce per-call reasoning cost by 47% and invert attacker leverage from 8.7× to 0.9× at 2.8% overhead.

Significance. If the invariants prove exhaustive and the empirical and proof results hold, the work is significant for securing high-volume agentic payment infrastructure. Strengths include the empirical leakage measurements on real SDKs and deployments, the structural proof for the pricing limit, and the quantified defense triple with provable guarantees. These elements supply both diagnostic coverage and concrete, low-overhead countermeasures for a protocol already embedded in production systems.

major comments (1)
  1. §4 (Invariants): The five invariants are presented as complete and sufficient to resolve every violation to its responsible layer, thereby establishing that the four flaw classes are exhaustive and that the reported 100% leakage ratios plus the √(1+Θ) gap fully characterize the attack surface. No explicit enumeration or formal argument is supplied showing that all interactions among HTTP semantics, per-chain settlement races, and SDK-specific state are covered; an omitted cross-layer interaction would render the partition incomplete and undermine the central claims.
minor comments (1)
  1. Abstract: The parenthetical note that the duplicate-settlement race was 'independently corroborated by subsequent third-party reports' should include the specific citations so readers can locate the corroboration.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on the completeness argument for our invariants. We address the major comment below and will revise the manuscript to strengthen the presentation of coverage.

  1. Referee: §4 (Invariants): The five invariants are presented as complete and sufficient to resolve every violation to its responsible layer, thereby establishing that the four flaw classes are exhaustive and that the reported 100% leakage ratios plus the √(1+Θ) gap fully characterize the attack surface. No explicit enumeration or formal argument is supplied showing that all interactions among HTTP semantics, per-chain settlement races, and SDK-specific state are covered; an omitted cross-layer interaction would render the partition incomplete and undermine the central claims.

    Authors: We acknowledge that §4 grounds the invariants in the x402 specification, HTTP and blockchain security literature, and vendor expectations but does not supply an explicit enumeration or formal completeness argument for every possible cross-layer interaction. The analysis instead demonstrates coverage by deriving each invariant from the protocol's core state partitions (HTTP request semantics, asynchronous settlement finality, and SDK-managed allowances/nonces) and validating the resulting flaw classes through concrete attacks on official SDKs and a production deployment. To address the concern directly, the revised manuscript will expand §4 with a table that enumerates the principal interaction classes (HTTP header vs. on-chain nonce races, allowance state vs. duplicate settlement, cross-resource substitution across SDK state machines) and provides a short argument that any unlisted interaction reduces to one of the four identified flaw classes. This addition clarifies the partition without altering the empirical leakage measurements or the structural pricing proof.

    revision: yes

Circularity Check

0 steps flagged

No circularity; analysis grounded externally

The paper's derivation organizes the security analysis around five invariants that are stated to be grounded in external protocol specifications, literature, and vendor expectations rather than derived from the paper's own findings. Flaw classes are validated against official SDKs and a production deployment, and the pay-per-token structural limit is presented as an independent mathematical proof with no reduction to fitted inputs or self-citations. No self-definitional steps, fitted predictions renamed as results, or load-bearing self-citation chains appear in the text. The chain is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no information on free parameters, axioms, or invented entities can be extracted.

pith-pipeline@v0.9.1-grok · 5821 in / 1212 out tokens · 29565 ms · 2026-06-28T22:12:24.422870+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Can Trustless Agents Be Trusted? An Empirical Study of the ERC-8004 Decentralized AI Agent Ecosystem

    cs.CR 2026-06 unverdicted novelty 7.0

    First empirical study of ERC-8004 finds identity registries mostly inactive and reputation system manipulable with 59-90% of reviewers showing coordinated Sybil behavior, leaving most agents without valid feedback aft...
