pith. machine review for the scientific record.

arxiv: 2605.11640 · v1 · submitted 2026-05-12 · 💱 q-fin.TR · cs.CY · q-fin.CP

Recognition: no theorem link

Fill-Side Non-Retail Trading on Polymarket: An Empirical Study of Behavioral Tiers and Microstructure Signatures Under Quote-Attribution Constraints

Maksym Nechepurenko

Pith reviewed 2026-05-13 01:19 UTC · model grok-4.3

classification 💱 q-fin.TR · cs.CY · q-fin.CP
keywords prediction markets · Polymarket · non-retail trading · microstructure · behavioral tiers · clustering · fill-side data · quote attribution

The pith

Fill-side behavior on Polymarket is uni-modal under the available features, yet clustering-independent tier stratification still separates the dominant non-retail participants.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper investigates the supply-side microstructure of Polymarket using on-chain fill events, because quote placement and cancellation data are off-chain and absent from public archives. Density-based clustering (DBSCAN, fifteen sensitivity configurations) on a six-feature fill-side vector yields a single dense cluster with no noise points, so address behavior does not separate into the anticipated four-to-five archetypes. Stratifying addresses by feature tiers nevertheless isolates three non-retail tiers (whale, high-frequency-operator, and power-trader) that hold 81.4% of total notional across 12.6% of addresses. Because quote-lifecycle data is unavailable, the paper withdraws its address-level market-making and liquidity-provision claims. A derived dataset is deposited for further examination.

Core claim

The central finding is that fill-side behavior on Polymarket is uni-modal under the available six-feature vector: DBSCAN consistently identifies one dense cluster, contradicting the pre-registered hypothesis of four-to-five separable archetypes. Independently of clustering, feature-tier stratification separates retail from non-retail addresses, with the three non-retail tiers holding 81.4 percent of total notional across 12.6 percent of addresses.

What carries the argument

The six-feature fill-side vector extracted from OrderFilled events, used for both DBSCAN clustering to test multi-modality and independent tier stratification to identify non-retail participants.

If this is right

  • Address-level claims regarding market-making and liquidity provision cannot be supported, because quote-lifecycle data is absent from public archives.
  • Prediction market microstructure analysis must rely on fill-side metrics or external data sources when quote information is off-chain.
  • Non-retail participants can be identified through volume-based tiers without requiring clustering.
  • Concentrated notional volume in a small percentage of addresses suggests market dominance by sophisticated traders.
  • Future empirical work on similar platforms should account for structural data limitations like the quote-lifecycle failure.
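
As a minimal sketch of how tier-based identification can work without clustering, the snippet below assigns addresses to tiers by fixed cutoffs and reports the non-retail share of addresses and of notional. The cutoff values, the `AddressStats` record, the tier ordering, and the sample addresses are all hypothetical; the paper's actual thresholds are not stated in this review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddressStats:
    address: str
    fills: int        # number of fills in the window
    notional: float   # total fill notional in the window


# HYPOTHETICAL cutoffs, chosen only for illustration.
WHALE_NOTIONAL = 1_000_000.0
HF_FILLS = 1_000
POWER_NOTIONAL = 50_000.0


def tier(a: AddressStats) -> str:
    """Assign one tier per address; the first matching rule wins."""
    if a.notional >= WHALE_NOTIONAL:
        return "whale"
    if a.fills >= HF_FILLS:
        return "high-frequency-operator"
    if a.notional >= POWER_NOTIONAL:
        return "power-trader"
    return "retail"


def non_retail_shares(addresses):
    """Return (address share, notional share) of the non-retail tiers."""
    non_retail = [a for a in addresses if tier(a) != "retail"]
    addr_share = len(non_retail) / len(addresses)
    notional_share = (
        sum(a.notional for a in non_retail) / sum(a.notional for a in addresses)
    )
    return addr_share, notional_share


# Synthetic sample: three non-retail addresses and seven small retail ones.
sample = [
    AddressStats("0xa", 20, 2_000_000.0),
    AddressStats("0xb", 5_000, 30_000.0),
    AddressStats("0xc", 40, 70_000.0),
] + [AddressStats(f"0xr{i}", 6, 1_000.0) for i in range(7)]

addr_share, notional_share = non_retail_shares(sample)
print(f"non-retail: {addr_share:.1%} of addresses, {notional_share:.1%} of notional")
```

Because the rules are applied in a fixed order, an address that qualifies for several tiers is counted once, which keeps the address and notional shares additive across tiers.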

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar constraints likely apply to other prediction market platforms using off-chain order books, limiting detailed behavioral studies.
  • Combining fill data with market-level book diagnostics could enable better detection of manipulation strategies.
  • Testing the uni-modal result across longer time periods or different market conditions would strengthen or challenge the finding.
  • The tier method could be applied to other trading venues to compare non-retail participation patterns.

Load-bearing premise

The six-feature fill-side vector is adequate to detect or rule out distinct behavioral archetypes and to enable meaningful tier separation despite the loss of quote information.

What would settle it

Finding multiple clusters or noise points in the DBSCAN analysis of fill-side features from an independent Polymarket data window would falsify the uni-modal conclusion.
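
The falsification check above can be sketched as a sensitivity grid that counts clusters and noise points per configuration. Everything specific here is an assumption: the eps and min_samples values are illustrative rather than the paper's fifteen configurations, the data is a synthetic six-dimensional lattice, and the DBSCAN is a small pure-Python version written for this example, not the author's pipeline.

```python
from itertools import product
from math import dist

NOISE = -1


def dbscan(points, eps, min_samples):
    """Minimal O(n^2) DBSCAN; returns one label per point (-1 means noise)."""
    n = len(points)
    # Each neighborhood includes the point itself.
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # core point: keep expanding
    return labels


def summarize(labels):
    """Return (number of clusters, number of noise points)."""
    return len(set(labels) - {NOISE}), labels.count(NOISE)


# Synthetic data: one compact 6-D lattice, and a copy shifted far away.
unit = [tuple(0.1 * c for c in corner) for corner in product(range(2), repeat=6)]
shifted = [(x0 + 10.0, *rest) for x0, *rest in unit]

# HYPOTHETICAL 3 x 5 grid of (eps, min_samples) settings.
grid = list(product((0.15, 0.25, 0.35), (3, 5, 10, 15, 20)))
one_blob = [summarize(dbscan(unit, eps, m)) for eps, m in grid]
two_blobs = [summarize(dbscan(unit + shifted, eps, m)) for eps, m in grid]

print("uni-modal in every configuration:", all(r == (1, 0) for r in one_blob))
print("two groups detected in every configuration:", all(r[0] == 2 for r in two_blobs))
```

The second dataset is there to show that the same grid does report more than one cluster when the data is clearly separated, which is the outcome that would count against the uni-modal reading.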

Figures

Figures reproduced from arXiv: 2605.11640 by Maksym Nechepurenko.

Figure 1. (a) Raw per-market Kyle’s λ distribution on the 24,778 markets with non-trivial fill activity, clipped to [−1000, +1000] for display (full raw range is [−7.28×10^16, +2.11×10^16]; 496 markets flagged as extreme outliers). The raw distribution is unusable for downstream regression. (b) Winsorized Kyle’s λ at [P01, P99] = [−4.2042, +0.1052] (red dashed lines): the winsorized version is well-behaved and is the…
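
Winsorization at [P01, P99], as described in the Figure 1 caption, can be sketched as follows. This is an illustration under stated assumptions: the percentile uses linear interpolation, which may differ from the paper's convention, and the input values are synthetic rather than the paper's Kyle's λ estimates.

```python
def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of a pre-sorted list."""
    if not sorted_vals:
        raise ValueError("empty input")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def winsorize(values, lower_q=1.0, upper_q=99.0):
    """Clamp every value into the [lower_q, upper_q] percentile band."""
    s = sorted(values)
    lo, hi = percentile(s, lower_q), percentile(s, upper_q)
    return [min(max(v, lo), hi) for v in values], (lo, hi)


# Synthetic illustration: a well-behaved bulk plus two extreme outliers.
raw_lambda = [i / 100.0 for i in range(-50, 51)] + [-7.0e16, 2.0e16]
clipped, (p01, p99) = winsorize(raw_lambda)

print("raw range:       ", min(raw_lambda), max(raw_lambda))
print("winsorized range:", min(clipped), max(clipped))
```

Unlike trimming, winsorization keeps every market in the sample and only caps the extreme values, so the sample size used in later regressions is unchanged.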
Figure 2. Address population (a) vs total fill notional (b) by feature tier, on 77,203 addresses (post-CTFExchange exclusion) over the empirical window 2026-04-21 to 2026-04-27. The whale-tier (68 addresses, 0.09% of population) holds 28.0% of total notional; the strict non-retail subtotal (whale + high-frequency-operator + power-trader; 12.6% of addresses) holds 81.4% of total notional. The episodic-retail base (82…
Figure 3. Fill-notional Lorenz curve across all 77,203 addresses with ≥ 5 fills in the empirical window. The Gini coefficient is 0.932, indicating extreme concentration. The marked point shows that the bottom 87.4% of addresses hold only 18.6% of total fill notional, with the top 12.6% holding the remaining 81.4% (strict non-retail subtotal). Concentration is comparable to or exceeds typical equity-market notional d…
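
The two concentration statistics in the Figure 3 caption, the Gini coefficient and the top-fraction share, can be computed as in the sketch below. The notional values are synthetic and chosen only to make the arithmetic easy to follow; they are not drawn from the paper's dataset.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = equal, near 1 = concentrated)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # with x sorted ascending and i running from 1 to n.
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def top_share(values, top_fraction):
    """Share of the total held by the largest `top_fraction` of entries."""
    xs = sorted(values, reverse=True)
    k = max(1, round(len(xs) * top_fraction))
    return sum(xs[:k]) / sum(xs)


# Synthetic population: 90 small addresses and 10 large ones.
notional = [1.0] * 90 + [100.0] * 10

print(f"Gini: {gini(notional):.3f}")
print(f"top 10% share: {top_share(notional, 0.10):.1%}")
```

Each point on a Lorenz curve is one minus a top-fraction share, so the marked point in the figure corresponds to `top_share` evaluated at 12.6% on the real data.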
Figure 4. k-means k = 5 cluster centroids with 95% bootstrap confidence intervals on the six-feature fill-side vector f(a) = (f2, f3, f5, f6, f7, f9), computed on 77,203 addresses. The K5-Broad-HF partition has the highest trade intensity (f2) and lowest market concentration (f6, broad participation), consistent with fill-side high-frequency activity; K5-Broad-HN has higher notional per fill (f3); K5-Specialist has …
Figure 5. Tier × k-means cross-tabulation visualized as a heatmap (log-scale color). The strong diagonal-like pattern in the upper rows (whale-tier and high-frequency-operator tier concentrate in K5-Broad-HF; high-breadth-operator concentrates entirely in K5-Broad-HF) confirms that feature-tier and k-means partitions identify overlapping but non-identical structure. Episodic retail spreads across all five partitions…
Figure 6. Bilateral Spearman ρ between per-market archetype shares (n_archetypes = 5: UNKNOWN, fill-MM, fill-LP, SPECIALIST, RETAIL) and microstructure metrics (n_metrics = 22: ILS, OFI, OI at 5m/15m/1h, TS, PR at 60m/240m, VPIN-50, winsorized Kyle’s λ, three SCI weight schemes over two windows, trade-size kurtosis, Hawkes branching ratio, and others). Metrics sorted by maximum absolute ρ (highest-signal metrics on th…
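
For readers unfamiliar with the statistic in Figure 6, Spearman ρ is the Pearson correlation of the ranks of two series. The sketch below uses average ranks for ties; the per-market share and metric values are made up for illustration and the variable names are hypothetical.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up per-market values: an archetype share and a microstructure metric.
share = [0.1, 0.4, 0.2, 0.8, 0.6]
metric = [1.0, 9.0, 2.0, 50.0, 20.0]

print("rho (monotone increasing):", spearman(share, metric))
print("rho (sign flipped):       ", spearman(share, [-m for m in metric]))
```

Because only ranks enter the calculation, heavy-tailed metrics such as the raw Kyle's λ values do not dominate the coefficient the way they would in a Pearson correlation on raw values.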
Original abstract

Prediction markets cannot exist without market makers, arbitrageurs, and other non-retail liquidity providers, yet the supply-side microstructure of Polymarket-class venues has not been characterized at on-chain pseudonymous-address scale. This paper studies non-retail participation on Polymarket using an empirical run on the PMXT v2 archive over 2026-04-21 through 2026-04-27 (13,356,931 OrderFilled events; 77,204 addresses with five+ fills; 43,116 markets). We report three findings. First, Polymarket's off-chain CLOB architecture renders address-level quote-lifecycle attribution permanently unavailable: OrderPlaced and OrderCancelled events are off-chain and absent from public archives, so quote-intensity, two-sided-ratio, and posted-spread features cannot be built at address level. We document this as a structural validity-gate failure (G-QUOTE-LIFE universal fail) and restrict analysis to a six-feature fill-side vector. Second, density-based clustering (DBSCAN, fifteen sensitivity configurations) on the fill-side vector produces a single dense cluster with zero noise: fill-side behavior in the empirical window is uni-modal under the six-feature vector, contradicting the pre-registered hypothesis of four-to-five separable archetypes. Third, robust retail vs non-retail separation is achievable through clustering-independent feature-tier stratification: whale-tier, high-frequency-operator, and power-trader tiers jointly hold 81.4% of total notional across 12.6% of addresses. Address-level market-making and liquidity-provision claims are withdrawn per the G-QUOTE-LIFE failure; spoof-by-non-fill manipulation detection is downgraded to market-level book diagnostics. A privacy-respecting derived-dataset deposit accompanies the paper as Bundle 3 of the PMXT family. Fourth paper in a four-paper programme on event-linked perpetuals and leveraged prediction-market microstructure.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper examines non-retail trading on Polymarket via an empirical analysis of 13,356,931 OrderFilled events across 77,204 addresses and 43,116 markets from 2026-04-21 to 2026-04-27. It documents a structural failure (G-QUOTE-LIFE) preventing address-level quote-lifecycle attribution due to off-chain OrderPlaced and OrderCancelled events, restricting analysis to a six-feature fill-side vector. DBSCAN clustering (15 sensitivity configurations) yields a single dense cluster with zero noise, indicating uni-modal fill-side behavior that contradicts the pre-registered hypothesis of four-to-five archetypes. Clustering-independent feature-tier stratification shows whale, high-frequency-operator, and power-trader tiers holding 81.4% of notional volume across 12.6% of addresses. Address-level market-making claims are withdrawn, and a derived dataset is deposited.

Significance. If the results hold, the work offers useful empirical evidence on liquidity concentration in prediction markets and on the constraints of on-chain microstructure data. The 81.4% notional concentration finding can be checked against the public dataset deposit, although its robustness to the tier cutoffs is untested (see major comment 2). The explicit reporting of DBSCAN outcomes across configurations strengthens the uni-modality observation. However, the significance of the contradiction to the pre-registered hypothesis is limited by the reduced feature set.

major comments (2)
  1. [Abstract and hypothesis section] Abstract and the section on pre-registered hypothesis: The claim that the single DBSCAN cluster on the six-feature fill-side vector contradicts the pre-registered hypothesis of four-to-five separable archetypes is undermined because the hypothesis was formulated including quote-related features (intensity, two-sided ratio, posted spread) that are unavailable due to the G-QUOTE-LIFE failure. The observed uni-modality may reflect collapsed distinctions from the reduced vector rather than true behavioral homogeneity, weakening the contradiction and the robustness of the subsequent retail/non-retail separation.
  2. [Tier stratification section] Section on tier stratification: The clustering-independent feature-tier stratification defining whale-tier, high-frequency-operator, and power-trader tiers lacks explicit cutoff definitions for the features and any robustness checks against variations in those thresholds. This is load-bearing for the claim that these tiers jointly hold 81.4% of total notional across 12.6% of addresses, especially since feature cutoffs are identified as free parameters.
minor comments (1)
  1. [Methods] Methods section: While the paper reports fifteen DBSCAN sensitivity configurations and the consistent single-cluster outcome, providing the specific eps and min_samples values tested and a table summarizing results across configurations would enhance transparency and reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We appreciate the referee's detailed feedback on our manuscript. We address the major comments point by point below, agreeing where revisions are needed to strengthen the presentation of our findings on the constraints of on-chain data and the observed trading behaviors.

  1. Referee: [Abstract and hypothesis section] Abstract and the section on pre-registered hypothesis: The claim that the single DBSCAN cluster on the six-feature fill-side vector contradicts the pre-registered hypothesis of four-to-five separable archetypes is undermined because the hypothesis was formulated including quote-related features (intensity, two-sided ratio, posted spread) that are unavailable due to the G-QUOTE-LIFE failure. The observed uni-modality may reflect collapsed distinctions from the reduced vector rather than true behavioral homogeneity, weakening the contradiction and the robustness of the subsequent retail/non-retail separation.

    Authors: We thank the referee for highlighting this important nuance. The pre-registered hypothesis was indeed formulated under the assumption of full quote-lifecycle data availability. However, upon discovering the G-QUOTE-LIFE structural failure, we restricted our analysis to the fill-side vector and observed uni-modality across multiple DBSCAN configurations. While this does qualify the strength of the contradiction to the original hypothesis, the finding of uni-modal fill-side behavior remains a key empirical result under the available data constraints. We will revise the abstract and hypothesis section to explicitly state that the pre-registered archetypes anticipated quote features, and clarify that the observed uni-modality pertains to the reduced six-feature vector. This revision will also reinforce the robustness of the subsequent tier-based retail/non-retail separation, which is independent of clustering. revision: yes

  2. Referee: [Tier stratification section] Section on tier stratification: The clustering-independent feature-tier stratification defining whale-tier, high-frequency-operator, and power-trader tiers lacks explicit cutoff definitions for the features and any robustness checks against variations in those thresholds. This is load-bearing for the claim that these tiers jointly hold 81.4% of total notional across 12.6% of addresses, especially since feature cutoffs are identified as free parameters.

    Authors: We agree that explicit cutoff definitions and robustness checks are necessary for transparency. In the revised manuscript, we will include precise definitions of the feature thresholds used to define the whale-tier, high-frequency-operator, and power-trader tiers. Additionally, we will conduct and report sensitivity analyses by varying these thresholds within reasonable ranges to demonstrate that the 81.4% notional concentration and 12.6% address share remain stable. The derived dataset deposit allows for independent verification of these stratifications. revision: yes
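
A threshold sensitivity analysis of the kind promised in this response could look like the sketch below, which rescales a single notional cutoff and records how the address share and notional share move. The base cutoff, the multipliers, and the synthetic notional values are hypothetical; the paper's tiers use several features whose cutoffs are not given in this review.

```python
def shares_above(notionals, cutoff):
    """(address share, notional share) of addresses with notional >= cutoff."""
    kept = [x for x in notionals if x >= cutoff]
    return len(kept) / len(notionals), sum(kept) / sum(notionals)


def sweep(notionals, base_cutoff, multipliers=(0.5, 0.75, 1.0, 1.25, 1.5)):
    """Recompute both shares with the cutoff scaled by each multiplier."""
    return {m: shares_above(notionals, base_cutoff * m) for m in multipliers}


# Synthetic population with a wide gap between small and large addresses.
notionals = [100.0] * 80 + [10_000.0] * 15 + [500_000.0] * 5

# HYPOTHETICAL base cutoff; every scaled cutoff falls inside the gap.
table = sweep(notionals, base_cutoff=5_000.0)

for m, (addr_share, notional_share) in table.items():
    print(f"cutoff x{m:<4}: {addr_share:.1%} of addresses, {notional_share:.2%} of notional")
```

In this synthetic case the shares do not move because no address sits near the cutoff. On real data the informative quantity is how much the two shares drift across the sweep, which is what a reported robustness table would show.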

Circularity Check

0 steps flagged

No significant circularity in empirical derivation chain


The paper conducts a data-driven empirical analysis on Polymarket fill events using a six-feature fill-side vector after acknowledging the G-QUOTE-LIFE structural data limitation that removes quote-lifecycle features. DBSCAN clustering (across sensitivity configurations) is applied directly to the observed data and reports a single dense cluster outcome that contradicts the pre-registered hypothesis; tier stratification for retail/non-retail separation is explicitly described as clustering-independent and consists of reporting notional concentration statistics (81.4% in 12.6% of addresses) from the same empirical distribution. No equations, fitted parameters presented as predictions, self-citations, or ansatzes are invoked in a load-bearing way that reduces any claim to its inputs by construction. The chain remains self-contained against the archive data and pre-registered elements without tautological reduction.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The central claims rest on the sufficiency of fill-side features after data loss and on chosen thresholds for tier stratification; no new physical entities are postulated.

free parameters (2)
  • DBSCAN eps and min_samples values
    Fifteen sensitivity configurations were run; exact parameter sets are not enumerated in the abstract.
  • Feature cutoffs for whale, high-frequency-operator, and power-trader tiers
    Stratification thresholds on volume, frequency, and related metrics are required to produce the 12.6%/81.4% split.
axioms (1)
  • domain assumption: The six fill-side features adequately proxy behavioral archetypes despite permanent loss of quote-lifecycle attribution.
    Invoked to justify proceeding with clustering and tier analysis after documenting the G-QUOTE-LIFE failure.

pith-pipeline@v0.9.0 · 5663 in / 1450 out tokens · 57200 ms · 2026-05-13T01:19:30.253030+00:00 · methodology

